We are comparing PPC and ARM code size: Linux 5.17 runs on a P1014 and on an LS1021A with the same peripheral configuration. The PPC kernel is 3.8 MB, while the LS1021A kernel is 8.6 MB.
Why is the ARM kernel so much bigger? Any clue?
1) The compiler is not the same.
2) The compiler arguments may not be the same.
3) We usually use uImage for PPC, which is compressed, and Image for ARM, which is not. Can you check the file size of vmlinux or Image.gz for the two architectures?
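Point 3 can be checked by compressing both raw images the same way, so that a compressed uImage is not compared against an uncompressed Image. The following is only a sketch: the helper name `raw_and_gzip_size` is made up for illustration, and the image paths are whatever your two builds produce, passed on the command line.

```python
import gzip
import sys
from pathlib import Path


def raw_and_gzip_size(path: str, level: int = 9) -> tuple[int, int]:
    """Return (raw size, gzip-compressed size) in bytes for one file."""
    data = Path(path).read_bytes()
    return len(data), len(gzip.compress(data, compresslevel=level))


if __name__ == "__main__":
    # Pass the image files to compare as arguments, one per architecture.
    for image in sys.argv[1:]:
        raw, packed = raw_and_gzip_size(image)
        print(f"{image}: raw {raw} bytes, gzip -{9} {packed} bytes")
```

If the two compressed sizes are close, most of the 3.8 MB versus 8.6 MB gap comes from the image format rather than from the generated code.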
I compared the load time of a FIT image from a PNOR with UBI/UBIFS under our boot loaders.
On LS1021A:
* From the boot loader:
ls -l /mnt/primary/
lrwxrwxrwx 12 fit.itb -> fitImage.itb
-rwxr-xr-x 6757428 fitImage.itb
time cp /mnt/primary/fit.itb /
time: 3093ms
This gives about 2 MiB/s
Broadcom iProc CPU: 3 MB/s
* From Linux:
time cp /mnt/primary/fit.itb /tmp
real 0m1.608s
Or about 4.02 MB/s
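The LS1021A rates above follow from the file size in the `ls -l` listing and the two `time` results. A small sketch of the arithmetic, reporting both MB/s (10^6 bytes) and MiB/s (2^20 bytes) since the thread mixes the two units; the function name `throughput` is only illustrative.

```python
def throughput(size_bytes: int, seconds: float) -> tuple[float, float]:
    """Return (MB/s, MiB/s) for size_bytes transferred in seconds."""
    bytes_per_second = size_bytes / seconds
    return bytes_per_second / 1e6, bytes_per_second / 2**20


FIT_SIZE = 6757428  # fitImage.itb size taken from the ls -l listing

# Elapsed times from the two LS1021A measurements, converted to seconds.
for label, seconds in [("boot loader", 3.093), ("Linux", 1.608)]:
    mb_s, mib_s = throughput(FIT_SIZE, seconds)
    print(f"LS1021A {label}: {mb_s:.2f} MB/s, {mib_s:.2f} MiB/s")
```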
========================================================
On P1014:
* From the boot loader:
time cp /mnt/primary/fit.itb /
time: 742ms
Or about 5.34MB/s
* From Linux:
[root@openware]# time cp /mnt/primary/fit.itb /tmp/
real 0m0.907s
Or about 4MB/s
Note that at the MTD level, U-Boot and our boot loader perform the same; at that level the read is mostly a memcpy.
Under Linux the data rate is about the same for both CPUs, but there is a large difference under the boot loaders. All platforms use only one CPU.
The PPC boot loader is more than twice as fast as the LS1021A one in terms of data rate and, in our measurements, four times as fast to get to the prompt. Both boot loaders do the same thing, as the LS1021A is a replacement for the P1014 in our system.
We also noticed that calculating the kernel SHA-256 is much slower on the LS1021A than on the P1014. On the LS1021A it takes more than 5 s under both our boot loader and NXP U-Boot, while on the PPC it is below 1 s.
Some SHA-1 measurements are below, as there is no SHA-256 in U-Boot:
File size 6757428
Our boot loader: 6.458 s
U-Boot: 28.178 s
Linux: 0.565 s
My understanding is that U-Boot uses the hash driver in drivers/crypto/fsl for SHA-1, whereas our boot loader and Linux use the implementation in arch/arm/crypto.
And also the MD5 sum of a 5818764-byte file:
md5sum ls1021a (u-boot): 27s
BCOM iProc: 0.359s
Any clue as to what could be missing? As far as I know, the L2 cache is fully under hardware control on the LS1021A. Why are U-Boot and our boot loader not able to get the same performance as Linux?
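For a software-only reference point on the Linux side, the same amount of data can be hashed from userspace and timed. This is a rough sketch that assumes Python is available on the target; it exercises `hashlib` (usually backed by OpenSSL), not the kernel code in arch/arm/crypto, and it hashes a zero-filled buffer of the same size rather than the real fit.itb. The helper name `time_hash` is illustrative.

```python
import hashlib
import time


def time_hash(algorithm: str, data: bytes) -> tuple[str, float]:
    """Hash data with the named algorithm; return (hex digest, seconds)."""
    start = time.perf_counter()
    digest = hashlib.new(algorithm, data).hexdigest()
    elapsed = time.perf_counter() - start
    return digest, elapsed


# Zero-filled stand-in with the same size as the 6757428-byte FIT image.
payload = bytes(6757428)

for algorithm in ("sha1", "sha256"):
    digest, elapsed = time_hash(algorithm, payload)
    rate = len(payload) / elapsed / 1e6
    print(f"{algorithm}: {elapsed:.3f} s, {rate:.1f} MB/s, digest {digest[:16]}...")
```

If this userspace figure is close to the 0.565 s Linux result, the gap seen in the boot loaders is more likely down to their environment (for example cache or MMU setup) than to the hash implementation itself.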
28 s for the SHA-1 test in U-Boot is abnormal.
We ran the following command in U-Boot, and it returns in around 1 s.
=> hash sha1 0x80000000 6757428
sha1 for 80000000 ... 86757427 ==> e1ba2ad7fdfde87015ea9c3aadc33575c7b9416e
=>
How did you test? Are you using LSDK to test? Which version?
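One detail worth checking when comparing runs: the address range printed above (80000000 ... 86757427) suggests that the length argument 6757428 was read as hexadecimal, which is U-Boot's usual convention for numeric command arguments. Assuming that convention, the sketch below reproduces the range and shows what the decimal file size would look like as a hex argument. The function name `uboot_hash_range` is made up for illustration.

```python
def uboot_hash_range(addr_arg: str, len_arg: str) -> tuple[int, int, int]:
    """Interpret both arguments as hex and return (start, end, length)."""
    start = int(addr_arg, 16)   # a leading "0x" is accepted with base 16
    length = int(len_arg, 16)
    return start, start + length - 1, length


start, end, length = uboot_hash_range("0x80000000", "6757428")
print(f"range hashed: {start:08x} ... {end:08x} ({length} bytes)")

# The FIT image is 6757428 bytes in decimal; as a hex argument that is:
print(f"6757428 decimal = {6757428:x} hex")
```

Under that assumption the command above hashed considerably more data than the 6757428-byte file, so the two timings are not measuring the same amount of work.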