I compared the load time of a FIT image from a PNOR with UBI/UBIFS under our boot loaders and under Linux.
On LS1021A:
* From the boot loader:
ls -l /mnt/primary/
lrwxrwxrwx 12 fit.itb -> fitImage.itb
-rwxr-xr-x 6757428 fitImage.itb
time cp /mnt/primary/fit.itb /
time: 3093ms
This gives about 2 MiB/s.
For comparison, a Broadcom iProc CPU: 3 MB/s
* From Linux:
time cp /mnt/primary/fit.itb /tmp
real 0m1.608s
Or about 4.0 MiB/s (4.2 MB/s)
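The rates above mix MB/s and MiB/s, so here is a small sketch of the arithmetic for the LS1021A figures. The file size and the two timings are the ones quoted above; the helper name `throughput` is only for illustration.

```python
# Convert a copy time into a data rate, in both decimal and binary units.
# Size and timings are the LS1021A numbers quoted in this post.

def throughput(size_bytes, elapsed_s):
    """Return (MB/s, MiB/s) for size_bytes copied in elapsed_s seconds."""
    bytes_per_s = size_bytes / elapsed_s
    return bytes_per_s / 1e6, bytes_per_s / 2**20

FIT_SIZE = 6757428  # bytes, from the ls -l output

for label, elapsed in (("boot loader", 3.093), ("Linux", 1.608)):
    mb, mib = throughput(FIT_SIZE, elapsed)
    print(f"{label}: {mb:.2f} MB/s, {mib:.2f} MiB/s")
```

The P1014 image size is not listed here, so the same calculation is not repeated for that board.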
========================================================
On P1014:
Our boot loader:
time cp /mnt/primary/fit.itb /
time: 742ms
Or about 5.34MB/s
Linux:
[root@openware]# time cp /mnt/primary/fit.itb /tmp/
real 0m0.907s
Or about 4MB/s
Note that at the MTD level, U-Boot and our boot loader have the same performance. At the MTD level, it is mostly a memcpy.
Under Linux the data rate is about the same for both CPUs, but I get a large difference under the boot loaders. All platforms use only one CPU.
The PPC boot loader is more than twice as fast as the LS1021A in terms of data rate and, in our measurements, 4 times as fast to get to the prompt. Both boot loaders do the same thing, as the LS1021A is a replacement for the P1014 in our system.
We also noticed that calculating the kernel sha256 is much slower on the LS1021A than on the P1014. On both our boot loader and NXP U-Boot it takes more than 5s, while the PPC is below 1s.
Some SHA1 measurements follow, as there is no SHA256 on U-Boot:
File size 6757428
Our boot loader: 6.458ms
U-Boot: 28.178
Linux: 0.565
My understanding is that U-Boot uses the hash driver in drivers/crypto/fsl for SHA1, but our boot loader and Linux use the driver support in arch/arm/crypto.
And also md5sum on a 5818764-byte file:
md5sum LS1021A (U-Boot): 27s
BCOM iProc: 0.359s
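For a reference point on the Linux side, a hash timing like the ones above can be reproduced with a short script. This is only a sketch: it times Python's hashlib (a userspace implementation, not the kernel code in arch/arm/crypto and not the drivers/crypto/fsl driver), and it hashes a zero-filled buffer of the same size as the FIT image rather than the real file. The helper name `time_hash` is made up for this example.

```python
# Time SHA1, SHA256 and MD5 over a buffer the size of the FIT image.
# Assumption: a zero-filled buffer stands in for the real file contents;
# hash speed does not depend on the data, only on its length.
import hashlib
import time

def time_hash(name, data):
    """Return (hex digest, elapsed seconds) for hashing data with algorithm name."""
    start = time.perf_counter()
    digest = hashlib.new(name, data).hexdigest()
    return digest, time.perf_counter() - start

FIT_SIZE = 6757428  # bytes, same size as fitImage.itb above
buf = bytes(FIT_SIZE)

for name in ("sha1", "sha256", "md5"):
    _, elapsed = time_hash(name, buf)
    rate = FIT_SIZE / elapsed / 1e6
    print(f"{name}: {elapsed:.3f}s ({rate:.1f} MB/s)")
```

Running the same script on both boards under Linux would show how much of the gap comes from the CPU itself rather than from the boot loader environment.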
Any clue as to what could be missing? As far as I know, the L2 cache is fully under hardware control on the LS1021A. Why are U-Boot and our boot loader not able to get the same performance as Linux?