If we run entirely in the kernel and use kmalloc to allocate 1M buffers, it takes about 2 ms to copy one buffer to another. If we run entirely in userspace and use malloc to allocate 1M buffers, it takes about 0.5 ms. We tried both the standard memcpy and a NEON-optimized copy in both cases, with no difference. Changing the buffer size also made no difference; the kernel copy always runs at about 1/4 the speed.
Why is kmalloc'ed memory slower than malloc'ed memory? Is the caching different?
Running an i.MX6 DL Sabre SDP using the latest Freescale ltib-generated kernel (3.0.5.35-2039).
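For reference, here is a minimal sketch of the userspace side of the test described above, assuming plain malloc'ed buffers timed with clock_gettime(CLOCK_MONOTONIC). This is illustrative only, not our exact test code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define BUF_SIZE (1024 * 1024)

int main(void)
{
    char *src = malloc(BUF_SIZE);
    char *dst = malloc(BUF_SIZE);
    struct timespec t0, t1;
    long long ns;

    if (!src || !dst)
        return 1;

    /* Touch both buffers first so demand-zero page faults
     * do not land inside the timed region. */
    memset(src, 0xa5, BUF_SIZE);
    memset(dst, 0x00, BUF_SIZE);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    memcpy(dst, src, BUF_SIZE);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    ns = (t1.tv_sec - t0.tv_sec) * 1000000000LL +
         (t1.tv_nsec - t0.tv_nsec);
    printf("1M memcpy took %lld ns (%.3f ms)\n", ns, ns / 1e6);

    free(src);
    free(dst);
    return 0;
}

Both buffers are touched before timing so page faults don't inflate the measured copy. On older toolchains, clock_gettime may require linking with -lrt.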
I think this may be related to pressure on the kernel's Normal zone versus high memory.
Userspace malloc allocates from the Highmem zone, and that memory is NOT physically contiguous.
But kmalloc allocates from the Normal zone, and that memory IS physically contiguous.
So you may want to check:
1. The free-page situation of each zone, via /proc/zoneinfo.
2. Trying vmalloc in the kernel; it is comparable to malloc in userspace, whereas kmalloc requires physically contiguous memory, and very few 1M physically contiguous regions remain after the kernel boots (see the sketch after this list).
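As a rough illustration of item 2, here is a minimal kernel-module sketch that times a 1M copy between vmalloc'ed buffers. The module name vmcopytest and the ktime-based timing are my own assumptions for illustration, not code from this thread:

#include <linux/module.h>
#include <linux/vmalloc.h>
#include <linux/ktime.h>
#include <linux/string.h>

#define BUF_SIZE (1024 * 1024)

static int __init vmcopytest_init(void)
{
    /* vmalloc memory is virtually contiguous but physically
     * scattered, which is closer to what userspace malloc sees. */
    void *src = vmalloc(BUF_SIZE);
    void *dst = vmalloc(BUF_SIZE);
    ktime_t t0, t1;

    if (!src || !dst) {
        vfree(src);
        vfree(dst);
        return -ENOMEM;
    }

    /* Touch both buffers before timing, as in the userspace test. */
    memset(src, 0xa5, BUF_SIZE);
    memset(dst, 0x00, BUF_SIZE);

    t0 = ktime_get();
    memcpy(dst, src, BUF_SIZE);
    t1 = ktime_get();

    pr_info("vmcopytest: 1M memcpy took %lld ns\n",
            (long long)ktime_to_ns(ktime_sub(t1, t0)));

    vfree(src);
    vfree(dst);
    return 0;
}

static void __exit vmcopytest_exit(void)
{
}

module_init(vmcopytest_init);
module_exit(vmcopytest_exit);
MODULE_LICENSE("GPL");

Built against the running kernel's source tree and loaded with insmod, this prints the copy time to the kernel log, which can then be compared directly against the kmalloc and userspace numbers above.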
Sorry to say, I had already tried vmalloc(); exact same results.
Zone Normal had 172K free and zone DMA had 42K free. Both changed by ~1K pages when the program was run, which is what they should have done.