Inefficient code in nadk_memcpy.h

Contributor II


I just had a look at nadk_memcpy.h, delivered as part of ls2085a-sdk-ear5, and to me the code seems rather inefficient.
It appears to be based on a hand-optimized memcpy routine for Intel SSE-capable CPUs that defines macros to perform block-wise SIMD moves. In the Freescale version these have been replaced with inline functions that simply call memcpy:


static inline void nadk_mov64(uint8_t *dst, const uint8_t *src) { memcpy(dst, src, 64); }

static inline void nadk_mov128(uint8_t *dst, const uint8_t *src) { memcpy(dst, src, 128); }


Those functions are later called from within nadk_memcpy_func(), which handles all the alignment issues one would have to care about if they were *real* assembler. While the generated code is most likely not as horrible as the C code suggests, I still don't understand why nadk_memcpy doesn't simply redirect to memcpy.
memcpy is most likely already SIMD-optimized, and for small copies the compiler can use fast inline versions. At the very least, redirecting would remove a lot of code that most likely doesn't do what it was designed for (to provide a fast and efficient version of memcpy, I presume).
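
To make the suggestion concrete, here is a minimal sketch of what I mean by redirecting. The name nadk_memcpy_simple and its signature are my own assumptions for illustration and are not taken from the SDK header; nadk_mov64 is repeated from above only to show that the fixed-size helper and the plain forwarder end up doing the same thing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical replacement (name and signature are assumptions, not the
 * SDK's): forward directly to the C library and let the compiler/libc
 * pick the best copy strategy for the given size and alignment.
 */
static inline void *nadk_memcpy_simple(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/*
 * Fixed-size helper as quoted in the post. With a constant size the
 * compiler is free to expand the memcpy call inline.
 */
static inline void nadk_mov64(uint8_t *dst, const uint8_t *src)
{
    memcpy(dst, src, 64);
}
```

With a compile-time constant size, compilers such as GCC will typically expand memcpy inline, so nothing should be lost for small copies, though I have not measured this on the LS2085A.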

Best regards

1 Reply

Contributor II

(When) will this be fixed?
