
Inefficient code in nadk_memcpy.h

Question asked by Clemens Eisserer on Feb 17, 2016
Latest reply on Mar 16, 2016 by Clemens Eisserer

Hi,

 

I just had a look at nadk_memcpy.h, delivered as part of ls2085a-sdk-ear5, and to me the code seems rather inefficient.
It appears to be based on a hand-optimized memcpy routine for SSE-capable Intel CPUs. It defines block-move helpers that were meant to perform block-wise SIMD moves, but in the Freescale version their bodies have been replaced with calls to memcpy:

 

[code]

static inline void nadk_mov64(uint8_t *dst, const uint8_t *src) { memcpy(dst, src, 64); }

static inline void nadk_mov128(uint8_t *dst, const uint8_t *src) { memcpy(dst, src, 128); }

[/code]

 

Later those nadk_mov* helpers are called from within nadk_memcpy_func(), which handles all the alignment issues one would have to care about if they were actually *real* assembler. While the generated code most likely isn't as horrible as the C code suggests, I still don't understand why nadk_memcpy doesn't simply redirect to memcpy.
memcpy is most likely already SIMD-optimized, and for small copies the compiler can emit fast inline versions. At the very least, redirecting would remove a lot of code that most likely doesn't do what it was designed for (to provide a fast and efficient version of memcpy, I presume).
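
To make the suggestion concrete, here is a minimal sketch of what such a redirect could look like. The signature of nadk_memcpy is an assumption (I simply mirrored memcpy), and whether the compiler actually expands the fixed-size copy inline depends on the toolchain and optimization flags, so treat this as an illustration rather than a drop-in patch:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical replacement for the SDK routine: the name comes from the
 * header, but this signature is assumed to mirror memcpy(). */
static inline void *nadk_memcpy(void *dst, const void *src, size_t n)
{
    /* Defer to the C library, which chooses its own copy strategy. */
    return memcpy(dst, src, n);
}

/* Fixed-size helper kept only for source compatibility. Because the
 * length is a compile-time constant, the compiler is free to expand
 * the copy inline instead of emitting a call. */
static inline void nadk_mov64(uint8_t *dst, const uint8_t *src)
{
    nadk_memcpy(dst, src, 64);
}
```

With this, the alignment handling in nadk_memcpy_func() would no longer be needed, since memcpy accepts arbitrarily aligned pointers as long as the regions do not overlap.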

 

Best regards
