Hi,

Does 2.45 milliseconds for a single 512-point FFT (32-bit fixed-point, executing on a single core from M3 memory) seem outrageous for the MSC8144ADS? I've seen some benchmarks for TI processors indicating ~150 usec.

Thanks,

zpyatt

edit: I forgot to mention this is with optimization level 4 for execution speed, and it seems to be independent of SmartDSP or C vs. C++.

Tom, thanks for the input. I really appreciate it.

So I made two massive discoveries.

1.) I was setting the compiler directive FIXED_POINT for KISS FFT to 32. I thought this was what I wanted because I wanted a 32-bit number. However, KISS FFT uses a 64-bit intermediate to handle overflow when you do this. Setting FIXED_POINT=1 decreased my time to ~800 usec / 512-pt FFT.

2.) Moving my arrays from shared DDR, M3, M2 to shared cacheable DDR, M3, M2 gets me to ~83 usec / 512-pt FFT. In fact, when using the cacheable areas of memory, DDR vs. M2 makes no noticeable difference.
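
To illustrate point 1.), here is a rough sketch of the two multiply paths. As far as I can tell from the KISS FFT headers, FIXED_POINT=32 selects 32-bit samples with 64-bit products, and any other value (such as 1) selects 16-bit samples with 32-bit products, so the speedup comes with reduced sample precision. The helper names mul_q31 and mul_q15 are made up for this example and are not KISS FFT's own macros. The sketch also assumes the compiler does an arithmetic right shift on signed values, which is typical but not guaranteed by the C standard.

```c
#include <stdint.h>

/* Hypothetical helper: Q31 multiply as needed for 32-bit samples.
 * The product of two 32-bit values needs a 64-bit intermediate,
 * which is expensive on a core without native 64-bit multiplies. */
static int32_t mul_q31(int32_t a, int32_t b)
{
    int64_t prod = (int64_t)a * (int64_t)b;   /* 64-bit intermediate */
    prod += (int64_t)1 << 30;                 /* round to nearest */
    return (int32_t)(prod >> 31);             /* rescale back to Q31 */
}

/* Hypothetical helper: Q15 multiply as needed for 16-bit samples.
 * The product fits in 32 bits, so no 64-bit arithmetic is involved. */
static int16_t mul_q15(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;   /* 32-bit intermediate */
    prod += 1 << 14;                          /* round to nearest */
    return (int16_t)(prod >> 15);             /* rescale back to Q15 */
}
```

For example, 0.5 * 0.5 is mul_q31(0x40000000, 0x40000000) in Q31 and mul_q15(0x4000, 0x4000) in Q15. Both give 0.25 in their respective formats, but only the Q31 version has to go through a 64-bit product on every butterfly.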

/zpyatt