I am running a simple filter algorithm on the K60 (built with CodeWarrior), but it takes too long and uses up almost all of the processor's time.
As long as the number of filter taps is a power of two (32, 64, etc.) it is fast enough, because a divide by 64 is performed by a shift in the generated code. As soon as the tap count is not a power of two, the time taken by the divide increases dramatically.
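To illustrate the kind of code I mean, here is a minimal sketch of an averaging filter with the two cases side by side. The function names, tap counts and sample type are made up for this post and are not my actual code; the only point is the final divide by the tap count.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical tap counts, chosen only for illustration. */
#define TAPS_POW2   64   /* power of two: divide can become a shift */
#define TAPS_OTHER  48   /* not a power of two: needs a real divide */

/* Average over a power-of-two number of taps.
 * The compiler can replace the divide with an arithmetic shift
 * (plus a small correction for negative sums). */
int32_t average_pow2(const int32_t *samples)
{
    int32_t sum = 0;
    for (size_t i = 0; i < TAPS_POW2; i++) {
        sum += samples[i];
    }
    return sum / TAPS_POW2;
}

/* Average over a tap count that is not a power of two.
 * This is the case where my build ends up in a divide subroutine
 * instead of using a single divide instruction. */
int32_t average_other(const int32_t *samples)
{
    int32_t sum = 0;
    for (size_t i = 0; i < TAPS_OTHER; i++) {
        sum += samples[i];
    }
    return sum / TAPS_OTHER;
}
```

Both functions assume the sum of the samples fits in 32 bits, which is true for my data but is not checked here.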
The divide is performed by __FSL_s32_div_f(), which is presumably doing the work in software. I was expecting the Cortex SDIV instruction to be used instead, since this is a signed 32-bit divide, which I believe takes at most about 12 clocks to complete.
The build settings are for Cortex-M4 (with floating point in software), but I can't see any setting that controls how a simple integer divide is generated.
Question: is this a CodeWarrior settings issue, or does one have to write routines in assembler to make use of the instruction set's capabilities?
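If assembler does turn out to be necessary, I assume a small wrapper along these lines would be enough. This is only a sketch: sdiv32 is a name I made up, the inline assembly uses GCC-style syntax (the CodeWarrior compiler may want a different form), and the __ARM_ARCH_7EM__ check is the macro GCC defines for Cortex-M4 class targets. On any other target it falls back to the normal C divide.

```c
#include <stdint.h>

/* Hypothetical wrapper: signed 32-bit divide using the SDIV instruction
 * when building with a GCC-style compiler for ARMv7E-M (Cortex-M4),
 * and the ordinary C operator everywhere else. */
static inline int32_t sdiv32(int32_t numerator, int32_t denominator)
{
#if defined(__GNUC__) && defined(__ARM_ARCH_7EM__)
    int32_t quotient;
    /* SDIV Rd, Rn, Rm : Rd = Rn / Rm, rounded toward zero */
    __asm__ ("sdiv %0, %1, %2"
             : "=r" (quotient)
             : "r" (numerator), "r" (denominator));
    return quotient;
#else
    /* Fallback: C99 integer division also rounds toward zero. */
    return numerator / denominator;
#endif
}
```

The filter would then return sdiv32(sum, taps) instead of sum / taps. I would still prefer a compiler setting over this, since the wrapper has to be used by hand everywhere a divide occurs.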