My processor is running with a 40MHz bus clock.
I'm working with arrays of uint32_t and converting them to floats in other arrays. The maximum size is 128 elements in each array.
I measured that my application's main loop takes 6ms+ and narrowed the problem down to the code below. This code takes 6ms to run:
for ( i = 0; i < currentBufferSize && i < nSamples; i++ )
{
    FsamplesA[i] = (float)samplesA[i]*3.3/4096.0 - fZeroCurrent;
    FsamplesB[i] = (float)samplesB[i]*3.3/4096.0 - fZeroCurrent;
    FsamplesC[i] = (float)samplesC[i]*3.3/4096.0 - fZeroCurrent;
}
So there is an assignment after a cast, multiplication, division and a simple offset subtraction.
I calculated it takes 624 instruction cycles to do the calcs and populate just a single float buffer element!
Does this seem right for a Cortex-M4F?
Using MCUXpresso V10.2.0.
Thanks!
Thanks for the input, Erich. I tried using the f-suffix, but I'm getting about the same results as before. Anyway, I've since changed the code so that I do the conversion as each ADC sample becomes ready and just populate the float buffer with that value - no intermediate buffers/conversions. With that, the main loop is down to about 400us (from 6ms).
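For anyone reading later, the per-sample approach described above might look roughly like the sketch below. The names store_adc_sample, float_samples, sample_count and MAX_SAMPLES are made up for illustration and are not from the actual project. It also assumes the same 3.3 V / 4096 scaling as the original loop.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SAMPLES 128u            /* assumed: matches the 128-element maximum */

/* Hypothetical buffer and fill index, for illustration only. */
static float  float_samples[MAX_SAMPLES];
static size_t sample_count = 0;

/*
 * Convert one raw ADC reading as soon as it is ready and store it directly
 * as a float, so no intermediate uint32_t buffer or second pass is needed.
 * Returns 1 if the sample was stored, 0 if the buffer is already full.
 */
int store_adc_sample(uint32_t raw, float zero_offset)
{
    if (sample_count >= MAX_SAMPLES) {
        return 0;
    }
    /* f-suffixed constants keep the arithmetic in single precision. */
    float_samples[sample_count++] = (float)raw * (3.3f / 4096.0f) - zero_offset;
    return 1;
}
```

This spreads the conversion cost over the individual sample events instead of paying for all of it in one pass through the main loop.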
Hi Ed,
See also the discussion of this subject in "Be aware: Floating Point Operations on ARM Cortex-M4F" on MCU on Eclipse. You can also use a gcc compiler option to force single-precision constants.
I hope this helps,
Erich
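To illustrate the point about constants: in C, unsuffixed literals such as 3.3 and 4096.0 have type double, so the whole expression is promoted to double. The Cortex-M4F FPU only handles single precision in hardware, so that math falls back to software routines. Below is a minimal sketch of the loop using float constants. The names convert_samples and ADC_SCALE are hypothetical, and it assumes the zero offset is itself a float. The gcc option that treats unsuffixed constants as single precision is -fsingle-precision-constant.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: volts per ADC count, written with f-suffixes so the
 * constant has type float rather than double. */
#define ADC_SCALE (3.3f / 4096.0f)

/*
 * Convert raw ADC counts to offset-corrected volts using only
 * single-precision operations (int-to-float conversion, multiply, subtract).
 */
void convert_samples(const uint32_t *raw, float *out, size_t count,
                     float zero_offset)
{
    for (size_t i = 0; i < count; i++) {
        out[i] = (float)raw[i] * ADC_SCALE - zero_offset;
    }
}
```

It would be called once per channel, for example convert_samples(samplesA, FsamplesA, nSamples, fZeroCurrent). The results can differ from the double-precision version in the last bits, which is normally well below the resolution of a 12-bit ADC.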