RESTART: Which is faster: floating point division or casting to 64-bit int?


1,835 Views
myke_predko
Senior Contributor III

@aberger @ErichStyger 

I'm requesting that this thread be restarted because every post to it results in "This widget could not be displayed."

Could you please repost the original question and replies?  

To be honest, this is more for intellectual curiosity than anything else.  

Thanx!

4 Replies

1,821 Views
ErichStyger
Specialist I

I'm not able to access the original post and answers, and I'm getting the same error message.

As for the original question: that would depend on the core used, and on whether 'floating point' means double or single precision. If an FPU is available, my take is that using the FPU will be the fastest method anyway, but to be sure you would have to look at the generated instructions and measure them, for example with the cycle counter (https://mcuoneclipse.com/2018/06/28/measuring-arm-cortex-m-cpu-cycles-spent-with-the-mcuxpresso-ecli... ).
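A minimal sketch of such a measurement harness in C, assuming the candidates share one signature. The names read_ticks, scale_fn, scale_with_double and time_candidate are hypothetical and are not taken from the original post. On a Cortex-M4 the tick source would typically be the DWT cycle counter; clock() stands in here so the sketch also builds on a host.

```c
#include <stdint.h>
#include <time.h>

/* Tick source (assumption for this sketch). On a Cortex-M4 with CMSIS headers
 * this would typically return DWT->CYCCNT after setting TRCENA in
 * CoreDebug->DEMCR and CYCCNTENA in DWT->CTRL; clock() is a host stand-in. */
static uint32_t read_ticks(void)
{
    return (uint32_t)clock();
}

/* Signature shared by the candidate implementations being compared. */
typedef uint64_t (*scale_fn)(uint64_t scaleFactor, float a, float b);

/* Example candidate (hypothetical): do the whole computation in double. */
static uint64_t scale_with_double(uint64_t scaleFactor, float a, float b)
{
    return (uint64_t)((double)scaleFactor * ((double)a / (double)b));
}

/* Run fn `iterations` times and return the elapsed ticks. Each result is
 * written through a volatile pointer so the compiler cannot drop the calls. */
static uint32_t time_candidate(scale_fn fn, uint64_t scaleFactor, float a,
                               float b, uint32_t iterations,
                               volatile uint64_t *sink)
{
    uint32_t start = read_ticks();
    for (uint32_t i = 0; i < iterations; i++) {
        *sink = fn(scaleFactor, a, b);
    }
    return read_ticks() - start;   /* unsigned subtraction tolerates one wrap */
}
```

Each candidate would be timed with the same inputs and iteration count, and the generated assembly checked to confirm the compiler did not hoist the call out of the loop.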


1,815 Views
myke_predko
Senior Contributor III

Hi @ErichStyger 

I asked because (according to the subject) the thread was comparing floating point division to 64-bit integer division.  

I'm curious to see what the comments were.  


1,810 Views
danielchen
NXP TechSupport

The original post is:

[Image: danielchen_0-1629934831169.png]

 

Regards

Daniel


1,805 Views
myke_predko
Senior Contributor III

Thank you @danielchen 

These questions are interesting because they're multi-dimensional and dependent on how much control you want over the process.  

I just did some research and the answer isn't obvious - I would probably recommend building a test application and trying out different methods to find out which method is fastest as well as the most accurate (see below).  

The approach *I* would try, after characterizing (timing) the two examples you listed, would be to break "scaleFactor" into high and low 32-bit parts, do the floating-point multiplication on each part and, finally, add the products together after they're converted from floats to 64-bit integers.  

float    scaleFactorHigh = (float)(scaleFactor >> 32);         /* upper 32 bits */
float    scaleFactorLow  = (float)(scaleFactor & 0xFFFFFFFF);  /* lower 32 bits */
float    abRatio         = (float)A / (float)B;                /* divide as floats */
uint64_t result = ((uint64_t)(scaleFactorHigh * abRatio) << 32) +
                  (uint64_t)(scaleFactorLow * abRatio);
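As a sketch only, the split idea can be wrapped in two C functions for testing. The function names are hypothetical, and A and B are assumed to be floats, since the original post survives only as an image. The first function converts the high product to an integer and then shifts it, which discards the fractional part of that product. The second is a variant (not from the post) that scales the high product by 2^32 while it is still a float. Both are limited by the 24-bit significand of a float, so the 32-bit halves may be rounded on conversion.

```c
#include <stdint.h>

/* Split approach with the shift applied after the float-to-integer
 * conversion. Any fraction in (high * ratio) is lost before the shift. */
static uint64_t scale_split_shift(uint64_t scaleFactor, float a, float b)
{
    float high  = (float)(scaleFactor >> 32);           /* upper 32 bits */
    float low   = (float)(scaleFactor & 0xFFFFFFFFu);   /* lower 32 bits */
    float ratio = a / b;
    return ((uint64_t)(high * ratio) << 32) + (uint64_t)(low * ratio);
}

/* Variant (assumption, not from the post): multiply the high product by
 * 2^32 in floating point so its fractional part survives the conversion.
 * The caller must ensure the result stays below 2^64. */
static uint64_t scale_split_scaled(uint64_t scaleFactor, float a, float b)
{
    float high  = (float)(scaleFactor >> 32);
    float low   = (float)(scaleFactor & 0xFFFFFFFFu);
    float ratio = a / b;
    return (uint64_t)(high * ratio * 4294967296.0f) + (uint64_t)(low * ratio);
}
```

With scaleFactor = (3 << 32) + 8 and a ratio of 0.5, the exact answer is 6442450948. The scaled variant returns that value, while the shift-after-convert form returns 4294967300 because 1.5 is truncated to 1 before the shift. Both variants should be timed and checked for accuracy against a reference before either is chosen.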

The big issue that I can see with this approach is which version of the M4 "VCVT" instruction (convert float to int) the compiler uses. Plain "VCVT" truncates the product while "VCVTR" rounds it (which is what you want in this case), so truncation could introduce an error.  I have made all types float until calculating "result" because there seems to be an inefficiency when multiplying floats and integers together.  
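At the C level, a cast from float to an integer type always truncates toward zero, so rounding has to be requested explicitly. The sketch below contrasts the two behaviours using hypothetical helper names. It assumes non-negative inputs below 2^64 and rounds halves upward, which is one of several possible rounding rules.

```c
#include <stdint.h>

/* Plain C cast: the fractional part is discarded (truncation toward zero). */
static uint64_t to_u64_truncate(float x)
{
    return (uint64_t)x;
}

/* Round to nearest, halves upward. Assumes 0 <= x < 2^64.
 * The truncated value converts back to float exactly here, because any float
 * of 2^23 or more is already an integer, so the remainder is computed
 * without additional rounding error. */
static uint64_t to_u64_round(float x)
{
    uint64_t truncated = (uint64_t)x;
    float    remainder = x - (float)truncated;
    return (remainder >= 0.5f) ? truncated + 1u : truncated;
}
```

The standard library functions lroundf and llroundf from <math.h> offer similar rounding but return signed types. Whichever form is used, the generated code is worth inspecting to see which conversion the compiler actually emits for the 64-bit case.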

If speed were of the absolute essence along with absolutely accurate values, I would write the above statements in assembler, making sure that the errors that occur in the conversion from integers to floats are minimized and that no cycle-costly instructions are used.  
