Unsigned Long to Double - value is truncated?

1,237 Views
jscott
Contributor II

I am trying to read a double-precision floating point number from Flash into RAM.  It is correctly stored in flash as:

 

0x3e3ce7b697e010B6 = 6.7299999999999997139836383094E-9

 

Since Processor Expert provides methods for reading (at most) Long data types, I am loading 4 bytes at a time using:

     double myDbl;

     unsigned long temp0, temp1;

     IFsh1_TAddress Addr = FLASH_LOC;                 //this is where the 8-byte double is correctly stored as 0x3e3ce7b6 97e010B6

     IFsh1_GetLongFlash(Addr, &temp0);                    //here temp0 is correctly populated with 0x3E3CE7B6

     IFsh1_GetLongFlash(Addr+4, &temp1);                //here temp1 is correctly populated with 0x97E010B6

     myDbl = ((unsigned long long)(temp0)<<32) | (unsigned long long)(temp1);      //adding/ORing produces the same result

 

But no matter how I cast or combine these variables in RAM, the last byte of the fractional part of myDbl is truncated.  I wind up with:

 

myDbl = 0x3e3ce7b697e01000 = 4.4847141003722793e+018 (according to the debugger)

 

I can manually edit the fraction in the debugger, but it won't let me change "00" to "B6".  It's almost as if it thinks that would be out of range for a double, which it's not.  I've tried casting temp0, temp1, and their combination as long, ulong, long long, and unsigned long long.  I've tried adding, ORing, and anything else I could think of.

 

The result is the same even if I hard code it as myDbl = (double)(0x3E3CE7B697E010B6);

 

This is the FRDM-KL02Z board, and I'm using KDS 2.0.0 with the Cross ARM GNU Assembler and Cross ARM C Compiler and GDB PEMicro (OpenSDA) debugger.  Optimizations are set to none and Endianness is Little.

 

Casting issue?  Compiler issue?  Can anyone help???

 

Thank you in advance,

Jonathan

7 Replies

877 Views
Carlos_Mendoza
NXP Employee

Hi Jonathan,

Can you share your project so we can test it on our side?

Best Regards,

Carlos Mendoza

Technical Support Engineer

0 Kudos

877 Views
jscott
Contributor II

Hello Carlos,

Did you receive my project? Have you had a chance to try it out?

Thanks,

Jonathan

0 Kudos

877 Views
Carlos_Mendoza
NXP Employee

Hi Jonathan,

Yes, I received your project and I'm already reviewing it. I will come back to you asap.

Best Regards,

Carlos Mendoza

Technical Support Engineer

0 Kudos

877 Views
Carlos_Mendoza
NXP Employee

Hi Jonathan,

Apologies for the delay. The loss of precision you see in your double variable happens because the assignment converts the numeric value of your 64-bit integer to a double instead of copying its bit pattern, and not all 64 bits of the integer fit in the significand field (52 bits) of a double-precision floating-point variable.
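
To make the difference concrete, here is a minimal sketch in standard C. The helper names convert_value and reinterpret_bits are made up for illustration and are not part of Processor Expert, and the sketch assumes that double is the 64-bit IEEE 754 binary64 type and shares its byte order with 64-bit integers. The first helper does what an assignment or cast does; the second copies the bit pattern unchanged:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Value conversion: the integer's numeric value is rounded to the
 * nearest representable double (53 significant bits). */
double convert_value(uint64_t bits)
{
    return (double)bits;
}

/* Bit reinterpretation: the 64-bit pattern is copied unchanged into
 * the storage of a double. Assumes sizeof(double) == 8. */
double reinterpret_bits(uint64_t bits)
{
    double d;

    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

With the pattern 0x3E3CE7B697E010B6, convert_value should give about 4.48e+18, whose integer value is 0x3E3CE7B697E01000 (the low bits are rounded away), while reinterpret_bits should give about 6.73e-9.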

The representations of floating-point variables are encoded in the following three fields:

[Image pastedImage_0.png: layout of the three fields (sign, biased exponent, trailing significand)]

The values of k, p, t, w, and bias for binary interchange formats are listed on the following table:

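[Image pastedImage_1.png: table of k, p, t, w, and bias for the binary interchange formats; for binary64, k = 64, p = 53, t = 52, w = 11, bias = 1023]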

You can find more information on the following link:

http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=4610935

Please let me know if this helps or if you have any question.

Best Regards,

Carlos Mendoza

Technical Support Engineer

0 Kudos

877 Views
jscott
Contributor II

Hi Carlos,

Thank you for the information, but I’m afraid I still don’t see the problem.

In the example I gave with 0x3E 3C E7 B6 97 E0 10 B6, the binary representation is 64 bits long:

00111110 00111100 11100111 10110110 10010111 11100000 00010000 10110110

Where,

The sign is 1 bit long:

0

The exponent (w) is 11 bits long (995 decimal; with the bias of 1023 this is an unbiased exponent of -28, well within range):

01111100011

And the significand (t) is indeed 52 bits long:

1100111001111011011010010111111000000001000010110110
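
The same decomposition can be done in code. A small sketch, assuming the 64-bit pattern is already held in a uint64_t (the struct and function names are hypothetical):

```c
#include <stdint.h>

/* The three fields of an IEEE 754 binary64 encoding. */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11-bit biased exponent (bias 1023) */
    uint64_t fraction;  /* 52-bit trailing significand */
} Binary64Fields;

/* Split a raw 64-bit pattern into sign, biased exponent and fraction. */
Binary64Fields split_binary64(uint64_t bits)
{
    Binary64Fields f;

    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FFu);
    f.fraction = bits & ((UINT64_C(1) << 52) - 1);
    return f;
}
```

For 0x3E3CE7B697E010B6 this yields sign 0, exponent 995 and fraction 0xCE7B697E010B6, matching the fields above.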

As far as I can tell, this is a completely valid double precision value according to the references you provided. An online converter here confirms:

http://www.binaryconvert.com/result_double.html?hexadecimal=3E3CE7B697E010B6

Interestingly enough, if you plug the truncated value 0x3e3ce7b697e01000 into this converter, it generates a number very close to the 6.72999…E-9 shown above. But the KDS debugger (GDB PEMicro Interface) shows 4.4847141003722793e+018. You should be able to replicate this very easily with the project I provided.

If, as it appears, the fields are indeed the correct length and within range, why does the debugger show such a different value?

Thank you,

Jonathan

877 Views
jscott
Contributor II

I found the problem, and it had more to do with endianness than anything else.  If I store the double in flash as [b4..b7][b0..b3], then just do an 8-byte memcpy to RAM, everything works.
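
For reference, a minimal sketch of rebuilding the double from the two 32-bit words without depending on how the bytes are laid out in flash. It assumes temp0 holds the most significant word and temp1 the least significant word, as in the original post, and that double is a 64-bit IEEE 754 value whose byte order matches that of 64-bit integers on the target. The function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Rebuild a double from its two 32-bit halves.
 * hi: word holding sign, exponent and top of the fraction (e.g. 0x3E3CE7B6)
 * lo: word holding the low 32 fraction bits (e.g. 0x97E010B6) */
double double_from_words(uint32_t hi, uint32_t lo)
{
    uint64_t bits = ((uint64_t)hi << 32) | (uint64_t)lo;
    double d;

    assert(sizeof d == sizeof bits);
    memcpy(&d, &bits, sizeof d);   /* copy the bits, no value conversion */
    return d;
}

/* Rebuild a double from 8 bytes stored least significant byte first,
 * which is the in-memory order of a double on a little-endian target. */
double double_from_le_bytes(const uint8_t b[8])
{
    uint32_t lo = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
                  ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    uint32_t hi = (uint32_t)b[4] | ((uint32_t)b[5] << 8) |
                  ((uint32_t)b[6] << 16) | ((uint32_t)b[7] << 24);

    return double_from_words(hi, lo);
}
```

Calling double_from_words(temp0, temp1) with the values from the original post should give about 6.73e-9.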

877 Views
jscott
Contributor II

Hi Carlos,

Sure, I can share it with Freescale.  Please see the attached.  The brute force attempt to convert a value is on line 62 of init.c.

Thank you,

Jonathan

0 Kudos