If your compiler is 'forcing' the type 'uint32' for the input/output of that intrinsic function, then I simply suggest you declare the associated vars with a 'union' of float and uint32 (and int32?) and let the compiler 'look' at them the appropriate way in the appropriate places (var.fval, var.uval, var.sub.s32, var.sub.s16.lo, var.sub.s8.hh etc.).
typedef union {                 //Any of three ways to look at a 32-bit number
    float    fval;
    uint32_t uval;
    union {                     //Byte/Word/Dword, little-endian
        struct {
            uint16_t lo;
            int16_t  hi;        //Signed high word
        } s16;
        struct {
            uint8_t ll;
            uint8_t lm;
            uint8_t hm;
            int8_t  hh;         //Signed high byte
        } s8;
        int32_t s32;
    } sub;
} Multi_t;
For example:
Multi_t inval, outval;
inval.fval = 3.1415927f;
outval.uval = __builtin_bswap32(inval.uval); //For the IAR compiler, use __REV( );
The net result is outval.uval being 0xDB0F4940 (the byte-reversed 0x40490FDB).
You can also use this union to 'assemble' (or disassemble!) something, say from a 'byte peripheral' (big endian order in this case!)
outval.sub.s8.hh = GetSPI( ); //Or some such byte-fetch
outval.sub.s8.hm = GetSPI( );
outval.sub.s8.lm = GetSPI( );
outval.sub.s8.ll = GetSPI( );
Now outval.fval is your whole floating-point number (for instance), assembled using the CPU's direct byte instructions, with no << and &0xFF operations (and no reliance on the compiler to make THOSE efficient). Furthermore, if you move this code to a big-endian CPU, all you have to do is reverse the member order inside the s8 and s16 structs and the rest of your code will STILL produce correct results!