I am using the MIMXRT1170-EVK board and MCUXpresso IDE and trying to use the assembler instruction for SQRT with double precision on the Cortex-M7. I can get this working for single-precision using:
float arm_sqrt_f32(float x)
{
    float returnValue;
    __asm__("VSQRT.F32 %0, %1" : "=t" (returnValue) : "t" (x));
    return returnValue;
}
Is the VSQRT.F64 instruction available on the i.MX RT1170? When I use the __asm__ keyword:
double arm_sqrt_f64(double x)
{
    double returnValue;
    __asm__("VSQRT.F64 %0, %1" : "=w" (returnValue) : "w" (x));
    return returnValue;
}
I get the error message:
Error: invalid instruction shape -- `vsqrt.f64 s0,s0'
So the instruction seems to be available, but the asm keyword is not generating the right registers: the operands come out as single-precision registers (s0) instead of double-precision ones (d0). I found an old post about this, Bug #1856486 “constraint “w” produces access to single precissio...” : Bugs : GNU Arm Embedded Toolc..., where it is reported as a bug in GCC, but that was several years ago.
How can I enable VSQRT.F64?
I eventually got this working myself by defining my own C-callable assembler function.
But it would still be nice to be able to use GCC inline assembler instead of a separate .asm file, so if anyone knows how to do this, please post a reply.
Hi @magro732 ,
You can force the function to be inlined:
__attribute__((always_inline)) inline float arm_sqrt_f32(float x)
{
...
}
Regards,
Jing
Thanks for your reply, but I'm not having problems with the single-precision version; it is the double-precision version I have problems with.
Right now I have a custom assembler function for double precision that looks like this:
.global arm_sqrt_f64
.section .text
.type arm_sqrt_f64,%function
arm_sqrt_f64:
.fnstart
vsqrt.f64 d0, d0
bx lr
.fnend
But this function cannot be inlined, so I still don't get optimal performance.
Is there a way for me to define a double-precision sqrt that can be inlined?
Hi,
We had the same problem trying to use double-precision floats with the inline assembler for ARM, and your solution of using %P worked. Many thanks for your post. Where on Earth did you find out about %P? Try as I might, I can't find this wonderful secret documented anywhere.
Thanks! This works for me.
Do you know why the standard sqrt from math.h doesn't use this assembler instruction? The single-precision version, sqrtf, uses the assembler instruction, but the double-precision version does not.
Don't you NXP people think this is a flaw in the library support? If I had not stumbled upon this, I would just have used the math.h version of sqrt, which does not use the assembler instruction. And the CMSIS-DSP library does not include a 64-bit version of sqrt either.
Should it be necessary to write your own inline assembler to get full performance for 64-bit sqrt from the M7 processor?
Hi @magro732 ,
Yes, I agree with you. CMSIS-DSP is released by ARM, but if ARM doesn't add this feature, we can make a patch. I'll escalate your suggestion.
Regards,
Jing