Using native assembler instruction VSQRT.F64 instead of double sqrt(double x).

2,406 Views
magro732
Contributor I

I am using the MIMXRT1170-EVK board with MCUXpresso IDE and am trying to use the double-precision square-root assembler instruction on the Cortex-M7. I can get this working for single precision using:

float arm_sqrt_f32(float x)
{
  float returnValue;
  __asm__("VSQRT.F32 %0, %1" : "=t" (returnValue) : "t" (x));
  return returnValue;
}

Is the VSQRT.F64 instruction available on the i.MX RT1170? When I use the __asm__ keyword:

double arm_sqrt_f64(double x)
{
  double returnValue;
  __asm__("VSQRT.F64 %0, %1" : "=w" (returnValue) : "w" (x));
  return returnValue;
}

I get the error message:
Error: invalid instruction shape -- `vsqrt.f64 s0,s0'

So the instruction seems to be available, but the __asm__ keyword is emitting single-precision register names (s0) where VSQRT.F64 needs double-precision ones (d0). I found an old post about this in Bug #1856486 “constraint “w” produces access to single precissio...” : Bugs : GNU Arm Embedded Toolc..., where it is reported as a bug in GCC, but that was several years ago.

How can I enable VSQRT.F64?

0 Kudos
Reply
9 Replies
2,396 Views
magro732
Contributor I

I eventually got this working myself by defining my own C-callable assembler function.

But it would still be nice to be able to use GCC inline assembler instead of a separate .asm file, so if anyone knows how to do this, please post.

0 Kudos
Reply
2,389 Views
jingpan
NXP TechSupport

Hi @magro732 ,

You can force the function to be inlined:

__attribute__((always_inline)) inline float arm_sqrt_f32(float x)
{
...
}

 

Regards,

Jing

0 Kudos
Reply
2,377 Views
magro732
Contributor I

Thanks for your reply, but I'm not having problems with the single-precision version; it is the double-precision version I have problems with.

Right now I have written a custom assembler function for double precision that looks like this:

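  .global arm_sqrt_f64
  .section .text
  .type arm_sqrt_f64,%function

@ C prototype: double arm_sqrt_f64(double x);
@ Assumes the hard-float ABI: x arrives in d0 and the result is returned in d0.
arm_sqrt_f64:
  .fnstart
  vsqrt.f64 d0, d0    @ d0 = sqrt(d0)
  bx lr               @ return to caller
  .fnend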

But this function cannot be inlined, so I still don't get optimal performance.

Is there a way for me to define a double-precision sqrt that can be inlined?

0 Kudos
Reply
2,373 Views
jingpan
NXP TechSupport

Hi @magro732 ,

I modified your code; it now compiles without any problem.

[Screenshot: jingpan_0-1640154053790.png]
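The modified code was posted as a screenshot and is not reproduced in this text. Going by the later reply in this thread that credits the %P operand modifier, the change was probably along the lines of the sketch below. This is a reconstruction under that assumption, not the exact code from the screenshot: %P0 and %P1 ask GCC to print the operands as D registers (d0) rather than S registers (s0). The __ARM_FP guard and the #else branch are additions for illustration only, so that the snippet also builds on a host PC; the stand-in uses a plain Newton iteration and is not meant for target use.

```c
/* Sketch: inline double-precision square root for GCC on 32-bit Arm.
 * Assumption: the %P operand modifier prints a "w" operand as a D register. */
static inline double arm_sqrt_f64(double x)
{
#if defined(__arm__) && defined(__ARM_FP) && (__ARM_FP & 0x08)
  /* Target has a double-precision FPU (for example a Cortex-M7 built
   * with -mfpu=fpv5-d16), so VSQRT.F64 can be used directly. */
  double result;
  __asm__("vsqrt.f64 %P0, %P1" : "=w" (result) : "w" (x));
  return result;
#else
  /* Hypothetical host stand-in (not from the thread): Newton iteration,
   * intended for finite inputs only. */
  if (!(x > 0.0)) {
    /* Zero is returned unchanged; negative or NaN input gives NaN. */
    return (x == 0.0) ? x : __builtin_nan("");
  }
  double guess = (x > 1.0) ? x : 1.0;
  for (int i = 0; i < 1100; ++i) {
    double next = 0.5 * (guess + x / guess);
    if (next == guess) {
      break; /* converged to working precision */
    }
    guess = next;
  }
  return guess;
#endif
}
```

Because the function is static inline and the asm statement is not volatile, the compiler is free to inline it at each call site and to merge repeated calls with the same argument.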

 

Regards,

Jing

 

 

0 Kudos
Reply
1,121 Views
Ahlan
Contributor III

Hi,

We had the same problem trying to use double-precision floats with the inline assembler for ARM, and your solution of using %P worked. Many thanks for your post. Where on Earth did you find out about %P? Try as I might, I can't find this wonderful secret documented anywhere.

0 Kudos
Reply
2,346 Views
magro732
Contributor I

Thanks! This works for me.

Do you know why the standard sqrt included from math.h isn't using this assembler instruction? The single precision version sqrtf is using the assembler instruction but not the double version.
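One thing that may be worth checking in this situation is whether the compiler is targeting a double-precision FPU at all. With GCC, -mfpu=fpv5-sp-d16 selects a single-precision-only FPU and -mfpu=fpv5-d16 selects the double-precision one, and the ACLE macro __ARM_FP reports the result as a bitmask in which bit 3 (0x08) means hardware double precision. The helper name below is hypothetical, and whether this setting explains the library behaviour described here is an assumption, not something the thread confirms.

```c
/* Sketch: report whether this translation unit is being compiled for
 * hardware double-precision floating point, using the ACLE __ARM_FP
 * macro (bit 3, value 0x08, means double precision is supported).
 * has_hw_double_fp is a hypothetical helper name. */
static inline int has_hw_double_fp(void)
{
#if defined(__ARM_FP) && (__ARM_FP & 0x08)
  return 1;  /* double-precision FPU instructions can be generated */
#else
  return 0;  /* no FPU, single precision only, or not an Arm target */
#endif
}
```

If this returns 0 on the target build, double-precision operations are done in software regardless of which sqrt implementation is called.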

0 Kudos
Reply
2,339 Views
jingpan
NXP TechSupport

Hi @magro732 ,

Sorry, I can't find any related information.

 

Regards,

Jing

0 Kudos
Reply
2,334 Views
magro732
Contributor I

Don't you NXP-people think this is a flaw in the library support? If I had not stumbled upon this I would just have used the math.h version of SQRT which is not using the assembler instruction. And the CMSIS-DSP library does not include a 64-bit version of SQRT either.

Should it be necessary to write your own inline-assembler to get full performance for 64-bit SQRT from the M7 processor?

0 Kudos
Reply
2,326 Views
jingpan
NXP TechSupport

Hi @magro732 ,

Yes, I agree with you. CMSIS-DSP is released by ARM, but if ARM doesn't add this feature, we can make a patch. I'll escalate your suggestion.

 

Regards,

Jing

0 Kudos
Reply