Using native assembler instruction VSQRT.F64 instead of double sqrt(double x).


magro732
Contributor I

I am using the MIMXRT1170-EVK board and MCUXpresso IDE and trying to use the assembler instruction for SQRT with double precision on the Cortex-M7. I can get this working for single-precision using:

float arm_sqrt_f32(float x)
{
  float returnValue;
  __asm__("VSQRT.F32 %0, %1" : "=t" (returnValue) : "t" (x));
  return returnValue;
}

Is the VSQRT.F64 instruction available on the iMXRT1170? When I use the __asm__ keyword:

double arm_sqrt_f64(double x)
{
  double returnValue;
  __asm__("VSQRT.F64 %0, %1" : "=w" (returnValue) : "w" (x));
  return returnValue;
}

I get the error message:
Error: invalid instruction shape -- `vsqrt.f64 s0,s0'

So the instruction seems to be available, but the asm keyword is not generating the right registers: the operands come out as single-precision s registers (s0) instead of double-precision d registers. I found an old post about this, Bug #1856486 “constraint “w” produces access to single precissio...” : Bugs : GNU Arm Embedded Toolc..., where it is reported as a bug in GCC, but that was several years ago.

How can I enable VSQRT.F64?
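
Before touching the inline assembler, it may help to confirm that the build is targeting a double-precision FPU at all. The sketch below assumes an ACLE-conforming compiler (such as arm-none-eabi-gcc), which defines the macro __ARM_FP as a bitmask where bit 2 (0x4) means single precision and bit 3 (0x8) means double precision. The function name arm_fp_level is made up for illustration and is not part of any SDK.

```c
/* Hypothetical helper (name is an assumption, not from the thread or the SDK).
 * Reports which hardware floating-point precision the compiler is targeting,
 * based on the ACLE macro __ARM_FP.
 * Returns 2 for double precision, 1 for single precision only,
 * 0 if the macro is undefined or no supported precision bit is set. */
static int arm_fp_level(void)
{
#if defined(__ARM_FP) && (__ARM_FP & 0x8)
    return 2;   /* double-precision hardware FP is enabled for this build */
#elif defined(__ARM_FP) && (__ARM_FP & 0x4)
    return 1;   /* single-precision hardware FP only */
#else
    return 0;   /* no hardware FP reported (for example on a non-ARM host) */
#endif
}
```

With GCC options along the lines of -mcpu=cortex-m7 -mfpu=fpv5-d16 -mfloat-abi=hard this should return 2; with -mfpu=fpv5-sp-d16 it should return 1, in which case VSQRT.F64 would not be accepted regardless of how the inline assembler is written.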

9 Replies
magro732
Contributor I

I eventually got this working myself by defining my own C-callable assembler function.

But it would still be nice to be able to use GCC inline assembler instead of a separate .asm file, so if anyone knows how to do this, please post it here.

jingpan
NXP TechSupport

Hi @magro732 ,

You can force it as inline.

__attribute__((always_inline)) inline float arm_sqrt_f32(float x)
{
...
}

Regards,

Jing

magro732
Contributor I

Thanks for your reply, but I'm not having problems with the single-precision version; it is the double-precision version I have problems with.

For now I have written a custom assembler function for double precision that looks like this:

  .global arm_sqrt_f64
  .section .text
  .type arm_sqrt_f64,%function

arm_sqrt_f64:
  .fnstart
  vsqrt.f64 d0, d0
  bx lr
  .fnend

But this function cannot be inlined so I still don't get optimal performance.

Is there a way for me to define a double-precision sqrt that can be inlined?

jingpan
NXP TechSupport

Hi @magro732 ,

I modified your code; it now compiles without any problem.

[Screenshot: jingpan_0-1640154053790.png]

 

Regards,

Jing
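
The screenshot is not reproduced in this text. Judging from a later reply in this thread that credits the %P operand modifier, the modified function probably resembled the sketch below. This is an assumption, not a copy of the screenshot. In GCC's ARM back end, %P asks for the VFP operand to be printed as a double-precision register (d0, d1, ...) rather than as s0, s1, ..., which is what the assembler rejected in the original version. The non-ARM branch is only there so the sketch builds and runs on a host PC; it is not part of the thread.

```c
/* Sketch only: inline double-precision square root using VSQRT.F64.
 * The "w" constraint (a VFP register) is the one from the original question;
 * the %P modifier makes GCC emit the d-register name for the operand. */
static inline double arm_sqrt_f64(double x)
{
#if defined(__arm__) && defined(__ARM_FP) && (__ARM_FP & 0x8)
    double result;
    __asm__("vsqrt.f64 %P0, %P1" : "=w" (result) : "w" (x));
    return result;
#else
    /* Host fallback (an assumption for illustration): Newton iteration.
     * Returns 0 for x <= 0 and is only adequate for moderately sized x. */
    if (x <= 0.0) {
        return 0.0;
    }
    double r = (x > 1.0) ? x : 1.0;   /* starting guess at or above sqrt(x) */
    for (int i = 0; i < 64; ++i) {
        r = 0.5 * (r + x / r);        /* r converges towards sqrt(x) */
    }
    return r;
#endif
}
```

Because the function is static inline and the asm statement has no side effects beyond its operands, the compiler is free to inline it at each call site, which is the property the separate assembler file could not offer.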

 

 

Ahlan
Contributor III

Hi,

We had the same problem trying to use double-precision floats with the inline assembler for ARM, and your solution of using %P worked. Many thanks for your post. Where on Earth did you find out about %P? Try as I might, I can't find this wonderful secret documented anywhere.

magro732
Contributor I

Thanks! This works for me.

Do you know why the standard sqrt included from math.h isn't using this assembler instruction? The single precision version sqrtf is using the assembler instruction but not the double version.

jingpan
NXP TechSupport

Hi @magro732 ,

Sorry, I can't find any related information.

 

Regards,

Jing

magro732
Contributor I

Don't you NXP people think this is a flaw in the library support? If I had not stumbled upon this, I would simply have used the math.h version of sqrt, which does not use the assembler instruction. And the CMSIS-DSP library does not include a 64-bit version of sqrt either.

Should it really be necessary to write your own inline assembler to get full performance for 64-bit sqrt on the M7 processor?

jingpan
NXP TechSupport

Hi @magro732 ,

Yes, I agree with you. CMSIS-DSP is released by ARM, but if ARM doesn't add this feature, we can make a patch. I'll escalate your suggestion.

 

Regards,

Jing
