Is it aware of optimizations for the M4? It seems not to be.
Would you not expect
static inline int MULSHIFT32(register int x, register int y)
{
    return (((Word64)x*(Word64)y)>>32);
}
to be compiled to just a couple of op-codes, something like this?
SMULL R3, R2, R1, R0
MOVS R0, R2
Richard.
This is along the lines of my question as well.
I want to multiply two 32-bit numbers and get the top 32 bits of the result. How do I do that most efficiently from C?
Look in the CMSIS DSP libraries. I believe all you have to do is cast to a "long long" for the result and then bit-shift.
I have tried this and it doesn't seem to work.
int my_mult(int a, int b) { int c; c = ((long long)a*b) >> 32; return c;}
This function generates the following assembly:
my_mult:
1fff0238: push {r3-r4}
41 return c;
1fff023a: asr r12,r1,#asr #31
1fff023e: umull r2,r3,r0,r1
1fff0242: mla r3,r0,r12,r3
1fff0246: asrs r4,r0,#31
1fff0248: mla r3,r4,r1,r3
1fff024c: cpy r0,r3
42 }
No smull there. Furthermore, I don't understand why it is using an unsigned multiply (umull) when I'm using signed ints.
I'm not familiar with the CMSIS library. I'll try to figure that out.
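On the umull question: the listing reads like a generic 64-bit by 64-bit multiply rather than a mistake. The compiler appears to have sign-extended both operands to 64 bits and then computed only the low 64 bits of the product. In that scheme the low-by-low partial product is unsigned (hence umull), and the sign information enters through the two mla corrections. Below is a minimal C sketch of that reading. The function names mulhi_ref and mulhi_emulated are hypothetical, and I am assuming the garbled first asr line means "shift r1 right arithmetically by 31".

```c
#include <stdint.h>

/* Reference: widen to 64 bits, multiply, keep the top 32 bits.
 * Right-shifting a negative value is implementation-defined in C;
 * this assumes the usual arithmetic shift. */
static int32_t mulhi_ref(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * (int64_t)b) >> 32);
}

/* Step-by-step model of the listing above (umull plus two mla).
 * Register names in the comments refer to that listing. */
static int32_t mulhi_emulated(int32_t a, int32_t b)
{
    uint32_t a_lo = (uint32_t)a;                 /* r0 */
    uint32_t b_lo = (uint32_t)b;                 /* r1 */
    uint32_t a_hi = (a < 0) ? 0xFFFFFFFFu : 0u;  /* asrs r4, r0, #31 */
    uint32_t b_hi = (b < 0) ? 0xFFFFFFFFu : 0u;  /* asr r12, r1, #31 (assumed) */

    uint64_t lo_prod = (uint64_t)a_lo * b_lo;    /* umull r2, r3, r0, r1 */
    uint32_t hi = (uint32_t)(lo_prod >> 32);     /* r3 = high word of lo*lo */

    hi += a_lo * b_hi;                           /* mla r3, r0, r12, r3 */
    hi += a_hi * b_lo;                           /* mla r3, r4, r1, r3 */

    /* Converting back to signed assumes two's complement wraparound. */
    return (int32_t)hi;                          /* cpy r0, r3 */
}
```

If this reading is right, the result is correct but costs three multiplies where a single smull would do, so the compiler has simply not recognised that both 64-bit operands started out as 32-bit values. Comparing mulhi_emulated against mulhi_ref over a range of inputs is an easy way to check the model.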