GCC compiler in CW10.1 optimized for M4?

tmf · ‎10-14-2011

Is it aware of optimizations for the M4? It seems not to be.

Would you not expect

static inline int MULSHIFT32(register int x, register int y)
{
    return (((Word64)x*(Word64)y)>>32);
}

to be compiled to a couple of op-codes, such as the following?

SMULL R3, R2, R1, R0
MOVS R0, R2

Richard.

MikeAdv · ‎12-03-2011

This is along the lines of my question as well.

I want to multiply two 32-bit numbers and get the top 32 bits of the result. How do I do that most efficiently from C?

eli_hughes · ‎12-03-2011

Look in the CMSIS DSP libraries. I believe all you have to do is cast to a "long long" for the result and bit shift.
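Something along these lines is what I mean. This is only a minimal sketch: the name mul_hi32 is made up for illustration and is not a CMSIS function, and whether it turns into a single SMULL depends on the compiler and the flags you build with.

```c
#include <stdint.h>

/* Hypothetical helper (not part of CMSIS): returns the upper 32 bits
 * of the signed 64-bit product x * y. */
static inline int32_t mul_hi32(int32_t x, int32_t y)
{
    /* Widen one operand first so the multiply is carried out in 64 bits. */
    int64_t product = (int64_t)x * y;

    /* Right-shifting a negative signed value is implementation-defined
     * in C; GCC performs an arithmetic (sign-preserving) shift. */
    return (int32_t)(product >> 32);
}
```

For example, mul_hi32(0x40000000, 4) gives 1, because the full product is exactly 2^32.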

MikeAdv · ‎12-03-2011

I have tried this and it doesn't seem to work.

int my_mult(int a, int b)
{
    int c;
    c = ((long long)a*b) >> 32;
    return c;
}

This function generates the following assembly:

          my_mult:
1fff0238:   push {r3-r4}
41         return c;
1fff023a:   asr r12,r1,#asr #31
1fff023e:   umull r2,r3,r0,r1
1fff0242:   mla r3,r0,r12,r3
1fff0246:   asrs r4,r0,#31
1fff0248:   mla r3,r4,r1,r3
1fff024c:   cpy r0,r3
42        }

No smull there. Furthermore, I don't understand why it is using an unsigned multiply (umull) when I'm using signed ints.

I'm not familiar with the CMSIS library. I'll try to figure that out.
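On the umull question, one plausible reading of the listing is that the compiler is not recognizing the 32x32->64 pattern. Instead it seems to sign-extend both operands to 64 bits and run a generic 64x64 multiply: one unsigned multiply of the low words, plus two mla corrections that use the sign words (0 or 0xFFFFFFFF). The sketch below mirrors that sequence in C so it can be compared with the direct cast-and-shift version. The function names are my own, and the mapping of statements to instructions is my interpretation of the disassembly, not something the tools confirm.

```c
#include <stdint.h>

/* Direct form: widen, multiply, keep the top 32 bits. */
static int32_t mul_hi32_direct(int32_t a, int32_t b)
{
    return (int32_t)(((int64_t)a * b) >> 32);
}

/* Expanded form, following the umull + mla + mla sequence in the listing. */
static int32_t mul_hi32_expanded(int32_t a, int32_t b)
{
    uint32_t a_lo = (uint32_t)a;
    uint32_t b_lo = (uint32_t)b;

    /* High words of the sign-extended operands (the asr #31 results). */
    uint32_t a_hi = (a < 0) ? 0xFFFFFFFFu : 0u;
    uint32_t b_hi = (b < 0) ? 0xFFFFFFFFu : 0u;

    /* umull: unsigned 32x32->64 product of the low words. */
    uint64_t low_product = (uint64_t)a_lo * b_lo;
    uint32_t hi = (uint32_t)(low_product >> 32);

    /* Two mla steps: add the cross terms, wrapping modulo 2^32. */
    hi += a_lo * b_hi;
    hi += a_hi * b_lo;

    /* Converting back to signed relies on two's-complement wraparound,
     * which is what GCC does. */
    return (int32_t)hi;
}
```

Both forms should return the same value for every input, so the result is still a correct signed high word. It just takes four instructions plus a register spill instead of one smull.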