CW code generation

badaboom · ‎02-26-2009

Hi,

I've got a problem with the code generated by CW 7.1 build 14.

Some settings:

__fourbyteints__ 0 (defined 0, by clearing ' 4-byte integers' checkbox on the setting panel )
MCF52235

Very simple clearing of a bit:

Enable int. by clearing corresp. bit mask:

MCF_INTC0_IMRL &= ~(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL );

Used definitions:

#define MCF_INTC0_IMRL (*(vuint32*)(&__IPSBAR[0xC0C]))

#define MCF_INTC_IMRL_INT_MASK15 (0x8000)

#define MCF_INTC_IMRL_MASKALL (0x1)

MCF_INTC0_IMRL &= ~ (MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL );

Generated code:

; 398: MCF_INTC0_IMRL &= ~(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL ); 
;
0x000000D4  0x41F900000000           lea      ___IPSBAR+3084,a0
0x000000DA  0x203C00007FFE           move.l   #32766,d0             ; '....'
0x000000E0  0xC190                   and.l    d0,(a0)

This also clears bits 31..16, and THAT was not intended. It seems that the compiler generated a mask only 16 bits wide.

When you clear, for example, bit 14, there is no problem.

MCF_INTC0_IMRL &= ~ (MCF_INTC_IMRL_INT_MASK14 | MCF_INTC_IMRL_MASKALL );

;397: MCF_INTC0_IMRL &= ~ (MCF_INTC_IMRL_INT_MASK14 | MCF_INTC_IMRL_MASKALL ); 
;
0x000000C6  0x41F900000000           lea      ___IPSBAR+3084,a0
0x000000CC  0x203CFFFFBFFE           move.l   #-16386,d0            ; '....'
0x000000D2  0xC190                   and.l    d0,(a0)
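
The difference between the bit 15 and bit 14 listings is what C's constant-typing rules predict when int is 16 bits wide: 0x8000 does not fit a signed 16-bit int and so becomes unsigned int, while 0x4000 stays a signed int. A minimal sketch, emulating the 16-bit arithmetic with fixed-width types so that it runs on any host (the function names are made up for illustration):

```c
#include <stdint.h>

/* Emulates the target's 16-bit 'int' by doing the arithmetic in explicit
   16-bit types. Function names are illustrative, not from any header. */

/* 0x8000 is unsigned int when int is 16 bits. The complement is 0x7FFE,
   and widening an unsigned value to 32 bits zero-extends it. */
uint32_t mask_for_bit15_with_16bit_int(void)
{
    uint16_t narrow = (uint16_t)~(uint16_t)(0x8000u | 0x1u);
    return (uint32_t)narrow;
}

/* 0x4000 is signed int when int is 16 bits. The complement is -16386,
   and widening a signed value to 32 bits sign-extends it. */
uint32_t mask_for_bit14_with_16bit_int(void)
{
    int16_t narrow = (int16_t)~(0x4000 | 0x1);
    return (uint32_t)(int32_t)narrow;
}
```

The first function yields 0x00007FFE (32766) and the second 0xFFFFBFFE (-16386), the same immediates that appear in the two listings.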

Also when you cast to a uint32 the code is correct.

MCF_INTC0_IMRL &= ~ (uint32)(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL );

; 399: MCF_INTC0_IMRL &= ~(uint32)(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL );
;
0x000000E2  0x41F900000000           lea      ___IPSBAR+3084,a0
0x000000E8  0x203CFFFF7FFE           move.l   #-32770,d0            ; '....'
0x000000EE  0xC190                   and.l    d0,(a0)
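
The cast helps because it widens the operand before the complement is taken, so all 32 bits take part in the ~. A small sketch using the mask values from the post; uint32_t from <stdint.h> stands in for the project's uint32 typedef, and the function name is hypothetical:

```c
#include <stdint.h>

/* Mask values copied from the post. */
#define MCF_INTC_IMRL_INT_MASK15 (0x8000)
#define MCF_INTC_IMRL_MASKALL    (0x1)

/* Widen first, then complement. The result does not depend on the width
   of 'int' because the '~' operates on a 32-bit unsigned operand. */
uint32_t clear_mask_widened_first(void)
{
    return ~(uint32_t)(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL);
}
```

This returns 0xFFFF7FFE, which is the -32770 immediate in the listing for line 399.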

Also, when you try to clear bit 31, the compiler uses 64-bit arithmetic??

;400: MCF_INTC0_IMRL &= ~ (MCF_INTC_IMRL_INT_MASK31 | MCF_INTC_IMRL_MASKALL ); 
;
0x000000F0  0x4DF900000000           lea      ___IPSBAR+3084,a6
0x000000F6  0x2F560004               move.l   (a6),4(a7)
0x000000FA  0x41EF0014               lea      20(a7),a0
0x000000FE  0x2E88                   move.l   a0,(a7)
0x00000100  0x4EB900000000           jsr      ___rt_ultoi64
0x00000106  0x72FF                   moveq    #-1,d1
0x00000108  0x203C7FFFFFFE           move.l   #2147483646,d0        ; '....'
0x0000010E  0x2F400010               move.l   d0,16(a7)
0x00000112  0x2F41000C               move.l   d1,12(a7)
0x00000116  0x2010                   move.l   (a0),d0
0x00000118  0x2F6800040008           move.l   4(a0),8(a7)
0x0000011E  0x2F400004               move.l   d0,4(a7)
0x00000122  0x41EF001C               lea      28(a7),a0
0x00000126  0x2E88                   move.l   a0,(a7)
0x00000128  0x4EB900000000           jsr      ___rt_and64
0x0000012E  0x2CA80004               move.l   4(a0),(a6)
;
;  401:  MCF_INTC0_IMRL &= ~ (uint32)(MCF_INTC_IMRL_INT_MASK31 | MCF_INTC_IMRL_MASKALL ); 
;
0x00000132  0x41F900000000           lea      ___IPSBAR+3084,a0
0x00000138  0x203C7FFFFFFE           move.l   #2147483646,d0        ; '....'
0x0000013E  0xC190                   and.l    d0,(a0)
;
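
Reading the listing for line 400, the 64-bit constant appears to be 0xFFFFFFFF7FFFFFFE (high word from moveq #-1, low word 2147483646), and only the low 32 bits of the result are written back to the register. If that reading is right, the long sequence stores the same value as the short one, just more slowly. A small check of that equivalence, assuming MCF_INTC_IMRL_INT_MASK31 is 0x80000000 (its definition is not shown in the post; all names below are illustrative):

```c
#include <stdint.h>

/* Assumed value; the post does not show this definition. */
#define ASSUMED_INT_MASK31 UINT32_C(0x80000000)
#define ASSUMED_MASKALL    UINT32_C(0x1)

/* The short form from listing line 401: a plain 32-bit AND. */
uint32_t clear_bits_32(uint32_t reg)
{
    return reg & ~(ASSUMED_INT_MASK31 | ASSUMED_MASKALL);
}

/* What the line-400 listing appears to do: widen the register value to
   64 bits, AND with 0xFFFFFFFF7FFFFFFE, keep only the low 32 bits. */
uint32_t clear_bits_via_64(uint32_t reg)
{
    uint64_t wide = (uint64_t)reg & UINT64_C(0xFFFFFFFF7FFFFFFE);
    return (uint32_t)wide;
}
```

Both functions clear bits 31 and 0 and leave the other bits untouched.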

But when you: #define __fourbyteints__ 1

then the generated code is ok.

My question is: how can I avoid this code generation error without having to cast explicitly when there seems to be no need to do so?
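
One possible way to avoid a cast at every use, sketched under the assumption that the mask definitions can be changed or shadowed locally, is to give the constants a UL suffix. unsigned long is at least 32 bits wide whatever the size of int, so the complement is taken at full width. The macro and function names below are hypothetical, not from the Freescale headers:

```c
#include <stdint.h>

/* Hypothetical project-local variants of the masks with a UL suffix. */
#define IMRL_INT_MASK15_UL (0x8000UL)
#define IMRL_MASKALL_UL    (0x1UL)

/* Applies the same '&= ~(...)' statement to a copy of the register value.
   No cast is needed at the point of use. */
uint32_t imrl_after_unmask15(uint32_t imrl)
{
    imrl &= ~(IMRL_INT_MASK15_UL | IMRL_MASKALL_UL);
    return imrl;
}
```

With an input of 0xFFFFFFFF the result is 0xFFFF7FFE: only bits 15 and 0 are cleared.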

thanks,

René

jbezem · ‎03-10-2009

I can see no immediate problem with the code you provided. The instruction you highlighted might very well result from some form of optimization.

- the variable 'res' (implicit 'int', like "volatile int res;"?) is not used in the snippet;

- the variable 'state' is set to '2', after which the if condition is unconditionally true, and can be optimized-away.

Sadly, you do not list the rest of the code. Could you provide some more context, and state, why you think the code is wrong? And list the options you compiled with (optimization, target processor, size of an integer)?

FWIW,

Johan

homeness · ‎03-10-2009

The whole code:

unsigned char state;
volatile res;
int main(void)
{
state = 2;
if ( state == 2 || state == 3 )
{
res = 1;
}
else
{
res = 0;
}
return res;
}

; 6: if ( state == 2 || state == 3 )
; 7: {
;
0x0000000C 0x7200 moveq #0,d1
0x0000000E 0x123900000000 move.b _state,d1
0x00000014 0x0681000000FE addi.l #254,d1 ; '....'
0x0000001A 0x7001 moveq #1,d0
0x0000001C 0xB280 cmp.l d0,d1
0x0000001E 0x6208 bhi.s *+10

The code here should do the following:

1. Get state value;

2. Subtract 2;

3. Compare with 1;

4. Branch if zero;

Instead of this the code does the following:

1. Get state value;

2. Add 254 (the result is 256 or 257...)

3. Compare with 1 (!);

4. Branch always...

So the compiler should use FFFFFFFE instead of FE (for chars) or FFFE (for shorts)...
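
The steps above can be sketched in C. The compiler turns the two comparisons into a single range check; adding 254 only works if the sum is truncated back to 8 bits before the compare. The function names are hypothetical and the third function models the listing as posted, not any documented compiler behaviour:

```c
#include <stdint.h>

/* Reference: the condition exactly as written in the source. */
int in_range_reference(uint8_t state)
{
    return state == 2 || state == 3;
}

/* Range-check form with the sum truncated to 8 bits before the compare. */
int in_range_add254_truncated(uint8_t state)
{
    uint32_t t = ((uint32_t)state + 254u) & 0xFFu;
    return t <= 1u;
}

/* Same form without the truncation: addi.l #254 followed directly by the
   compare. The sum is always at least 254, so the test can never pass. */
int in_range_add254_untruncated(uint8_t state)
{
    uint32_t t = (uint32_t)state + 254u;
    return t <= 1u;
}
```

For state = 2 the truncated form computes 256 & 0xFF = 0 and accepts, while the untruncated form compares 256 against 1 and rejects.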

My environment:

1. Target processor - 5282;

2. Optimization level 1 or above;

jbezem · ‎03-10-2009

I cannot reproduce your results. When using no optimization, each step in the C code is visible in assembly, also the check for '3'. When using '-O2', '2' is stored in 'state', and '1' in 'res', return value read from 'res'. When using '-O1' I get this:

0x00000000                    _main:
;                             main:
0x00000000  0x4E560000               link     a6,#0
0x00000004  0x7002                   moveq    #2,d0
0x00000006  0x13C000000000           move.b   d0,_state
0x0000000C  0x103900000000           move.b   _state,d0
0x00000012  0x49C0                   extb.l   d0
0x00000014  0x2040                   movea.l  d0,a0
0x00000016  0x41E800FE               lea      254(a0),a0
0x0000001A  0x7200                   moveq    #0,d1
0x0000001C  0x3208                   move.w   a0,d1
0x0000001E  0x0281000000FF           andi.l   #0xff,d1              ; '....'
0x00000024  0x7001                   moveq    #1,d0
0x00000026  0xB280                   cmp.l    d0,d1
0x00000028  0x6208                   bhi.s    *+10                  ; 0x00000032
0x0000002A  0x23C000000000           move.l   d0,_res
0x00000030  0x6006                   bra.s    *+8                   ; 0x00000038
0x00000032  0x42B900000000           clr.l    _res
0x00000038  0x203900000000           move.l   _res,d0
0x0000003E  0x4E5E                   unlk     a6
0x00000040  0x4E75                   rts

which looks OK to me. Did you really try optimization level 2? I get this:

0x00000000                    _main:
;                             main:
0x00000000  0x4E560000               link     a6,#0
0x00000004  0x7002                   moveq    #2,d0
0x00000006  0x13C000000000           move.b   d0,_state
0x0000000C  0x7001                   moveq    #1,d0
0x0000000E  0x23C000000000           move.l   d0,_res
0x00000014  0x203900000000           move.l   _res,d0
0x0000001A  0x4E5E                   unlk     a6
0x0000001C  0x4E75                   rts

Please try and elaborate more on the exact options used. I got these results using the command line compiler using: "Command_Line_Tools\mwccmcf.exe" -dis -O2 test.c

(or '-O1' or '-O0', as appropriate).

FWIW,

Johan

homeness · ‎03-10-2009

Can you please try the example in the attachment?

Thanks,

Peter

Project_1.zip

Message Edited by t.dowe on 2009-10-15 10:36 AM

jbezem · ‎03-10-2009

OK, after I saw that main.c contained the exact source you presented, I loaded the project, right-clicked on main.c and recompiled, then right-clicked again and disassembled.

(Before that, I just disassembled and saw your code as presented.)

It came out just like I said in a previous message.

So maybe all you have to do is recompile unconditionally, and then disassemble again. There may be some remains from a previous compilation(?) using different options or processor type, different source code, whatever...

HTH,

Johan

jbezem · ‎03-10-2009

Sorry, but this is not a small example anymore. And you didn't even tell me what I should look for.

My CW installation is at work, and being active in the forum is one thing, but testing a full application during working hours is another.

And please be aware that your ZIP file is free for everyone to download...

Something much smaller would be welcome.

BR,

Johan

homeness · ‎03-10-2009

OK, I made a smaller demo project for the bug.

Thanks,

Peter

cw7bug.zip

Message Edited by t.dowe on 2009-10-15 10:37 AM

jbezem · ‎03-10-2009

OK, the relevant portion of the smaller sample comes out like this:

;   15:  if ( uchar == 2 || uchar == 3 ) // incorrect code
;   16:   {
;
0x00000022  0x103900000000           move.b   _uchar,d0
0x00000028  0x49C0                   extb.l   d0
0x0000002A  0x2040                   movea.l  d0,a0
0x0000002C  0x41E800FE               lea      254(a0),a0
0x00000030  0x7200                   moveq    #0,d1
0x00000032  0x3208                   move.w   a0,d1
0x00000034  0x0281000000FF           andi.l   #0xff,d1              ; '....'
0x0000003A  0x7001                   moveq    #1,d0
0x0000003C  0xB280                   cmp.l    d0,d1
0x0000003E  0x632C                   bls.s    *+46                  ; 0x0000006c
;
;   19:  if ( ushort == 2 || ushort == 3 ) // incorrect code
;   20:   {
;
0x00000040  0x303900000000           move.w   _ushort,d0
0x00000046  0x06800000FFFE           addi.l   #65534,d0             ; '....'
0x0000004C  0x7200                   moveq    #0,d1
0x0000004E  0x3200                   move.w   d0,d1
0x00000050  0x7001                   moveq    #1,d0
0x00000052  0xB280                   cmp.l    d0,d1
0x00000054  0x6320                   bls.s    *+34                  ; 0x00000076
;
;   23:  if ( uint == 2 || uint == 3 ) // ok
;   24:   {
;
0x00000056  0x207900000000           movea.l  _uint,a0
0x0000005C  0x5588                   subq.l   #2,a0
0x0000005E  0x7001                   moveq    #1,d0
0x00000060  0xB1C0                   cmpa.l   d0,a0
0x00000062  0x631C                   bls.s    *+30                  ; 0x00000080
;

[Beware: Different source code than in the previous messages!]

All three look good to me. What's your problem with this translation?

FWIW,

Johan

CompilerGuru · ‎03-10-2009

Which exact compiler version is causing the bug?

I saw a 7.1.1 update with the CW Updater, I wonder if it does contain a compiler update.

Daniel

homeness · ‎03-10-2009

Unfortunately, neither 7.1.1 nor 7.1.2 (our company has it) resolves this problem...

CompilerGuru · ‎03-10-2009

I did compile it with and without the update, but I get the same code both times.

And as far as I see it looks correct (could be optimized more though).

I get for the byte case:

0x00000022  0x103900000000           move.b   _uchar,d0
0x00000028  0x49C0                   extb.l   d0
0x0000002A  0x2040                   movea.l  d0,a0
0x0000002C  0x41E800FE               lea      254(a0),a0
0x00000030  0x7200                   moveq    #0,d1
0x00000032  0x3208                   move.w   a0,d1
0x00000034  0x0281000000FF           andi.l   #0xff,d1              ; '....'
0x0000003A  0x7001                   moveq    #1,d0
0x0000003C  0xB280                   cmp.l    d0,d1
0x0000003E  0x632C                   bls.s    *+46                  ; 0x0000006c

Note the andi.l, which cuts the result of "uchar + 254" back to a char. That cut is not present in the initial snippet (so the initial snippet may show a bug), but here it is. Note that adding 254 and subtracting 2 give the same result as long as the result is cut back to the right type. With the subtraction, the cut could be optimized away.
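
The equivalence of "add 254" and "subtract 2" after truncation is ordinary modular arithmetic, and it can be checked exhaustively. A minimal sketch with illustrative function names, covering the 8-bit case and the 16-bit case (where the constant in the listings is 65534):

```c
#include <stdint.h>

/* Returns 1 if (x + 254) and (x - 2) agree modulo 256 for every 8-bit x. */
int add254_equals_sub2_mod_256(void)
{
    for (uint32_t x = 0; x <= 0xFFu; ++x) {
        uint8_t added      = (uint8_t)(x + 254u);
        uint8_t subtracted = (uint8_t)(x - 2u);
        if (added != subtracted) {
            return 0;
        }
    }
    return 1;
}

/* Returns 1 if (x + 65534) and (x - 2) agree modulo 65536 for every
   16-bit x. */
int add65534_equals_sub2_mod_65536(void)
{
    for (uint32_t x = 0; x <= 0xFFFFu; ++x) {
        uint16_t added      = (uint16_t)(x + 65534u);
        uint16_t subtracted = (uint16_t)(x - 2u);
        if (added != subtracted) {
            return 0;
        }
    }
    return 1;
}
```

Both checks pass, which is why the truncating andi.l (or move.w into a cleared register) makes the add form valid, and why leaving it out does not.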

Does the simplified version reproduce the bug, or does it need more context in order to trigger it?

(Or is it a bug, after all...).

I did not try the full app.

homeness · ‎03-10-2009

Just a mystery... I consistently get incorrect code with optimization level 1 or above.

;   15:  if ( uchar == 2 || uchar == 3 ) // incorrect code
;   16:   {
;
0x00000022  0x7200                   moveq    #0,d1
0x00000024  0x123900000000           move.b   _uchar,d1
0x0000002A  0x7001                   moveq    #1,d0
0x0000002C  0x0681000000FE           addi.l   #254,d1               ; '....'
0x00000032  0xB280                   cmp.l    d0,d1
0x00000034  0x632A                   bls.s    *+44                  ; 0x00000060
;
;   19:  if ( ushort == 2 || ushort == 3 ) // incorrect code
;   20:   {
;
0x00000036  0x7200                   moveq    #0,d1
0x00000038  0x323900000000           move.w   _ushort,d1
0x0000003E  0x06810000FFFE           addi.l   #65534,d1             ; '....'
0x00000044  0x7001                   moveq    #1,d0
0x00000046  0xB280                   cmp.l    d0,d1
0x00000048  0x6320                   bls.s    *+34                  ; 0x0000006a

And yes, the simplified version reproduces the bug.

J2MEJediMaster · ‎03-11-2009

Time to get to the bottom of this problem. Please file a service request on the issue by clicking here.

homeness · ‎03-31-2009

Finally we have the following:

1) 7.1.1 generates correct code but crashes when compiling our project (if any optimization level is set globally or with a pragma).

2) 7.1.2 doesn't crash but generates incorrect code...

Everyone who tried to compile my examples had no problem with the generated code because they were using version 7.1.1.

p_vagnoni · ‎10-14-2009

Hello,

I am writing just to add that the latest patch, 7.1.2a (dated 7/30/2009), doesn't solve the problem: with optimization enabled it is still present.

I verified that without optimization the compiler works properly.

CompilerGuru · ‎10-15-2009

I was not able to reproduce the bug with the provided archive :(

Was the bug reported to support as J2MEJediMaster recommended?

Daniel

p_vagnoni · ‎10-15-2009

Hello Daniel,

To reproduce the bug, compile the attached file once with optimization level 1 and once without optimization. In the first case the variable res takes the values 0 and then 3; in the second case res is 0, then 1, then 3 (as you would expect from reading the C file).

In the second for loop you can see that if you split the if condition, it also works with optimization.

Let me know. Meanwhile I will report the bug to support, but I really hoped that such a BIG problem would be handled in another way... Does anyone who reads this forum work for Freescale?

Bye

main.c

Message Edited by t.dowe on 2009-10-15 10:38 AM

CrasyCat · ‎10-16-2009

Hello

This forum is not the appropriate way to report a compiler bug and get it fixed.

If you feel that the compiler is generating incorrect code, you need to submit a service request to get the bug recorded. This is the only way you have a chance of seeing the bug fixed some day.

CrasyCat

CompilerGuru · ‎03-03-2009

>The reasoning remains almost the same even if the evaluation order changes a little;

> the anomaly lies in the difference between 0x8000 and 32768

Is that not the same?

jbezem · ‎03-03-2009

From a mathematical point of view, they are the same, but on a 16-bit integer machine, unadorned constants between 2^15 and 2^16-1 are treated differently if specified as hex/octal (0xFFFF is unsigned int) or if specified in decimal (65535 is signed long).

Did I write 'long'? Yes I did, and correctly so... I may have to revise my four answers then once again. Sorry, MrBean! The day was long, I'll leave that for tomorrow.

Update: No, of course not. All constants in the four cases of MrBean were in hex, so I do not have to revise my answers. OK, shut down the machine and relax a bit.

Bye,

Johan

Message Edited by jbezem on 2009-03-03 05:59 PM

jbezem · ‎03-03-2009

OK, assuming ints are 16-bit:

- in the bit-or expression, the negation of MCF_INTC0_IMRL has the highest precedence; the macro has type uint32, and is loaded into d0.

- then the negation is performed using d0.

- since the negated expression still has the type uint32, the integer 0x8000 is implicitly converted to uint32 (left-to-right associativity of operator bit-or); however, I read-up on conversion of constants: Since the constant is given as 0x8000 (and not 32768, the same value but a different notation!) it is considered of type 'unsigned int', not signed. See H&S 5th edition, 2.7.1: "An interesting point to note ... is that integers in the range of 2^15 through 2^16-1 will have positive values when written as decimal constants but negative values when written as octal or hexadecimal constants (and cast to type 'int')");

- and or-ed into the result (later optimized and taken together with the next constant)

- then 0x0001 is taken (type: signed int), and also converted to uint32 (sign-extension is trivial here), and ored into the result.

And finally the result is negated again, and stored into the variable/register.

QED IMHO.

FWIW,

Johan
