CW code generation

badaboom · ‎02-26-2009

Hi,

i've got a problem with code generation by CW 7.1 build 14.

Some settings:

__fourbyteints__ 0 (defined 0, by clearing ' 4-byte integers' checkbox on the setting panel )
MCF52235

Very simple clearing of a bit:

Enable int. by clearing corresp. bit mask:

MCF_INTC0_IMRL &= ~(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL );

Used definitions:

#define MCF_INTC0_IMRL (*(vuint32*)(&__IPSBAR[0xC0C]))

#define MCF_INTC_IMRL_INT_MASK15 (0x8000)

#define MCF_INTC_IMRL_MASKALL (0x1)

MCF_INTC0_IMRL &= ~ (MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL );

Generated code:

; 398: MCF_INTC0_IMRL &= ~(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL ); 
;
0x000000D4  0x41F900000000           lea      ___IPSBAR+3084,a0
0x000000DA  0x203C00007FFE           move.l   #32766,d0             ; '....'
0x000000E0  0xC190                   and.l    d0,(a0)

This also clears b31...16; and THAT was not intended. It seems that 'it' generated the mask 16 bit wide.

When you clear f.e. b14 then there's no problem.

MCF_INTC0_IMRL &= ~ (MCF_INTC_IMRL_INT_MASK14 | MCF_INTC_IMRL_MASKALL );

;397: MCF_INTC0_IMRL &= ~ (MCF_INTC_IMRL_INT_MASK14 | MCF_INTC_IMRL_MASKALL ); 
;
0x000000C6  0x41F900000000           lea      ___IPSBAR+3084,a0
0x000000CC  0x203CFFFFBFFE           move.l   #-16386,d0            ; '....'
0x000000D2  0xC190                   and.l    d0,(a0)

Also when you cast to a uint32 the code is correct.

MCF_INTC0_IMRL &= ~ (uint32)(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL );

;399:MCF_INTC0_IMRL & = ~(uint32 (MCF_INTC_IMRL_INT_MASK15 |  MCF_INTC_IMRL_MASKALL ); 
;
0x000000E2  0x41F900000000           lea      ___IPSBAR+3084,a0
0x000000E8  0x203CFFFF7FFE           move.l   #-32770,d0            ; '....'
0x000000EE  0xC190                   and.l    d0,(a0)

Also when you try to clear b31, the compiler uses 64 bit sizes??

;400: MCF_INTC0_IMRL &= ~ (MCF_INTC_IMRL_INT_MASK31 | MCF_INTC_IMRL_MASKALL ); 
;
0x000000F0  0x4DF900000000           lea      ___IPSBAR+3084,a6
0x000000F6  0x2F560004               move.l   (a6),4(a7)
0x000000FA  0x41EF0014               lea      20(a7),a0
0x000000FE  0x2E88                   move.l   a0,(a7)
0x00000100  0x4EB900000000           jsr      ___rt_ultoi64
0x00000106  0x72FF                   moveq    #-1,d1
0x00000108  0x203C7FFFFFFE           move.l   #2147483646,d0        ; '....'
0x0000010E  0x2F400010               move.l   d0,16(a7)
0x00000112  0x2F41000C               move.l   d1,12(a7)
0x00000116  0x2010                   move.l   (a0),d0
0x00000118  0x2F6800040008           move.l   4(a0),8(a7)
0x0000011E  0x2F400004               move.l   d0,4(a7)
0x00000122  0x41EF001C               lea      28(a7),a0
0x00000126  0x2E88                   move.l   a0,(a7)
0x00000128  0x4EB900000000           jsr      ___rt_and64
0x0000012E  0x2CA80004               move.l   4(a0),(a6)
;
;  401:  MCF_INTC0_IMRL &= ~ (uint32)(MCF_INTC_IMRL_INT_MASK31 | MCF_INTC_IMRL_MASKALL ); 
;
0x00000132  0x41F900000000           lea      ___IPSBAR+3084,a0
0x00000138  0x203C7FFFFFFE           move.l   #2147483646,d0        ; '....'
0x0000013E  0xC190                   and.l    d0,(a0)
;

But when you: #define __fourbyteints__ 1

then the generated code is ok.

My question is, how can avoid this code generation error without the need of

explicitly casting when there seems no need to do so.

thanks,

René

jbezem · ‎02-27-2009

I didn't check any settings of the compiler and your variant, but if you clear '__fourbyteints__', you'd have 16-bit integers, I assume. That means that your 0x8000 bit definition is taken to be a negative number (signed!).

Bitwise negation of signed numbers is never a good idea, but since we're operating on 16-bit integers, the negated mask becomes 0x7FFE (positive again!), and the generated code seems OK to me, even if it is not what you intended.

With 4-byte integers the situation is different: 0x8000 is effectively 0x00008000, so your inverted bit-mask becomes 0xFFFF7FFE, what you intended.

To remedy this without too many casts, you could use 0x8000U, which in case of 2-byte ints would be converted to long (still signed, but 32-bit!), and everything would work as expected; I didn't verify this, though.

In case of 4-byte ints this mask gets converted to 'unsigned int', so it will work also.

Or you add at least one cast to convert one of the bit-masks to the type of your target, like ((vuint32)0x8000U), to make everything clear.

IMHO this is not a compiler problem.

HTH,

Johan

badaboom · ‎02-27-2009

Thanks for the replies..

I think it's better to cast at the operation then

to check the 'Coldfire module header files' for all '0x8000's

MCF_INTC0_IMRL &= ~ (uint32)(MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL );
MCF_INTC0_IMRL &= ~ (uint32)(MCF_INTC_IMRL_INT_MASK31 | MCF_INTC_IMRL_MASKALL );
becomes:

;398: MCF_INTC0_IMRL &= ~(uint32)(MCF_INTC_IMRL_INT_MASK15|
                                  MCF_INTC_IMRL_MASKALL);
;
0x000000DC  0x41F900000000           lea      ___IPSBAR+3084,a0
0x000000E2  0x203CFFFF7FFE           move.l   #-32770,d0            ; '....'
0x000000E8  0xC190                   and.l    d0,(a0)
;
;399: MCF_INTC0_IMRL &= ~(uint32)(MCF_INTC_IMRL_INT_MASK31 |
                                  MCF_INTC_IMRL_MASKALL);
;
0x000000EA  0x41F900000000           lea      ___IPSBAR+3084,a0
0x000000F0  0x203C7FFFFFFE           move.l   #2147483646,d0        ; '....'
0x000000F6  0xC190                   and.l    d0,(a0)

Just redefining the constant as: #define MCF_INTC_IMRL_INT_MASK15 (0x8000u) doesn't force the compiler to use unsigned. After negation the result is positive and will fit in THE 16bit wide int, and therefore clearing the high word (as positive result).

Only making it 32 bit wide solves this.

#define MCF_INTC_IMRL_INT_MASK15 (0x8000uL) or

#define MCF_INTC_IMRL_INT_MASK15 (uint32)(0x8000)

I think I agree it's not a compiler issue but let's say I wasn't forwarned because the operation output is an vuint32 and there's wasn't a compiler warning "Implicit arithmetic conversion from 'signed xxx' to 'unsigned xxx'.

René

jbezem · ‎02-27-2009

badaboom wrote:
...
I think it's better to cast at the operation ...

Agreed.

...
Just redefining the constant as: #define MCF_INTC_IMRL_INT_MASK15 (0x8000u) doesn't force the compiler to use unsigned.

Yes, it does, but contrary to what I suggested, it doesn't convert to 'signed long', but leaves it to be an 'unsigned int'; my mistake. So it remains 16-bit wide until after the negation, and then gets zero-extended to 32-bits before and-ing with the resulting value. Not what you intended.

...
Only making it 32 bit wide solves this.
#define MCF_INTC_IMRL_INT_MASK15 (0x8000uL) or
#define MCF_INTC_IMRL_INT_MASK15 (uint32)(0x8000)

Correct. I'd use the cast as you initially indicated, since it will provide correct results independent from the size of an 'int'. Using 'uL' may prove to extend to 64-bits when using 4-byte ints, which works, but is an exaggeration and slow. If you don't care about portability, though, you may go ahead.

I think I agree it's not a compiler issue but let's say I wasn't forwarned because the operation output is an vuint32 and there's wasn't a compiler warning "Implicit arithmetic conversion from 'signed xxx' to 'unsigned xxx'.

...

That's why I use PC Lint from Gimpel.

BR,

Johan

CompilerGuru · ‎02-27-2009

Silly me, I did not realize that the compiler actually was correct.

With 16 bit ints, the constant 0x8000 has type unsigned int, hence the expression

(0x8000 | 1) has type unsigned int to and the value 0x8001u.

Therefore the complement (still type unsigned int) has the value 0x7FFEu, and &= 0x7FFEu does indeed clear the more significant 16 bits too.

Not sure why the compiler uses 64 bit for clearing the 31st bit. How are the constants defined?

That could also be a compiler issue, although not wrong, just inefficient.

If the operation should always work on at least 32 bits I would make the constants of type unsigned long by using

a uL suffix

> #define MCF_INTC_IMRL_INT_MASK15 (0x8000uL)

Daniel

MrBean · ‎03-03-2009

CompilerGuru wrote:

With 16 bit ints, the constant 0x8000 has type unsigned int, hence the expression
(0x8000 | 1) has type unsigned int to and the value 0x8001u.
Therefore the complement (still type unsigned int) has the value 0x7FFEu, and &= 0x7FFEu does indeed clear the more significant 16 bits too.

If that were true, then 0x4000|1 would result in the complement 0xCFFEu.

Extending 0xCFFEu to 32 bits would result in 0x0000CFFEu.

It does not.

The problem starts with the fact that then 16bit int constant is apparently signed.

When this is expanded to 32bits because &= other argument is 32bit, the compiler does not take in account that this other argument and the destination is NOT signed (it is vuint32).

This is what happens:

0x4000|1 = 0x4001, ~0x4001 = 0xCFFE, ->32bit = 0xFFFFCFFE, &=

0x8000|1 = 0x8001, ~0x8001 = 0x7FFE, ->32bit = 0x00007FFE, &=

The conversion to 32 bits is not done at the same point as the conversion to unsigned.

The compiler seems to think it is signed.

This does not make sense to me as the result is stored in a vuint32.

Message Edited by MrBean on 2009-03-03 13:34 01:34 PM

CompilerGuru · ‎03-03-2009

> If that were true, then 0x4000|1 would result in the complement 0xCFFEu.

> Extending 0xCFFEu to 32 bits would result in 0x0000CFFEu.

> It does not.

Well my text was using 0x8000 not 0x4000. Different number, different rule :smileyhappy:.

The constant 0x8000 has type unsigned int because it does not fit into a (16 bit signed) int type.

The constant 0x4000 fits, and therefore has type (signed) int. Therefore the extension to long in

(long)(~(0x4000+1))

is a sign extension, the long gets negative.

The following expressions though will result in a positive long <= 0xFFFF, the extension to long was done from a unsigned type (assuming 16 bit ints):

(long)(~(0x4000u+1))

(long)((unsigned int)~(0x4000+1))

(long)(~(0x8000 - 0x4000 + 1))

So if you intend to use 32 bit arithmetic in ANSI-C I would recommend to use 32 bit constants, say with a 16 bit int type to write it in one of those forms:

0x4000L

0x4000uL

(long)0x4000

(unsigned long)0x4000

Daniel

MrBean · ‎03-03-2009

CompilerGuru wrote:
The constant 0x8000 has type unsigned int because it does not fit into a (16 bit signed) int type.
The constant 0x4000 fits, and therefore has type (signed) int. Therefore the extension to long in
(long)(~(0x4000+1))
is a sign extension
....

*That* makes sense !

That also means all header files provided by Freescale are wrong/dangerous, once the '4-byte integers' checkbox on the setting panel is switched off.

So: Leave the checkbox on, or change all header files so that constants have a type that fits the register they are meant for.

The first is easiest, the latter would be more correct.

Thanx for all the input everyone !

Message Edited by MrBean on 2009-03-03 18:48 06:48 PM

jbezem · ‎03-03-2009

MrBean wrote:
...
This is what happens:
0x4000|1 = 0x4001, ~0x4001 = 0xCFFE, ->32bit = 0xFFFFCFFE, &=
0x8000|1 = 0x8001, ~0x8001 = 0x7FFE, ->32bit = 0x00007FFE, &=
Sorry to disagree:
0x4000|1 = 0x4001, ~0x4001 = 0xBFFE, ->32bit = 0xFFFFBFFE, &=
0x8000|1 = 0x8001, ~0x8001 = 0x7FFE, ->32bit = 0x00007FFE, &=

(Note the 'B' instead of 'C'). Other than that, for 16-bit ints, I agree.
And AFAIK, unadorned constants are (usually?/always?/must be?) considered 'int' and signed, even according to the Standard.

FWIW,

Johan

MrBean · ‎03-03-2009

sorry for the calculation typo, BFFE is right.

But still, i think it should either warn or not do it in this fashion.

MrBean · ‎03-03-2009

To show how inconsistent the behaviour is to me, try explaining this one :

;  401:  MCF_INTC0_IMRL = ~( ~MCF_INTC0_IMRL |  ( MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL ) ;
0x0000017E  0x203900000000           move.l   ___IPSBAR+3084,d00x00000184  0x4680                   not.l    d00x00000186  0x008000008001           ori.l    #0x8001,d0            ; '....'0x0000018C  0x4680                   not.l    d00x0000018E  0x23C000000000           move.l   d0,___IPSBAR+3084

Message Edited by MrBean on 2009-03-03 15:05 03:05 PM

jbezem · ‎03-03-2009

BTW, isn't there a parenthesis missing in the commented source-line? I see two opening and only one closing paren. Cut&Paste error?

The reasoning remains almost the same even if the evaluation order changes a little; the anomaly lies in the difference between 0x8000 and 32768.

BR,

Johan

MrBean · ‎03-03-2009

Stop seeing all my typo's

jbezem wrote:
The reasoning remains almost the same even if the evaluation order changes a little; the anomaly lies in the difference between 0x8000 and 32768.

The 0x8000 and 32768 difference is a result, not a cause, it is compiler ouput.

The definitions are not changed.

The output indeed shows that evaluation order is of influence.

The logic of it is still unclear to me.

For completeness:

;  394: MCF_INTC0_IMRL = ~( ~MCF_INTC0_IMRL |   ( MCF_INTC_IMRL_INT_MASK14 | MCF_INTC_IMRL_MASKALL )) ;
;
0x00000150  0x203900000000           move.l   ___IPSBAR+3084,d0
0x00000156  0x4680                   not.l    d0
0x00000158  0x008000004001           ori.l    #0x4001,d0            ; '..@.'
0x0000015E  0x4680                   not.l    d0
0x00000160  0x23C000000000           move.l   d0,___IPSBAR+3084
;
;  395: MCF_INTC0_IMRL = ~( ~MCF_INTC0_IMRL |   ( MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL )) ;
;  396: 
;
0x00000166  0x203900000000           move.l   ___IPSBAR+3084,d0
0x0000016C  0x4680                   not.l    d0
0x0000016E  0x008000008001           ori.l    #0x8001,d0            ; '....'
0x00000174  0x4680                   not.l    d0
0x00000176  0x23C000000000           move.l   d0,___IPSBAR+3084
;
;  397: MCF_INTC0_IMRL =  (  MCF_INTC0_IMRL &  ~( MCF_INTC_IMRL_INT_MASK14 | MCF_INTC_IMRL_MASKALL )) ;
;
0x0000017C  0x203CFFFFBFFE           move.l   #-16386,d0            ; '....'
0x00000182  0xC1B900000000           and.l    d0,___IPSBAR+3084
;
;  398: MCF_INTC0_IMRL =  (  MCF_INTC0_IMRL &  ~( MCF_INTC_IMRL_INT_MASK15 | MCF_INTC_IMRL_MASKALL )) ;
;0x00000188  0x203C00007FFE           move.l   #32766,d0             ; '....'0x0000018E  0xC1B900000000           and.l    d0,___IPSBAR+3084;

Message Edited by MrBean on 2009-03-03 17:12 05:12 PM

jbezem · ‎03-03-2009

From what I've seen, and IMHO, the compiler is correct in all cases, and I'll explain why:

- the first is trivial: MASK14 means a signed int 0x4000 is used, or-ed with a signed int 0x0001 to produce a positive signed int, and then sign-extended and or-ed with the negated uint32: 0x0004001 is correct.

- the second is a repetition of your previous post; my argument changes a little because of the parentheses: 0x8000 is unsigned int, combined with 0x0001 as signed int, gives an unsigned int 0x8001; zero-extended to be combined with a negated uint32: 0x00008001 is correct.

- the third combines two signed int values into a signed int 0x4001, negates all bits to get a signed int 0xBFFE (negative!), and gets sign-extended to uint32: 0xFFFFBFFE is correct.

- the fourth case is different again: 0x8000 is unsigned int, combined with signed int 0x0001 to give unsigned int 0x8001, all bis negated to unsigned int 0x7FFE, and zero-extended to be combined with uint32: 0x00007FFE is correct.

Any objections?

FWIW,

Johan

MrBean · ‎03-03-2009

Hmm.. maybe it has to do with a 2's complement being not exactly the same as a bitwise not ...

MrBean · ‎03-04-2009

I'll highlight what i said before:

MrBean wrote:
Hmm.. maybe it has to do with a 2's complement being not exactly the same as a bitwise not ...

Think about the difference keeping the type casting of the compiler, as explained by CompilerGuru, in mind :

-(x) - (1)

(-1) ^ (x)

It does also explain the use of a 64bit value with the 31th bit ....

jbezem · ‎03-03-2009

I didn't check the 'for completeness' yet, but the difference between 0x8000 and 32768 is in the language definition, and therefore in the compiler. The quote I gave is from " C A Reference Manual" by Harbison and Steele, the best reference on C I have ever seen.

To make it even more clear: The compiler _has_ to interpret 0x8000 and 32768 differently.

I'll have a look at the rest later.

BR,

Johan

CompilerGuru · ‎03-03-2009

AFAIK: There is no difference in between 0x8000 and 32768, both behave the same (regardless the size of an int and other compiler differerences).

BTW: Note that in ANSI C, the type of -32768 is unsigned int (given 16 bit ints), as the type of 32768 is unsigned int and the minus just negates the value afterwards. So to write it as int, a more complex expression like (-32767-1) or ((int)-32768) has to be used.

See also, Page 54, chapter Integer constants,

http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf

Daniel

jbezem · ‎03-04-2009

Quoting the standard: "The type of an integer constant is the first of the corresponding list in which its value can be represented." (page 55 of the PDF document you indicated). Then follows a table with 3 columns where the difference appears: For all constants not adorned with 'U' the entries differ between 'Decimal Constant' and 'Octal or Hexadecimal Constant'.
If we now take our case: 32768 is a decimal constant. Since it doesn't fit into a 'signed int' anymore, the next in the list ist 'long int'.
However, for the same value in hex, 0x8000, the list is different: Since it doesn't fit into a 'signed int', the next indicated type is 'unsigned int'. Only when this range has been exceeded (i.e. the constant is larger than 0xFFFF), 'signed long' is to be used.

I used H&S, but the Standard seems to agree IMHO.

The type of 32768 is long (given 16-bit ints), see above, so '-32768' also is long. The unary minus is an operator, not part of the constant. You are correct IMHO as to the solutions you offer: (-32767-1) or (always better) use an explicit cast.

As to the differences between 32768 and 0x8000 again: ((long)~32768) should be 0xFFFF7FFF, whereas ((long)~0x8000) should give 0x00007FFF.

Try this:

#define C_DECIMAL     32768
#define C_HEXADECIMAL 0x8000

int main(int argc, char *argv[])
{
  volatile long long_dec = ((long)~C_DECIMAL);
  volatile long long_hex = ((long)~C_HEXADECIMAL);

  return;
}

Using '-intsize 2', I get:

0x00000000                    _main:
;                             main:
0x00000000  0x4E560000               link     a6,#0
0x00000004  0x518F                   subq.l   #8,a7
0x00000006  0x223CFFFF7FFF           move.l   #-32769,d1
0x0000000C  0x2D41FFF8               move.l   d1,-8(a6)
0x00000010  0x223C00007FFF           move.l   #32767,d1
0x00000016  0x2D41FFFC               move.l   d1,-4(a6)
0x0000001A  0x4E5E                   unlk     a6
0x0000001C  0x4E75                   rts

Any objections? Comments?

BR,

Johan

CompilerGuru · ‎03-05-2009

Yes, I remebered wrong. The type of 0x8000 (with 16 bit int) is a unsigned int while for 32768 it is long.

Daniel

homeness · ‎03-10-2009

Hi All.

I have the similar problem with unextended chars/shorts...

CW7.1.1

;

; 1: unsigned char state;

; 2: volatile res;

; 3: int main(void)

; 4: {

;

0x00000000 _main:

; main:

0x00000000 0x4E560000 link a6,#0

;

; 5: state = 2;

;

0x00000004 0x7002 moveq #2,d0

0x00000006 0x13C000000000 move.b d0,_state

;

; 6: if ( state == 2 || state == 3 )

; 7: {

;

0x0000000C 0x7200 moveq #0,d1

0x0000000E 0x123900000000 move.b _state,d1

0x00000014 0x0681000000FE addi.l #254,d1 ; '....'

0x0000001A 0x7001 moveq #1,d0

0x0000001C 0xB280 cmp.l d0,d1

0x0000001E 0x6208 bhi.s *+10

Best regards,

Peter

CW code generation

CW code generation

General