The following source code exposes a bug in the inline assembler of DSC's compiler:
typedef unsigned long UINT32;
inline void u64_SHR3(register UINT64 *RReg )
{
register UINT32 AA;
register UINT32 BB;
__asm{
move.l X:(RReg)+,AA
move.l X:(RReg),BB
move.w #0,BB.2
asr BB
ror.l AA
asr BB
ror.l AA
asr BB
ror.l AA
move.l BB,X:(RReg)-
move.l AA,X:(RReg)
}
}
#define tick2mSec(tmp) {u64_SHR3(tmp);} // myRTC/8 --> time in mSec
It is an "optimized" triple right shift of a 64-bit unsigned value. The source clears only the accumulator extension register (move.w #0,BB.2), but the generated code clears the whole 36-bit accumulator:
; 100: tick2mSec(&tp); // 8Khz ticks to milliSeconds
;
0x00000036 0x8AB4FFFB adda #-5,SP,R0
0x00000038 0x8121 tfra R0,R1
0x00000039 0xF021 move.l X:(R1)+,A
0x0000003A 0xF135 move.l X:(R1),B
0x0000003B 0xE180 move.w #0,B
0x0000003C 0x70EB asr B
0x0000003D 0x7647 ror.l A
0x0000003E 0x70EB asr B
0x0000003F 0x7647 ror.l A
0x00000040 0x70EB asr B
0x00000041 0x7647 ror.l A
0x00000042 0xD131 move.l B10,X:(R1)-
0x00000043 0xD035 move.l A10,X:(R1)
Notice what happens at offset 0x0000003B:
instead of emitting an error because the instruction move.w #0,B2 (clear the upper 4 bits of the 36-bit accumulator B) DOES NOT EXIST, or replacing it with the functionally equivalent clr.w B2,
the inline assembler emits move.w #0,B, which clears the whole 36-bit accumulator B.
The obvious workaround is to use clr.w, but NXP personnel working on the DSC compiler should take a look at the root cause of this bug; it may be a symptom of worse bugs still lurking in the inline assembler.
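To make the effect of the wrong opcode concrete, here is a rough host-side C model of the routine. It is only a sketch, not NXP code: the acc36 type and the helper names are hypothetical. It assumes that move.l sign-extends into the 4-bit extension, that asr is a 36-bit arithmetic shift leaving the shifted-out bit in carry, and that ror.l rotates a 32-bit value right through carry, as the routine above implies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 36-bit accumulator: 4-bit extension + 32-bit body. */
typedef struct {
    uint8_t  ext;   /* B2, only the low 4 bits are used */
    uint32_t lo32;  /* B1:B0 */
} acc36;

/* Assumed asr semantics: 36-bit arithmetic shift right by one,
   returning the bit shifted out (the carry). */
static unsigned asr36(acc36 *a)
{
    unsigned carry = a->lo32 & 1u;
    a->lo32 = (a->lo32 >> 1) | ((uint32_t)(a->ext & 1u) << 31);
    a->ext  = (uint8_t)(((a->ext >> 1) | (a->ext & 0x8u)) & 0xFu);
    return carry;
}

/* Assumed ror.l semantics: rotate right through carry. */
static unsigned rorl32(uint32_t *v, unsigned carry_in)
{
    unsigned carry_out = *v & 1u;
    *v = (*v >> 1) | ((uint32_t)carry_in << 31);
    return carry_out;
}

/* Model of u64_SHR3. clear_whole_acc = 0 models the intended code
   (only the extension is cleared); 1 models the emitted move.w #0,B. */
uint64_t shr3_model(uint64_t v, int clear_whole_acc)
{
    uint32_t a = (uint32_t)v;                 /* AA: low 32 bits  */
    acc36 b;                                  /* BB: high 32 bits */
    int i;

    b.lo32 = (uint32_t)(v >> 32);
    b.ext  = (b.lo32 & 0x80000000u) ? 0xFu : 0x0u;  /* sign extension on load */

    b.ext = 0;                                /* intended: clear extension only */
    if (clear_whole_acc)
        b.lo32 = 0;                           /* bug: high word is lost too */

    for (i = 0; i < 3; i++) {
        unsigned carry = asr36(&b);           /* asr BB   */
        (void)rorl32(&a, carry);              /* ror.l AA */
    }
    return ((uint64_t)b.lo32 << 32) | a;
}

/* Small self-check comparing the model against a plain 64-bit shift. */
void shr3_self_check(void)
{
    uint64_t v = UINT64_C(0x8000000000000008);
    assert(shr3_model(v, 0) == (v >> 3));
    assert(shr3_model(v, 1) == ((v & 0xFFFFFFFFu) >> 3));
}
```

Under these assumptions the intended sequence matches a plain v >> 3, while the emitted sequence returns only the shifted low word, so the bug stays hidden as long as the tick counter fits in 32 bits.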