Joel Fieber

CodeWarrior 5.6.1 MCF5271 - Compiler Question

Discussion created by Joel Fieber on Mar 27, 2007
Latest reply on Mar 28, 2007 by Joel Fieber
Hi all,

I'm running CodeWarrior with a ColdFire 5271.

I've been reviewing and optimizing our application, which runs on an MCF5270. I'm finding that when we dereference a pointer to a structure, or assign one structure to another, the compiler generates code that is larger and slower than we would like.

Compiling this:
*(InternalPVPoint*)ptr1 = *(InternalPVPoint*)ptr2;

generates this:
    movea.l 4(a4),a1
    movea.l (a4),a0
    move.b (a0),d1
    move.b d1,(a1)
    move.b 1(a0),d1
    move.b d1,1(a1)
    move.b 2(a0),d1
    move.b d1,2(a1)
    move.b 3(a0),d1
    move.b d1,3(a1)

The compiler obviously knows the structure is 4 bytes, and the processor is a 32-bit ColdFire.

So: why did CodeWarrior not generate a single longword move? How can I influence the compiler output to get one?

This example is perhaps overly complicated, as the type InternalPVPoint is a union containing bitfields:

typedef union tagInternalPVPoint
{
    struct
    {
        uint_32 TYPE     : 10;
        uint_32 INSTANCE : 22;
    };
    uint_32 RAW;
} InternalPVPoint;

However, using even simple structures:

typedef struct tagMyStruct
{
    uint_32 fourbytes;
} MyStruct;

typedef struct tagOtherStruct
{
    uint_32 otherfourbytes;
} OtherStruct;

void foo()
{
    MyStruct*     first;
    OtherStruct*  second;

    *first = *((MyStruct*)second);
}

produces this:
     move.w   (a0),d1
     move.w   d1,(a1)
     move.w   2(a0),d1
     move.w   d1,2(a1)

Why just do word moves?

But maybe this example is too artificial. Anyway, can anyone explain the compiler's behaviour here, and how to write optimal code in these situations?
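
For what it's worth, one workaround that may be worth trying is to copy through the 32-bit RAW member instead of assigning the whole union, so that the source-level operation is a single uint_32 assignment. The sketch below is untested on CodeWarrior. It assumes uint_32 is a 32-bit unsigned type (stood in for here by uint32_t from <stdint.h>), that the bitfields sit in an anonymous struct inside the union, and that both pointers are suitably aligned. copy_point_raw is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t uint_32;   /* assumption: stand-in for the project's uint_32 */

typedef union tagInternalPVPoint
{
    struct                  /* assumption: anonymous struct holding the bitfields */
    {
        uint_32 TYPE     : 10;
        uint_32 INSTANCE : 22;
    };
    uint_32 RAW;
} InternalPVPoint;

/* Hypothetical helper: copy via the RAW member so the assignment is
   expressed as one uint_32 store rather than an aggregate copy. */
static void copy_point_raw(void *dst, const void *src)
{
    /* The trick only makes sense if the union is exactly one uint_32 wide. */
    assert(sizeof(InternalPVPoint) == sizeof(uint_32));
    ((InternalPVPoint *)dst)->RAW = ((const InternalPVPoint *)src)->RAW;
}
```

Whether this actually produces a single move.l would still have to be confirmed in the disassembly.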

Where is the trade-off versus calling an optimized (inline?) memcpy function?
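
For comparison, the memcpy variant I have in mind looks roughly like the sketch below. It uses a compile-time-constant size, which gives a compiler the chance to expand the copy inline; whether CodeWarrior does so, or emits a real call plus a byte loop, is something I have not verified. uint_32 is again stood in for by uint32_t, and copy_with_memcpy is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t uint_32;   /* assumption: stand-in for the project's uint_32 */

typedef struct tagMyStruct
{
    uint_32 fourbytes;
} MyStruct;

typedef struct tagOtherStruct
{
    uint_32 otherfourbytes;
} OtherStruct;

/* Hypothetical helper: fixed-size memcpy between the two layout-compatible
   structs. The size is a constant known at compile time. */
static void copy_with_memcpy(MyStruct *dst, const OtherStruct *src)
{
    /* Guard against the two types drifting apart in size. */
    assert(sizeof(*dst) == sizeof(*src));
    memcpy(dst, src, sizeof(*dst));
}
```

If the call is not inlined, the call overhead alone probably outweighs the four-byte copy, so this likely only pays off for larger structures.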

Alban Edit: CW version number and core in subject line.

Message Edited by Alban on 2007-03-27 11:36 PM