accessing individual bytes of int variable

stevec · ‎08-10-2007

Is there an easy way to access the individual bytes of an integer (16 bit) variable? Something like #HIGH int_variable and #LOW int_variable. Or do I have to resort to masking/shifting operations?

Steve

Shortbus · ‎10-07-2007

If you use bitwise operators to get the low and high bytes of a word, this should work the same on all microcontrollers...

x&0xff is the least significant byte of x,

and (x>>8)&0xff is the next lowest byte,

and (x>>16)&0xff is the next byte after that (only meaningful for types wider than 16 bits), etc.

HOWEVER, there is an important gotcha that must be dealt with especially when transferring data between Intel and Freescale processors... ENDIANNESS.

In a Freescale microcontroller, a 16-bit word is stored in memory with the most significant byte first.

For example, the decimal number 12345 (0x3039 hex) would be stored in memory as

0x30 0x39 . This is known as "Big Endian."

In many Intel based environments, a 16 bit word's bytes are stored in memory BACKWARDS with the least significant byte first! This is known as "Little Endian"

For example, the decimal number 12345 (0x3039 hex) would be stored in memory as

0x39 0x30 .

Please consider this code:

void foo(void)
{
    unsigned short x=12345;
    unsigned char MH,ML,H,L;

    H=(x>>8)&0xFF; // get most significant byte of x
    L=x&0xFF; // get least significant byte of x
    MH=*(unsigned char *)(&x); // get first byte of x in memory
    ML=*((unsigned char *)(&x)+1); // get second byte of x in memory (cast before adding 1, so the step is one byte)
}

On a Freescale processor, at the end of this function, H will equal MH, and L will equal ML.

On an Intel processor, H will equal ML and L will equal MH!

On projects that involve data transfer between a Freescale device and a Windows-based controller, you will need to flip a word's bytes around to convert between big endian and little endian formats!
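As a minimal sketch of the byte flip described above: the function name swap16 is hypothetical (it is not from the original post), and the fixed-width types assume a C99 compiler that provides <stdint.h>.

```c
#include <stdint.h>

/* Hypothetical helper: exchange the two bytes of a 16-bit value,
   converting between big endian and little endian representations. */
static uint16_t swap16(uint16_t v)
{
    /* Move the low byte up and the high byte down, then combine. */
    return (uint16_t)(((v & 0x00FFu) << 8) | ((v >> 8) & 0x00FFu));
}
```

Applied to 0x3039 this yields 0x3930, and applying it twice returns the original value.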

I hope this helps.

Regards,

Craig

Stephen · ‎08-29-2007

If it's any help, I have always declared a union for this type of thing. E.g.

typedef union uREG16    /*16 bit register with word and byte access*/
{
tU16 word;           /*access whole word    */
struct                /*access byte at a time*/
    {
    tU08 msb;
    tU08 lsb;
    }byte;
}tREG16;

To use it, declare a variable e.g.

tREG16 intData;

To access this as a 16-bit word, use:

intData.word = 0x4055;

To access the individual bytes:

intData.byte.msb = 0x40;

intData.byte.lsb = 0x55;

I use CodeWarrior version 4.5. If you select the "Code completion" options in "IDE Preferences", you can get the IDE to offer you the member names as soon as you hit the period key ".".

Hope this helps

Stephen · ‎08-29-2007

PS I should have explained - in the above example, tU08 is an 8-bit variable (equivalent to unsigned char) and tU16 is a 16-bit variable (equivalent to int).
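One thing to keep in mind with the union approach is that which member ends up as the "msb" depends on the byte order of the target. The following is only a sketch for checking this at run time; reg16_t and first_byte_is_msb are hypothetical names, and <stdint.h> types are assumed to be available.

```c
#include <stdint.h>

/* Hypothetical union: the same 16 bits viewed as a word or as two bytes. */
typedef union {
    uint16_t word;
    uint8_t  bytes[2];   /* bytes[0] is the first byte in memory */
} reg16_t;

/* Returns 1 if the first byte in memory is the most significant byte
   (big endian), 0 if it is the least significant byte (little endian). */
static int first_byte_is_msb(void)
{
    reg16_t r;
    r.word = 0x4055u;
    return r.bytes[0] == 0x40u;
}
```

On a big endian target the msb/lsb member order shown earlier matches memory; on a little endian target the two members would need to be swapped.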

Technoman1964 · ‎08-12-2007

I use C macros to access the high and low bytes and words, as follows. Because the macros work on the value with shifts and casts rather than on its address, the compiler handles all byte/memory ordering automatically.

C example:

WORD TestWord = 0xab;

BYTE HighByte, LowByte;

HighByte = HIBYTE(TestWord);

LowByte = LOBYTE(TestWord);

Code:

/* Type defines */
typedef long  DWORD;
typedef int  WORD;
typedef char BYTE;
typedef unsigned long UDWORD;
typedef unsigned int UWORD;
typedef unsigned char UBYTE;
typedef DWORD INT32;
typedef WORD INT16;
typedef BYTE    INT8;
typedef UDWORD UINT32;
typedef UWORD UINT16;
typedef UBYTE UINT8;
typedef BYTE BOOLEAN;
typedef WORD STACK;

Code:

/* Returns the low byte of the word  */
#define LOBYTE(w)  ((BYTE)(w))

/* Returns the high byte of the word  */
#define HIBYTE(w)  ((BYTE)(((WORD)(w) >> 8) & 0xFF))

/* Makes a word from \a low byte and \a high byte. */
#define MAKEWORD(low, high) ((WORD)(((BYTE)(low)) | (((WORD)((BYTE)(high))) << 8)))

/* Returns the low word of the double word */
#define LOWORD(l)  ((UINT16)(UINT32)(l))

/* Returns the high word of the double word */
#define HIWORD(l)  ((UINT16)((((UINT32)(l)) >> 16) & 0xFFFF))

/* Returns the low signed word of the double word */
#define LOSWORD(l)  ((WORD)(DWORD)(l))

/* Returns the high signed word of the double word */
#define HISWORD(l)  ((WORD)((((DWORD)(l)) >> 16) & 0xFFFF))

/* Makes a double word from low word and a high word. */
#define MAKELONG(low, high) ((UINT32)(((UINT16)(low)) | (((UINT32)((UINT16)(high))) << 16)))
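Note that BYTE above is plain char, whose signedness is implementation-defined, so byte values of 0x80 and above may come out negative on some compilers. A possible variant using unsigned fixed-width types is sketched here; the names LO_U8, HI_U8 and MAKE_U16 are hypothetical, and <stdint.h> is assumed to be available.

```c
#include <stdint.h>

/* Hypothetical unsigned variants of the byte macros. */

/* Low byte of a 16-bit value. */
#define LO_U8(w)  ((uint8_t)((uint16_t)(w) & 0xFFu))

/* High byte of a 16-bit value. */
#define HI_U8(w)  ((uint8_t)(((uint16_t)(w) >> 8) & 0xFFu))

/* Build a 16-bit value from a low byte and a high byte. */
#define MAKE_U16(lo, hi) \
    ((uint16_t)((uint16_t)(uint8_t)(lo) | ((uint16_t)(uint8_t)(hi) << 8)))
```

With these, HI_U8(0xABCD) is 0xAB, LO_U8(0xABCD) is 0xCD, and MAKE_U16(0xCD, 0xAB) gives back 0xABCD.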

CompilerGuru · ‎08-10-2007

Which language?
From C, the masking and shifting looks like the "easy way" to me.
Of course you can also cast the address to an unsigned char pointer and then operate on that, but I think the masking/shifting approach is cleaner.
Which is more appropriate probably also depends on why you want to access the individual bytes in the first place. When accessing the bytes via their address, you are responsible for the endianness adaptation; when using shifting and masking, the compiler does that for you.
As a third approach you can also use a union, but that does not help much compared to the access via "unsigned char*".

Daniel

stevec · ‎08-10-2007

Thanks for that.
My thinking is that if the compiler knows where the variable is in memory, it can access each byte of the int individually. I have used something similar but can't remember if it was with a C compiler (Keil for 8051) or an assembler. I have a data string in which I need to substitute an integer value for two of the bytes.
e.g. buff[i] = upper byte of int
buff[i+1] = lower byte of int

Or can I just say
buff[i] = int and it will do the same

Shifting 8 times seems a little inefficient.

CompilerGuru · ‎08-10-2007

Writing a shift by 8 does not mean that the compiler will emit 8 single-bit shifts. Well, it could, but it does not have to if it knows a better pattern.
What it actually does, you have to check with your compiler.

For

Code:

int i;
unsigned char c0[2];
unsigned char c1[2];
union {
  char c[2];
  int i;
} c2;
void f0(void) {
  c0[0]=(unsigned char)(i >> 8);
  c0[1]=(unsigned char)(i);
}
void f1(void) {
  *(int*)c1=i;
}
void f2(void) {
  c2.i= i; // read as c2.c;
}

I get with HC08 V6.0:

Code:

    8:  void f0(void) {
    9:    c0[0]=(unsigned char)(i >> 8);
  0000 c60000   [4]             LDA   i
  0003 c70000   [4]             STA   c0
   10:    c0[1]=(unsigned char)(i);
  0006 c60001   [4]             LDA   i:1
  0009 c70001   [4]             STA   c0:1
   11:  }
  000c 81       [4]             RTS
   12:  void f1(void) {
   13:    *(int*)c1=i;
  0000 c60001   [4]             LDA   i:1
  0003 c70001   [4]             STA   c1:1
  0006 c60000   [4]             LDA   i
  0009 c70000   [4]             STA   c1
   14:  }
  000c 81       [4]             RTS
   15:  void f2(void) {
   16:    c2.i= i; // read as c2.c;
  0000 c60001   [4]             LDA   i:1
  0003 c70001   [4]             STA   c2:1
  0006 c60000   [4]             LDA   i
  0009 c70000   [4]             STA   c2
   17:  }
  000c 81       [4]             RTS

So the different ways of writing it do not differ much in the generated code (for these cases, not at all).
The results could vary, however, depending on other details, such as whether the variables are locals or accessed through pointers.

Anyway, the syntax I would choose depends on what I am trying to do, not on whether the particular compiler wastes a few bytes. With the shift, the C code explicitly defines the endianness of the encoded characters, so I would probably prefer that in many cases. When copying an int or using the union, the encoded endianness is defined by the architecture, and that code also silently assumes that an int is two chars wide.
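For the buffer case asked about earlier in the thread, a shift-based version might look like the sketch below. The function names put_u16_be and get_u16_be are hypothetical, and <stdint.h> types are assumed to be available. Note that a plain assignment such as buff[i] = value would only store the low byte, since the value is truncated to the size of one buffer element.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store a 16-bit value into buf[i] and buf[i+1],
   upper byte first, regardless of the byte order of the CPU. */
static void put_u16_be(unsigned char *buf, size_t i, uint16_t value)
{
    buf[i]     = (unsigned char)(value >> 8);     /* upper byte */
    buf[i + 1] = (unsigned char)(value & 0xFFu);  /* lower byte */
}

/* Hypothetical helper: read the value back from the same two bytes. */
static uint16_t get_u16_be(const unsigned char *buf, size_t i)
{
    return (uint16_t)(((uint16_t)buf[i] << 8) | buf[i + 1]);
}
```

Because the byte order is spelled out in the code, the buffer contents are the same on big endian and little endian targets.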

Daniel
