Mark Butcher - Your Opinion on Variable Usage


888 Views
JHinkle
Senior Contributor I

I'm looking at ways to minimize my RAM usage and increase performance.

Based on your past experience, please comment on an opinion I have.  I'm looking for confirmation or a "Your Head is up your Ass on that one" reply.

I'm using an RTOS, so each task associated with the RTOS has a while loop - so you never exit from the function (standard RTOS implementation).  Local stack variables are always consumed and never reused.

I'm using Kinetis ARM micros, so they are all 32-bit CPUs.

Local variables used within the RTOS task function always occupy the space of a 32-bit word (I'll call it a DWORD), whether it's a BYTE (8 bits) or a WORD (16 bits).

So a local BYTE consumes 4 bytes of stack.  The compiler also has to add code to make sure the result of each operation stays within the BYTE range.

So local BYTEs and WORDs consume more space than required and add a small performance hit, due to the extra processing needed to keep operations within range.

My opinion -- NOW ....

To save memory (actually stack space, which could then be reduced) - move local BYTE and WORD variables out into global space.

When a BYTE is declared global, you only consume 1 byte of memory instead of 4 on the stack, and the additional range operations are not required.

I looked at a task where I had 6 byte variables defined - that's 24 bytes of stack consumed.  By making them global -- I only consume 6.

Your thoughts based on your experience.

Thanks.

Joe

0 Kudos
4 Replies

640 Views
JHinkle
Senior Contributor I

Thanks Earl.

Insights I did not know about.

Joe

0 Kudos

640 Views
JHinkle
Senior Contributor I

Thanks Mark.

Local variables associated with a task function are for the most part global in the sense that their location never changes and their scope is forever.

I considered struct packing (I use it a lot), but it did not make sense for the RTOS variables I am using as the example, since the added code for locating each struct member (base + offset) adds a performance hit.

Thanks again for your comments.

Joe

0 Kudos

640 Views
egoodii
Senior Contributor III

A couple other comments:

ARM is a 32-bit processor -- it does NOT do 8-bit math.  So while it can directly read and write bytes, to do any math the value has to be extended with the extend-byte instructions (UXTB/SXTB) -- truncation on write-back is 'automatic' in the byte-write.  So, for instance, NEVER would I use 8 bits for a loop counter -- might as well be 32 (most likely staying in a register!) to avoid unnecessary extensions.
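To make the loop-counter point concrete, here is a minimal sketch (the function names are made up for illustration). Both functions return the same result; the difference is only in what the compiler may have to generate. With the 8-bit counter it may need an extra extend instruction after each increment unless it can prove the range, so the actual cost depends on compiler and optimization level. Comparing the disassembly of the two is the only way to be sure.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit loop counter: the counter must behave as if it wraps at 255,
   so the compiler may insert an extend-byte instruction per iteration. */
uint32_t sum_with_u8_counter(const uint8_t *buf, uint8_t len)
{
    uint32_t sum = 0;
    for (uint8_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}

/* 32-bit loop counter: matches the native register width, so no
   range-fixing instructions are needed. */
uint32_t sum_with_u32_counter(const uint8_t *buf, uint32_t len)
{
    uint32_t sum = 0;
    for (uint32_t i = 0; i < len; i++) {
        sum += buf[i];
    }
    return sum;
}

/* Hypothetical self-check: the two versions agree on the result. */
void check_counters_agree(void)
{
    static const uint8_t data[] = { 1, 2, 3, 4, 250 };
    assert(sum_with_u8_counter(data, 5) == sum_with_u32_counter(data, 5));
}
```

Note that the data itself stays 8-bit in both cases; only the counter type changes.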

The other is that the ARM instruction set can do 'very nice' indexed access (i.e., stack-relative) 'within the range that fits' in those fields of overall 32-bit instructions.  But what ARM CANNOT do is load items directly from a full 32-bit memory address, i.e. a globally-defined, address-fixed-by-linker var.  Any such access takes TWO instructions: one to load a 'constant' from PC-relative memory (using the basic indexing instruction) that IS the global address (as fixed up in the linking process!), allocated by your compiler at the end of each program module, then the actual 'access' instruction.  So while 'global' variables have their place, keep this 'cost' in mind!
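As a rough illustration of that cost, the sketch below (all names are hypothetical) contrasts three separate globals with the same data grouped into one struct. With separate globals the compiler may have to fetch a separate address constant for each one; with the struct, a single base address can serve all members through base + offset addressing. Whether this changes the generated code depends on the compiler and its options, since some toolchains already share a base address between nearby globals, so treat it as something to check in the disassembly rather than a guaranteed saving.

```c
#include <assert.h>
#include <stdint.h>

/* Separate globals: each access may need its own address constant. */
uint8_t  g_state;
uint16_t g_count;
uint32_t g_total;

void update_separate(uint8_t sample)
{
    g_state = 1;
    g_count++;
    g_total += sample;
}

/* Grouped globals: one base address, members reached by small offsets.
   Members are ordered largest first so no padding is needed between them. */
struct task_globals {
    uint32_t total;
    uint16_t count;
    uint8_t  state;
};

struct task_globals g_task;

void update_grouped(uint8_t sample)
{
    struct task_globals *g = &g_task;  /* base address loaded once */
    g->state = 1;
    g->count++;
    g->total += sample;
}

/* Hypothetical self-check: both layouts hold the same values after
   the same update. */
void check_layouts_agree(void)
{
    update_separate(3);
    update_grouped(3);
    assert(g_total == g_task.total);
    assert(g_count == g_task.count);
    assert(g_state == g_task.state);
}
```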

And one final 'Kinetis specific' comment about 'aligned stack'.  IF you ever allowed your stack to 'grow' to where it would cross the SRAM_U/SRAM_L boundary (center of RAM space) you would HAVE to ensure 'aligned accesses' at that point -- else you get a bus-fault.  In that eventuality, stack-packing would be 'dangerous'.

0 Kudos

640 Views
mjbcswitzerland
Specialist V

Joe

What happens if you don't use local variables as you do at the moment, but instead put them in a single struct that uses a pack option (so that it allows packing bytes as bytes)? You may find that the stack size reduces accordingly.

E.g.

BYTE var1 = 0;
BYTE var2 = 0;
BYTE var3 = 0;
BYTE var4 = 0;

presumably consumes 4 long words on the stack (which I expect to be the compiler behavior and not related to the OS (?)).

typedef struct _PACK stSTACK_VARS
{
    BYTE var1;
    BYTE var2;
    BYTE var3;
    BYTE var4;
} STACK_VARS;

STACK_VARS myVars = {0};

For GCC (as used by KDS)

#define _PACK      __attribute__((__packed__))

In use, var1 = (2 * 4); becomes myVars.var1 = (2 * 4); There shouldn't be any code impact apart from the compiler not being able to keep some variables directly in registers (only relevant for very few variables).
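Putting the pieces above together, a self-contained sketch for GCC might look like the following. The BYTE and DWORD typedefs and the two MIXED structs are assumptions added for illustration. A struct of four BYTEs is 4 bytes with or without packing; the pack attribute only starts to matter once members of different sizes are mixed.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t  BYTE;   /* assumed definition */
typedef uint32_t DWORD;  /* assumed definition */

#define _PACK __attribute__((__packed__))

typedef struct _PACK stSTACK_VARS
{
    BYTE var1;
    BYTE var2;
    BYTE var3;
    BYTE var4;
} STACK_VARS;

/* Hypothetical mixed-size structs to show where packing changes the size. */
typedef struct _PACK stMIXED_PACKED
{
    BYTE  flag;
    DWORD count;   /* sits at offset 1, so it is unaligned */
    BYTE  mode;
} MIXED_PACKED;

typedef struct stMIXED_PLAIN
{
    BYTE  flag;
    DWORD count;   /* padded up to its natural alignment */
    BYTE  mode;
} MIXED_PLAIN;     /* typically 12 bytes on a 32-bit ARM ABI */

/* The packed sizes are fixed by the attribute. */
_Static_assert(sizeof(STACK_VARS) == 4, "four packed bytes");
_Static_assert(sizeof(MIXED_PACKED) == 6, "1 + 4 + 1 packed bytes");

BYTE packed_demo(void)
{
    STACK_VARS myVars = {0};   /* one 4-byte local instead of four */
    assert(sizeof(myVars) == 4);
    myVars.var1 = (2 * 4);
    return myVars.var1;
}
```

The compiler has to access the unaligned count member of MIXED_PACKED with extra instructions, so packing mixed-size members trades speed for space.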

Maybe there is also a compiler option to force packing on the stack too? (I avoid compiler options as far as possible to aid in IDE portability since there is often a generic technique that can be used at the coding level).

Regards

Mark

0 Kudos