Does the stack change between lines of a C function? (aside from stacking of arguments and local variables)


2,501 Views
carloscuev
Contributor V

Hello, I'm trying to design a cooperative RTOS that does not rely on a switch-case in every task to divide it into small chunks and exit the C function properly, as a lot of us have done. With that approach the code for each task gets big and nasty pretty quickly, and in my opinion a switch-case also wastes a lot of CPU cycles and flash memory. Still, I've used that approach several times.

 

Almost a year ago I coded a small preemptive RTOS for the HCS08 (except memory-paged devices), but found that a fully preemptive RTOS is almost useless on RAM-constrained devices: you have to give each task its own small stack, large enough to hold all the return addresses down to the deepest function it can enter, plus the CPU registers stacked when entering interrupts. Sizing each independent stack for the worst case wastes a lot of RAM and requires careful checking that none of them overflows.

 

However, I really liked how small the C code for each task is on a preemptive RTOS, knowing that a context switch will be fired after some time (preemptively) or when the task decides to wait some milliseconds before continuing execution (cooperatively).

 

I have put a lot of thought into this, and I'm trying to design a cooperative RTOS that uses only one stack and no switch-case, forcing a context switch with the software interrupt instead. Something like this:

 

 

Not having an independent stack for each task implies that, prior to each context switch (SWI triggering), the stack pointer must always point to the same address. In other words, the stack must always be the same. All other CPU registers can easily be saved and restored inside the SWI ISR, so there's no reason to worry about them.

 

I'm aware that when entering a C function the stack grows by one byte for each byte of arguments and local variables allocated on it, but let's assume there are none.

 

I'm also aware that when a function is called inside a "function as task" it modifies the stack pointer, but when it returns, the stack pointer goes back to its previous value as if nothing had happened. So this does not worry me.

 

If I haven't made myself clear, I want to know if the stack pointer changes between lines of C code inside a function for any reason. If so, I would like to know if there's a compiler argument to tell it not to do so.

 

So far I've coded a prototype handling some LEDs and a 16x2 LCD and it has been working flawlessly: software interrupts can be anywhere inside my "task function" and the stack pointer doesn't change. But I don't want to develop more complex code if it won't be valid for other cases, because I would have wasted a lot of time.

 

Of course there's an inevitable drawback: functions that act as tasks could NEVER EVER have local variables and arguments that differ from one another, because they would grow the stack by different amounts, and the stack pointer for each "function as task" would be different. The results would be catastrophic. A standard set of arguments (and local variables) could be handled easily by saving and restoring them inside the context switch, as long as each "function as task" uses the exact same ones (in fact they could have different names and data sizes, as long as the total number of bytes is the same for each one, but playing with the data sizes would be dangerous, so let's stick with "the exact same ones").

12 Replies

1,624 Views
carloscuev
Contributor V

Yes, you're right Mac, the context switching begins there, just after setting everything up. I think that curly-bracing such statements is a very good idea.

 

Hello Daniel, yes, I think a complex function would get me into trouble! But anyway, I think my small RTOS will be very helpful for small personal projects. For the code I write for a living I would never try such trickery and prefer to stick with 100% ANSI C code.

 

Thank you both for the great advice. I'll continue developing this idea in 2 weeks, because I'll be on vacation next week, yay!

Please forgive me if I bring this thread back from the dead when I encounter something weird that I can't find an answer to.

 

Regards

Carlos


1,624 Views
carloscuev
Contributor V

Hello bigmac, I tried those pragmas on another function that I was having trouble with (not a function used as a task, because I have had no problems with those), but it didn't work.

 

Pragmas tried:

#pragma NO_ENTRY
#pragma NO_EXIT
#pragma NO_FRAME
#pragma NO_RETURN

(All at once)

 

My main initialization function is called ezTasker_run; in main() I call it like this:

__asm JMP ezTasker_run;

trying to avoid stacking the return address, but even with those pragmas the function starts like this:

ezTasker_run:
00000012 A7FE [2] AIS #-2

... REST OF ASM

Even calling this function normally:

ezTasker_run();

resulted in the same AIS #-2 at function startup, with the call made through a standard JSR.

So my only way to deal with this is to reset the stack pointer on entry.

Any advice to make it work?

Does putting those pragmas above a function affect only that function, or all functions from that line onward?

 

My RTOS prototype is finished and I haven't had problems with extra stacking inside functions, though I may encounter them in the future. Thanks a lot for the tips. Still, I like having one task handle everything about the same problem, rather than splitting a solution into multiple tasks when it's not needed.

 

I think my approach is more cooperative than preemptive, because even though I have a context switch, it only happens when the task decides to do so; there's no timer interrupt triggering it.

 

 

Hello CompilerGuru, I think I made a mess in the first post, but let me explain myself.

 

Every time a context switch is triggered (SWI) the stack must be empty, because my context switch only saves the CPU registers, not the stack pointer: once I have pulled and saved all registers from the stack, the stack must be empty. That means no temp variables were stacked and the task will continue to run nicely the next time it's given the CPU. If the stack is found not empty, the task is flagged and never runs again, just as a precautionary measure. That's why SWI must be triggered with an empty stack, so I don't have to worry about lost temp variables; any temporary stack usage must be cleared before SWI.
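
A minimal host-side sketch of that precautionary check, not the actual ezTasker code. It assumes the HCS08 stacks five bytes on SWI (PCL, PCH, X, A, CCR) and that the handler knows the stack pointer value every task starts from; `task_t`, `TASK_FAULTED` and `swi_check_stack` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this only illustrates the idea. */
#define SWI_FRAME_BYTES 5u   /* assumed: PCL, PCH, X, A, CCR stacked by SWI */

typedef enum { TASK_READY, TASK_FAULTED } task_state_t;

typedef struct {
    task_state_t state;
} task_t;

/* Returns 1 if the task yielded with nothing but the SWI frame on the stack.
 * base_sp:   stack pointer value every task starts from (stack grows downward)
 * sp_in_isr: stack pointer observed on entry to the SWI handler */
static int swi_check_stack(task_t *task, uint16_t base_sp, uint16_t sp_in_isr)
{
    assert(task != NULL);
    if ((uint16_t)(base_sp - sp_in_isr) != SWI_FRAME_BYTES) {
        task->state = TASK_FAULTED;   /* leftover temporaries: never run it again */
        return 0;
    }
    return 1;
}
```

On the target, the two stack pointer values would come from the SWI handler itself; here they are plain parameters so the logic can be exercised on a PC.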

 

An alternative is, as you said, to also save the whole stack, but I would prefer not to, because it implies a slower context switch and extra RAM (a task-specific amount) for each task. It would definitely take less RAM than a preemptive RTOS because, as you know, context switching is done at controlled points, instead of reserving enough dedicated RAM to save the stack in case the task is caught at a deep stack level.

 

So far I have a working RTOS: tasks can trigger a context switch and can go to sleep for some milliseconds and then wake up. I have also implemented a very simple event system, with a maximum of 8 events per task (you guessed right, each one is a bit of a byte), and a very useful event wait timeout feature. If no event happens within a certain time, a "timeout event" is triggered, which is very nice for making SCI, SPI and IIC drivers.
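
One way such an event byte and wait timeout could fit together, as a host-side sketch under assumptions: the struct layout, the separate `timed_out` flag and all function names are invented for illustration and are not taken from the real RTOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task event record: 8 events, one bit each. */
typedef struct {
    uint8_t  pending;     /* events posted to the task */
    uint8_t  wait_mask;   /* events the task is blocked on */
    uint16_t timeout_ms;  /* remaining wait time; 0 = no timeout armed */
    uint8_t  timed_out;   /* set when the wait expired with no event */
} task_events_t;

/* Task side: start waiting for any event in mask, for at most timeout_ms. */
static void event_wait(task_events_t *t, uint8_t mask, uint16_t timeout_ms)
{
    assert(t != NULL);
    t->wait_mask  = mask;
    t->timeout_ms = timeout_ms;
    t->timed_out  = 0;
}

/* ISR or other task: post one or more events. */
static void event_post(task_events_t *t, uint8_t events)
{
    t->pending |= events;
}

/* Called once per millisecond tick: count the timeout down while no awaited
 * event has arrived, and raise the timeout flag when it reaches zero. */
static void event_tick(task_events_t *t)
{
    if (t->timeout_ms != 0 && (t->pending & t->wait_mask) == 0) {
        if (--t->timeout_ms == 0)
            t->timed_out = 1;
    }
}

/* Scheduler side: nonzero when the task may be given the CPU again. */
static int event_ready(const task_events_t *t)
{
    return (t->pending & t->wait_mask) != 0 || t->timed_out;
}
```

A driver task would call `event_wait()` and yield; after being resumed it can inspect `timed_out` to tell a real event from an expired wait.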

 

My main motivation is the very easy to understand code that a full preemptive RTOS enables. 

Here's my working classic 16x2 LCD Driver task:

 

 

sleep() macro is:

#define sleep(a) currentTask->sleep = a; __asm swi;

 

Needless to say, the currentTask->sleep = a assignment makes the scheduler wait that many milliseconds before giving the task the CPU again; meanwhile all other tasks continue running.
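
One thing to watch with a macro that expands to two statements is that `if (x) sleep(10);` would guard only the assignment. A common remedy is the do/while(0) wrapper, sketched here for a host build. The macro is renamed `task_sleep` to avoid clashing with the POSIX `sleep` on a PC, and `TRIGGER_SWI()` is a hypothetical stand-in that counts calls; on the target it would be `__asm swi;`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint16_t sleep; } task_t;   /* assumed minimal task record */

static task_t   demo_task;
static task_t  *currentTask = &demo_task;
static unsigned swi_count;                   /* host-side stand-in for the SWI */

/* On the target this would expand to: __asm swi; */
#define TRIGGER_SWI() (swi_count++)

/* Behaves as a single statement, so it is safe after an unbraced if/else. */
#define task_sleep(a)                 \
    do {                              \
        currentTask->sleep = (a);     \
        TRIGGER_SWI();                \
    } while (0)

static void demo(int enabled)
{
    assert(currentTask != NULL);
    if (enabled)
        task_sleep(10);   /* both the assignment and the SWI are guarded */
}
```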

 

There's a special "system task"; when it detects that it's the only one in the RUN state (not sleeping nor waiting for an event), it runs __asm wait; until a system tick interrupt happens. That system tick gets serviced every millisecond and, as you can guess, decrements the .wait for each task by one. This saves about 1/3 of the microcontroller's power consumption in my demo application; that's about 2.5 and 3 mA.
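
A rough host-side sketch of the tick countdown and the idle test described here. The task table, `NUM_TASKS` and the function names are assumptions; the counter field is called `sleep` to match the macro, and event waiting is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_TASKS 3u          /* assumed number of user tasks */

typedef struct {
    uint16_t sleep;           /* milliseconds left before the task may run */
} task_t;

static task_t tasks[NUM_TASKS];

/* Body of the 1 ms system tick: count down every sleeping task. */
static void system_tick(void)
{
    uint8_t i;
    for (i = 0; i < NUM_TASKS; i++) {
        if (tasks[i].sleep != 0)
            tasks[i].sleep--;
    }
}

/* Nonzero when no user task is runnable, i.e. the system task
 * could execute WAIT until the next tick. */
static int all_tasks_sleeping(void)
{
    uint8_t i;
    for (i = 0; i < NUM_TASKS; i++) {
        if (tasks[i].sleep == 0)
            return 0;
    }
    return 1;
}
```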

 

Anyway, there's still a danger of trying to switch context with a non-empty stack, and even though the context switch detects it and flags the task, I'm eager to learn more about the compiler and ways to make it treat those specific functions differently, or any pseudo-instruction to safely empty the stack (to be called, of course, prior to __asm SWI;).

 

As for your r = (a * 1.2) + (b*2.2) + (c*3.3), I'm thrilled about the heavy load this expression would generate on an 8-bit CPU, but I think the problem may not happen, by the following reasoning:

 


0. <- Prior to entering we have an empty stack

1. r = (a * 1.2) + (b*2.2) + (c*3.3); <- Heavy temp variable stacking, lots of CPU time

2. <- Heavy processing finished, stack dumped, empty again

3. __asm swi; <- Enters with empty stack, no problem here!

 

But in this case we may be in trouble:

 

0. <- Prior to entering we have an empty stack

1. r = (a * 1.2) + (b*2.2) + (c*3.3); <- Heavy temp variable stacking, lots of CPU time

2. <- Compiler sees that the last term is used in line 4 and decides not to dump it from the stack

3. __asm swi; <- Enters with remaining variable in stack, STACK NOT EMPTY

4. r += (c*3.3);

 

Any advice would be very much appreciated. Thanks!


1,624 Views
CompilerGuru
NXP Employee
NXP Employee

There are legal ANSI C code patterns for which the compiler will allocate stack space for the complete function.

You can always move such code into a function called from your task, so there is a way to change the code so that it still works in your setup. But there is no simple compiler option that makes arbitrary code work in your setup.

Sorry, I don't have an S08 compiler installed on this PC, so no explicit sample. But I'm sure that if the code is complex enough, the compiler will allocate stack space per function and not just around the expression.
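
For illustration only (written for a host build, not checked against the S08 compiler), moving the expression into a called function could look like the sketch below. `compute_r`, `task_step` and the `YIELD()` stand-in are hypothetical names; on the target `YIELD()` would be `__asm swi;`, and the approach assumes the compiler does not inline the helper into the task.

```c
#include <assert.h>

static double a = 1.0, b = 2.0, c = 3.0, r;   /* globals with example values */
static unsigned yield_count;                  /* host-side stand-in for the SWI */

/* On the target this would be: __asm swi; */
#define YIELD() (yield_count++)

/* All temporaries for the expressions live in this function's own frame
 * and are released when it returns. */
static void compute_r(void)
{
    r = (a * 1.2) + (b * 2.2) + (c * 3.3);
    r += (c * 3.3);
}

static void task_step(void)
{
    compute_r();   /* after the return, SP is back at its value before the call */
    YIELD();       /* so the context switch sees no leftover temporaries */
}
```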

 

Daniel

 

PS: "Complex" for an S08 means: more temporaries than registers (long/float/double intermediate results), mixed in an expression with control flow (? operator, &&, ||, ...) and other side effects.

 

 

 


1,618 Views
bigmac
Specialist III

Hello,

 

I think that you misunderstand the purpose of the pragmas.  The AIS #-2 is creating temporary space on the stack for something that is occurring within the function - maybe due to the definition of a local variable, or a temporary variable for a process that is about to occur within the function.  You should ascertain the point where this stack space is released, maybe with AIS #2 instruction.

 

Since you are using a JMP instruction, ezTasker_run() becomes an extension of main(), and therefore must never exit. Hence, the context switching would commence during this function. Even if the stack space is not released prior to the switching, it should have no effect on the switching process because the same offset would be present for all tasks. The stack pointer needs to be constant at the start of each task, but the stack does not need to be "empty".

 

The pragmas are applied only to the function definition immediately following the pragmas.

 

With the following code sequence, the placement of the SWI instruction is obviously inappropriate because the arithmetic calculations have not been fully resolved. I am assuming that the variables a, b, c and r are global.

  r = (a * 1.2) + (b*2.2) + (c*3.3);

  __asm swi;

  r +=  (c*3.3);

 

The following code sequence might be more appropriate.

  {  // Commence separate block of code

     r = (a * 1.2) + (b*2.2) + (c*3.3);

     r +=  (c*3.3);

  }

  __asm swi;

 

 

Regards,

Mac

 


1,618 Views
CompilerGuru
NXP Employee
NXP Employee

Why the restriction that the various SWIs cannot be at different stack levels?

Couldn't the SWI handler just copy the whole stack, from the bottom up to the current value, into per-task storage? Then the only requirement would be that each per-task storage is larger than the "deepest" SWI location of that task. Different tasks could use different amounts of stack (given that their storage has the necessary size), and there would be no need to magically align all the stacks.

 

> Not having an independent stack for each task implies that for prior to each context switch (SWI triggering) the 

> Stack Pointer must always point to the same address. In other words, stack must be always the same.

 

I don't follow the reasoning here. Using a single stack for different tasks means that, for a task switch, the stack content of the previously running task has to be copied somewhere else, and the prior stack content of the next task to run has to be copied into the one and only real stack. The amounts of stack needed by the two tasks do not need to be identical if, in addition to its stack content, we store how much stack each task needed at its last task switch.
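
The copy-out/copy-in scheme can be sketched on a host with an array standing in for the real stack. Everything here is an assumption made for illustration: the sizes, the `task_ctx_t` layout and the function names. On the target, the copying would happen inside the SWI handler.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STACK_SIZE 64u    /* assumed size of the single shared stack */
#define SAVE_MAX   16u    /* assumed per-task save area, sized per task in practice */

typedef struct {
    uint8_t depth;              /* bytes in use at the task's last switch */
    uint8_t saved[SAVE_MAX];    /* copy of those bytes */
} task_ctx_t;

static uint8_t shared_stack[STACK_SIZE];   /* grows downward from the end */

/* Copy the used part of the shared stack into the task's save area.
 * sp is the index of the lowest used byte. Returns 0 if it does not fit. */
static int stack_save(task_ctx_t *t, uint8_t sp)
{
    uint8_t depth;
    assert(sp <= STACK_SIZE);
    depth = (uint8_t)(STACK_SIZE - sp);
    if (depth > SAVE_MAX)
        return 0;
    memcpy(t->saved, &shared_stack[sp], depth);
    t->depth = depth;
    return 1;
}

/* Copy the task's saved bytes back; returns the stack pointer to resume with. */
static uint8_t stack_restore(const task_ctx_t *t)
{
    uint8_t sp = (uint8_t)(STACK_SIZE - t->depth);
    memcpy(&shared_stack[sp], t->saved, t->depth);
    return sp;
}
```

Because each task records its own `depth`, two tasks can yield at different stack levels; the cost is the copy time plus the save area for each task.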

 

When requiring the same stack size for all tasks, we save the storage space for the stack size, 2 bytes per task. But at the same time, we require each task to allocate the stack of the biggest task, so we might end up wasting more space by that than we save by not storing the stack size.

 

Also note that locals are not the only reason for the compiler to allocate stack space. For complex expressions it will also allocate stack space for intermediate results. Just try to use floating point (well, don't try): for operations like "r = (a * 1.2) + (b*2.2) + (c*3.3)" the compiler has to store intermediate results somewhere.

 

Daniel


1,618 Views
bigmac
Specialist III

Hello,

 

As a precautionary measure, prior to the definition of each task function you might include the following #pragmas -

NO_ENTRY to prevent register values being pushed to the stack when the task function is entered,

NO_FRAME to inhibit the inclusion of any stack frame, and

NO_EXIT to suppress an unnecessary RTS instruction at the conclusion of the task function.

 

When a function is called by a task, I would assume that the stack pointer would be the same as just prior to the function being called - any temporary stack usage by the function should be released prior to the function exit.

 

If you need to do some arithmetic computation within a task function, this may involve the automatic creation of some additional temporary variables and some sub-function usage, so the context switch should not occur until the computation has been resolved. To ensure the timely release of temporary variables, you might place all the calculations within a separate block defined by an additional pair of curly braces.

 

I can see no advantage in incorporating the once only initialisation processes within a task.  I would simply call these functions as part of the MCU initialisation prior to entering the main loop.  Looping within the task function would then become unnecessary since the context switching will be cyclic anyway.

 

Additionally, I would tend to treat the sending to line 1 or line 2 of the LCD as two separate tasks.  The tasks within the example would then become much simpler.

 

#pragma NO_ENTRY
#pragma NO_EXIT
#pragma NO_FRAME
void task_ADC_proc( void)
{
   if (flag_ADCresult) {
      flag_ADCresult = 0;
      process_result();
   }
   __asm swi;
}

#pragma NO_ENTRY
#pragma NO_EXIT
#pragma NO_FRAME
void task_LCD_sendL1( void)
{
   if (LCDbuf_L1)  // Valid pointer to buffer
      send_first_row( LCDbuf_L1);
   __asm swi;
}

#pragma NO_ENTRY
#pragma NO_EXIT
#pragma NO_FRAME
void task_LCD_sendL2( void)
{
   if (LCDbuf_L2)  // Valid pointer to buffer
      send_second_row( LCDbuf_L2);
   __asm swi;
}

 

Of course, all variables referred to within the task functions need to be global or static, and any that can be written within an ISR function will also need to be declared volatile.

 

In previous projects, I have personally avoided the need for pre-emptive task switching, but have used a very simple task management arrangement of non-overlapping timeslots into which one or more individual tasks can be placed.  For example, with a timeslot period of 6.25ms, I would make use of timeslot repeat intervals of 25ms, 50ms and 100ms, to suit different tasks.  Where more complex control processing is required by a particular task, I would also make use of a finite state machine (FSM) configuration, involving a table of function pointers to select the function required by the current state of the FSM.
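
A loose host-side sketch of how such a timeslot arrangement with an FSM function table might be laid out. The 16-slot cycle follows from 100 ms / 6.25 ms; the task names, the `phase` field and the two-state FSM are invented for the example and are not taken from any real project.

```c
#include <assert.h>
#include <stdint.h>

#define SLOTS_PER_CYCLE 16u    /* 16 x 6.25 ms = 100 ms */

static unsigned adc_runs, lcd_runs, fsm_runs;

/* FSM: a table of function pointers indexed by the current state. */
typedef enum { ST_IDLE, ST_BUSY, ST_COUNT } state_t;
static state_t fsm_state = ST_IDLE;

static void state_idle(void) { fsm_state = ST_BUSY; }
static void state_busy(void) { fsm_state = ST_IDLE; }
static void (* const state_table[ST_COUNT])(void) = { state_idle, state_busy };

static void task_adc(void) { adc_runs++; }
static void task_lcd(void) { lcd_runs++; }
static void task_fsm(void) { fsm_runs++; state_table[fsm_state](); }

typedef struct {
    uint8_t period;          /* repeat interval in slots: 4, 8 or 16 */
    uint8_t phase;           /* slot offset, chosen so that tasks never share a slot */
    void  (*run)(void);
} slot_entry_t;

static const slot_entry_t schedule[] = {
    { 4,  0, task_adc },     /* every 25 ms  */
    { 8,  1, task_lcd },     /* every 50 ms  */
    { 16, 2, task_fsm },     /* every 100 ms */
};

/* Called once per 6.25 ms timeslot with slot = 0..15. */
static void run_slot(uint8_t slot)
{
    uint8_t i;
    assert(slot < SLOTS_PER_CYCLE);
    for (i = 0; i < sizeof schedule / sizeof schedule[0]; i++) {
        if ((slot % schedule[i].period) == schedule[i].phase)
            schedule[i].run();
    }
}
```

Each task runs to completion inside its slot, so there is no context to save and only the normal call stack is in use.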

 

Regards,

Mac


1,618 Views
kef
Specialist I
  • If I haven't made myself clear I want to know if stack pointer changes between lines
  • of C code inside a function for any reason.

Yes.

 

  • If so, I would like to know if there's a compiler argument to tell him not to do it.

AFAIK there's no such argument.


1,618 Views
carloscuev
Contributor V

Thanks kef for your reply.

 

I've been reading about compiler optimizations, and optimizations such as "Create sub-functions with common code" seem like one way to inadvertently change SP between lines of C code :(

 

There's also this "Special Feature" I couldn't understand; could you give me some clue about it?


1,618 Views
kef
Specialist I

I'm not sure which "Special Feature" you are asking about. Your link points to some compiler options. Which one are you asking about?

 

You may try setting the compiler options for easiest debugging. This may reduce the risk of SP changing between lines, but you will still be at risk. Add just one variable, or let the compiler allocate some temporary variable on the stack, and your task stacks are no longer the same... This is fine for assembly, where you can control stack usage, but not for C. Instead I would try optimising the task stacks and adding code to monitor stack usage, which would trigger some error when a task's stack pointer is dangerously close to the bottom of the stack.
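
Such a monitor could be as small as a margin check run at every context switch. This is a sketch under assumptions: the margin value, the error counter and the function name are made up, and the stack pointer is passed in as a parameter so the logic can run on a PC.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_MARGIN 8u         /* assumed safety margin in bytes */

static unsigned stack_errors;   /* stand-in for whatever error action is chosen */

/* stack_limit: lowest address the task's stack may reach (stack grows downward)
 * sp:          the task's current stack pointer
 * Returns 1 while at least STACK_MARGIN bytes of headroom remain. */
static int stack_check(uint16_t sp, uint16_t stack_limit)
{
    if (sp < stack_limit || (uint16_t)(sp - stack_limit) < STACK_MARGIN) {
        stack_errors++;         /* too close to the limit, or already past it */
        return 0;
    }
    return 1;
}
```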


1,618 Views
carloscuev
Contributor V

Oh Sorry, you're right, this is the correct link:

 

http://www.freescale.com/infocenter/index.jsp?topic=%2Fcom.freescale.doc.mcu.hcs08_compiler%2Fdoc%2F...

 

Thank you for your suggestions. Is there a pragma to disable optimization only for "pragma enclosed" code? I can't find out how.


1,624 Views
CompilerGuru
NXP Employee
NXP Employee

It's a bit of an unfortunate manual entry. Basically, forget/ignore the _STACK and _ADJ directives. All current HC08 compilers have been "ICG-based compilers" for many years, so they just ignore those directives. If you need to write such code, use the assembler and not the compiler.

 

Daniel


1,620 Views
kef
Specialist I

Your link talks about an inline asm pseudo-instruction, which should be used to inform the compiler that your smart asm code adjusted SP. If you have some local variables on the stack, your code may modify SP using push, pop or txs instructions, which may leave the compiler unable to address the variables properly until you restore SP. The _STACK pseudo-instruction helps solve this problem.
