"PRINTF" SRAM Overhead/Stack Overflow

myke_predko · ‎04-20-2021

I'm running my FreeRTOS (with USB CDC) development code on a FRDM-K22F and I've just started seeing something strange. To monitor operations, I periodically "PRINTF" (using the SDK library code) to put out a message. All works well when I run with MCUXpresso debug active.

If I stop the debugger (and optionally remove the USB cable from the debug port of the Freedom board), the code on the Freedom board continues executing until it encounters a "PRINTF" statement, at which point it stops and indicates a Stack Overflow issue.

The "PRINTF" statements are typically only built into the code when I specify the pre-processor symbol SDK_DEBUGCONSOLE=0 and all "PRINTF" instances are surrounded by "#if" statements like:

#if (2 != SDK_DEBUGCONSOLE)
  PRINTF("Done");
#endif

When I remove the "PRINTF" statements by setting SDK_DEBUGCONSOLE=2 the application runs fine, no stack overflows detected regardless of whether or not it is connected to the development PC and whether or not debug is active.
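Editor's note: the per-call "#if" guard described above can also be folded into a single wrapper macro. The following is a minimal sketch, not SDK code: DBG_PRINTF is a hypothetical name, and PRINTF is mapped to the standard printf only so the example builds off-target (in an SDK project it would come from fsl_debug_console.h).

```c
#include <stdio.h>

/* Assumption: follows the thread's convention that SDK_DEBUGCONSOLE == 2
 * means "no console output". Normally set on the compiler command line. */
#ifndef SDK_DEBUGCONSOLE
#define SDK_DEBUGCONSOLE 0
#endif

/* Stand-in so the sketch builds on a PC; not the SDK definition. */
#ifndef PRINTF
#define PRINTF printf
#endif

/* DBG_PRINTF is a hypothetical wrapper, not an SDK macro. */
#if (2 != SDK_DEBUGCONSOLE)
#define DBG_PRINTF(...) PRINTF(__VA_ARGS__)
#else
#define DBG_PRINTF(...) ((void)0)
#endif

/* Example call site: no #if/#endif needed around the call. */
int report_done(void)
{
    DBG_PRINTF("Done\n");
    return (2 != SDK_DEBUGCONSOLE); /* 1 when output is compiled in */
}
```

One caveat with this design: when the macro expands to nothing, its arguments are not evaluated, so calls with side effects in the argument list behave differently between the two builds.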

So, I believe the problem is with the PRINTF statements. Now, I've increased the "configTOTAL_HEAP_SIZE" as well as the stack size for the first task that executes a "PRINTF" statement but no joy. 25k is the total stack size used by all the tasks and the total system heap is 36k.

I haven't checked running the code without MCUXpresso Debug active for a week or so, during which I've added a number of tasks and queues and a mutex along with increasing the total number of queues in FreeRTOSConfig.h. However, as indicated, when MCUXpresso Debug is active, no issues or overflows are detected or indicated. Along with that, no task's stack is close to its threshold.

Rather than pounding out different ideas, I thought I'd ask if anybody has any thoughts as to where I should look to understand this issue.

Thanx!

myke_predko · ‎05-12-2021

This thread got to be quite long before the solution became understood. I am marking it as "Solved" so that anybody in the future looking to understand this issue will see that it hasn't been left hanging with more than 50 replies.

The problem was that I modified "semihost_hardfault.c" with code to turn on an LED to indicate that a "hard fault" occurred. When I put in the extra code, I was under the impression that the method was for out of bounds conditions (this assumption was made because when I encounter an out of bounds write, execution stops at the start of "semihost_hardfault.c") and not a tool to handle semihost error conditions.

I should have a) read the comments in the "semihost_hardfault.c" source file and b) not touched the file.

When I reverted back to the original code, the issues of the application going into an invalid state when "PRINTF" is encountered and no debugger active went away.

Don't change "semihost_hardfault.c"

I appreciate the help from @ErichStyger, @jingpan & @bobpaddock in helping me understand what the issue was.


ErichStyger · ‎04-22-2021

as you are obviously using printf which usually is using malloc() (at least for your file I/O below): what library are you using (newlib-nano?) and what kind of FreeRTOS heap?

Note that the usual malloc() implementations are not thread safe, so you should use a thread-safe version of it. See https://mcuoneclipse.com/2020/11/15/steps-to-use-freertos-with-newlib-reentrant-memory-allocation/ , https://mcuoneclipse.com/2017/07/02/using-freertos-with-newlib-and-newlib-nano/ and especially https://nadler.com/embedded/newlibAndFreeRTOS.html

Erich

myke_predko · ‎04-22-2021

Hey @ErichStyger

I agree with @bobpaddock - NXP has to fix the html errors in this system.

I just lost a long email explaining what I have done to try and identify what is causing the problem. The changes are basically:

Reverting the ADC code to the state machine used before.
Drastically increasing STACK and HEAP build values (to 16k each)
Optionally build with the Mutex operations I added taken out
Searched out assert statements and commented them out for a build

Nothing changes how the problem is manifested.

I have read the articles you linked and I'm not comfortable replacing malloc at this time for two reasons. First, making the change doesn't explain exactly what happened - if it fixes it, can I be sure that the problem is fixed going forwards? I've seen a lot of "fixes" over the years that seem to work until their position in memory changes or that mask the problem (i.e., increasing a stack size), leaving a bigger problem to deal with. Secondly, it is a lot of work to make the changes and validate the code, especially considering that the root cause isn't understood and product firmware will NOT have any PRINTF statements (I selectively build using the SDK_DEBUGCONSOLE variable). If I don't have any PRINTF statements, I don't seem to have any problems, so why would I do a lot of work for something that I don't know is going to fix the underlying condition when the code, as of right now, appears to work just as well without it?

This is not to say that I don't think that my code is the cause of the problem. I clearly added something (and quite a bit of code was put in) before I discovered this issue in going through my test & validation process. I would just like to figure out what's going on.

My guess right now is that the PRINTF statement (and the code behind it) is doing something more than just attempting to send the associated message and returning without caring whether or not the message got through (which is how I would have coded it).

Any ideas where to look next?

myke

ErichStyger · ‎04-24-2021

I would do an 'attach' to the target to see what is causing the stack overflow to be triggered. I suspect that it might not technically be an overflow but something writing to your stack. With the attached session you should be able to inspect the cause, what is on the stack, and why it triggered. If it is a write to your stack from somewhere, you should see the changed pattern on the stack. Then I would restart the application with a watchpoint (https://mcuoneclipse.com/2018/08/11/tutorial-catching-rogue-memory-accesses-with-eclipse-and-gdb-wat...) set to that address or memory range to see what is writing to it.
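Editor's note: the "changed pattern on the stack" idea can be illustrated off-target. FreeRTOS fills a task stack with the byte 0xa5 when the task is created, and pattern-based checks look for that fill near the low end of the stack. The sketch below is a host-side simulation under that assumption; stack_unused_bytes and the two demo functions are hypothetical names, not FreeRTOS APIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u /* fill value FreeRTOS writes into new task stacks */

/* Count untouched fill bytes starting at the low end of a descending stack. */
size_t stack_unused_bytes(const uint8_t *stack_low, size_t size)
{
    size_t n = 0;
    while (n < size && stack_low[n] == STACK_FILL_BYTE) {
        n++;
    }
    return n;
}

/* Normal use: the top 20 bytes of a 64-byte stack have been written. */
size_t demo_normal_use(void)
{
    uint8_t stack[64];
    memset(stack, STACK_FILL_BYTE, sizeof stack);
    memset(stack + 44, 0x11, 20);
    return stack_unused_bytes(stack, sizeof stack);
}

/* Rogue write: same usage, but something else clobbers the lowest byte. */
size_t demo_rogue_write(void)
{
    uint8_t stack[64];
    memset(stack, STACK_FILL_BYTE, sizeof stack);
    memset(stack + 44, 0x11, 20);
    stack[0] = 0x00; /* stray write from unrelated code */
    return stack_unused_bytes(stack, sizeof stack);
}
```

In the second case the scan reports no free space even though the stack itself only used 20 bytes, which is why a pattern-based check can flag an "overflow" that is really a stray write. A watchpoint on the clobbered address is then the way to find the writer.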

Just my 1 cent without knowing the exact cause.

myke_predko · ‎04-24-2021

Hi @ErichStyger

As I noted in my previous comment, I think we're coming to a similar conclusion that something is causing a stack to get trashed when the debugger is halted and a "PRINTF" statement is executed.

I suspect the issue is within the PRINTF statement code (and maybe a reentrancy issue with malloc that you previously mentioned). Would it be worth going to the HEAP1 model (or another one) that doesn't allocate SRAM? Other than the USB driver, I don't believe any of my code or required OS features allocates/frees SRAM.

As I said previously, I'm running on Linux and there doesn't seem to be an "attach" option (as you've outlined in "Attaching to a Running Target with Segger J-Link, GDB and Eclipse") in the programmer window - or do you know of one? As I said, I think I've got to find a Win10 laptop with enough disk space to handle MCUXpresso.

I'm very familiar with setting watchpoints, so the process sounds good, I just have to get development hardware that can support it.

Thank you for the suggestions.

myke

ErichStyger · ‎04-24-2021

Hi @myke_predko ,

that link refers to the Kinetis Design Studio, things are different for the MCUXpresso IDE.

I could swear I wrote about this feature, but now Google fails to find it (so I can blame Google?). Anyway, it is here:

I hope this helps,

Erich

bobpaddock · ‎04-21-2021

There is a really obscure bug in printf that requires the stack to be aligned on 64 bit boundaries (8 bytes).
The bug usually shows itself when trying to print 64-bit numbers.

In most non-RTOS systems the stack is placed at the end of memory which would be on a 64 bit boundary, so the bug goes unnoticed.

Are all of the RTOS stacks on a 64 bit boundary?

[I edited this to correctly say 64-bit, not byte as when I initially posted.]

myke_predko · ‎04-21-2021

Hi @bobpaddock

I just looked at the .map file for the current as well as past versions of the code. The end of memory generally looks like the listing below, noting that it is aligned on a four byte (32 bit) boundary. I don't print (or even use) any 64 bit numbers.

.heap 0x0000000020006388 0x1000
0x0000000020006388 _pvHeapStart = .
0x0000000020007388 . = (. + _HeapSize)
*fill* 0x0000000020006388 0x1000 
0x0000000020007388 . = ALIGN (0x4)
0x0000000020007388 _pvHeapLimit = .
0x0000000000001000 _StackSize = 0x1000

.heap2stackfill
0x0000000020007388 0x1000
0x0000000020008388 . = (. + _StackSize)
*fill* 0x0000000020007388 0x1000

.stack 0x000000002000f000 0x0
0x000000002000f000 _vStackBase = .
0x000000002000f000 . = ALIGN (0x4)
0x0000000020010000 _vStackTop = (. + _StackSize)
0x0000000000000000 _image_start = LOADADDR (.text)
0x0000000000027120 _image_end = (LOADADDR (.data) + SIZEOF (.data))
0x0000000000027120 _image_size = (_image_end - _image_start)

Thank you for the comment.
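Editor's note: although the linker script only requests ALIGN (0x4), the addresses in the listing above can be checked for 8-byte alignment directly. The helper below is a hypothetical sketch (is_aligned8 is not an SDK or FreeRTOS function) that tests the low three address bits.

```c
#include <stdint.h>

/* Returns 1 if addr is a multiple of 8 (64-bit aligned), else 0. */
int is_aligned8(uint32_t addr)
{
    return (addr & 0x7u) == 0u;
}
```

Applied to the values in the map output, 0x20006388, 0x20007388, 0x2000f000 and 0x20010000 are all multiples of 8. This only covers the linker-placed heap and main stack; FreeRTOS task stacks are allocated elsewhere at run time and would need to be checked separately.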

ErichStyger · ‎04-20-2021

Hi @myke_predko ,

do you print to a serial port or are you using semihosting?

If using semihosting, be aware that without a debugger attached it will cause a hard fault to be handled by your semihosting hard fault handler. It will add to the MSP (interrupt stack) (see https://mcuoneclipse.com/2016/08/28/arm-cortex-m-interrupts-and-freertos-part-3/ if this is new to you). So I suggest to increase the interrupt stack space. In MCUXpresso you can do this in the linker settings (there is a setting for the Stack under 'managed linker script').
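Editor's note: the recovery idea behind a semihosting hard fault handler can be sketched in plain C. On Cortex-M, a semihosting request is issued with the instruction "bkpt 0xAB" (Thumb opcode 0xBEAB); with no debugger attached, that breakpoint escalates to a hard fault. A handler can inspect the stacked PC, and if it points at that opcode, step past it and return an error to the caller. The code below is a host-side simulation of that logic only: skip_semihost_bkpt is a hypothetical helper, the real handler is written in assembly, and the -1 return value is an assumption made for illustration.

```c
#include <stdint.h>

#define BKPT_SEMIHOST_OPCODE 0xBEABu /* Thumb encoding of "bkpt 0xAB" */

/* Order of the registers pushed by a Cortex-M core on exception entry. */
enum { F_R0, F_R1, F_R2, F_R3, F_R12, F_LR, F_PC, F_XPSR, F_WORDS };

/*
 * 'frame' is the stacked exception frame; 'opcode_at_pc' is the halfword
 * found at the stacked PC. Returns 1 if the fault was a semihosting BKPT
 * and has been skipped, 0 for any other fault (a real handler would then
 * treat it as a genuine hard fault).
 */
int skip_semihost_bkpt(uint32_t frame[F_WORDS], uint16_t opcode_at_pc)
{
    if (opcode_at_pc != BKPT_SEMIHOST_OPCODE) {
        return 0;
    }
    frame[F_PC] += 2u;         /* resume after the 16-bit BKPT instruction */
    frame[F_R0] = 0xFFFFFFFFu; /* assumed error code (-1) for the failed host call */
    return 1;
}
```

The sketch also suggests why extra code in such a handler is risky: the handler runs every time a semihosted call is made without a debugger, not only on genuine faults.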

I hope this helps,

Erich

myke_predko · ‎04-20-2021

Hi @ErichStyger

That's really helpful - I'm using semihosting and the default.

I was aware of the two stack architecture in the Cortex-M but I thought that the task stack (PSP) was used during interrupts and the MSP was used within FreeRTOS to avoid adding to the process's stack data (and also simplify restoring the process's context information).

When I take a look at "Properties"->"C/C++ Build"->"Settings"->"MCU Linker"->"Managed Linker Script" I see:

And when I roll over the "Default" for "Stack", I see:
"Actual size allocated: 4.0kB (0x1000)
Setting to any non-numeric value will result in Default"

Is this what you are suggesting that I change? If so, what value should I put in here?

Finally, any idea why this stack would now have a problem? Looking over the changes, I added:

ADC Interrupt Handler
2x Queues
1x Mutex

I wouldn't think that would significantly add to the amount of memory required by the MSP.

Again, many thanx,

myke

ErichStyger · ‎04-20-2021

Hi @myke_predko ,

The PSP and the FreeRTOS stack space are used for the task (without the interrupt stacks). The interrupts are stacked on the MSP, so all variables and pushed registers of the interrupts will end up on the MSP stack.

I'm not 100% positive that this is really your issue, but using printf with semihosting without a debugger will trigger a hard fault on top of everything else, so this might be the cause. That said, the hardfault handler itself (if you are using the one from the SDK) should not use much stack.

I recommend you check the interrupt stack size used with the Image Info:

It does not account for interrupt nesting (this is not something this view can do), but if you know your interrupt priorities you should get an idea about the stack consumed.
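Editor's note: a rough nesting estimate can be worked out by hand from the per-handler figures and the interrupt priorities. The sketch below is one way to do that arithmetic, under stated assumptions: handlers at the same priority cannot preempt each other (so only the deepest one counts per level), handlers at different priorities can nest (so levels add up), and each level costs one hardware exception frame of 32 bytes (8 words), or 104 bytes (26 words) when floating-point context is stacked, plus up to 4 bytes of alignment padding. The type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define FRAME_BASIC_BYTES 32u  /* r0-r3, r12, lr, pc, xPSR */
#define FRAME_FPU_BYTES   104u /* basic frame plus stacked FP context */
#define FRAME_ALIGN_PAD   4u   /* possible padding for 8-byte alignment */

typedef struct {
    uint8_t  priority;      /* preemption priority level of the handler */
    uint32_t handler_bytes; /* stack used by the handler and its callees */
} isr_info_t;

/* Sum, over every priority level in use, the deepest handler at that
 * level plus one exception frame. */
uint32_t worst_case_msp_bytes(const isr_info_t *isr, size_t count,
                              uint32_t frame_bytes)
{
    uint32_t total = 0;
    for (int level = 0; level < 256; level++) {
        uint32_t deepest = 0;
        int used = 0;
        for (size_t i = 0; i < count; i++) {
            if (isr[i].priority == level) {
                used = 1;
                if (isr[i].handler_bytes > deepest) {
                    deepest = isr[i].handler_bytes;
                }
            }
        }
        if (used) {
            total += frame_bytes + FRAME_ALIGN_PAD + deepest;
        }
    }
    return total;
}
```

For example, two handlers at priority 2 using 40 and 100 bytes and one handler at priority 5 using 16 bytes give (32 + 4 + 100) + (32 + 4 + 16) = 188 bytes with basic frames. The figure leans conservative, since the frame for the first interrupt is pushed onto whichever stack was active when it arrived, and it should be extended by one more level if a fault handler can run on top of the deepest interrupt.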

And yes: that is the linker setting you showed responsible for the MSP stack size. 4k should be plenty, but who knows what your interrupts are doing?

The other thought I have: do you reset the MSP stack at scheduler start? The NXP SDK standard examples are doing this, and if you have some data present on the MSP stack this will be not good.

See https://mcuoneclipse.com/2019/01/20/freertos-how-to-end-and-restart-the-scheduler/ on that topic.

That's why I have added

configRESET_MSP

to FreeRTOS to configure this. So check if you reset it, e.g.

So depending on this you have more (or less) usage of the MSP.

I hope this helps,

Erich

myke_predko · ‎04-21-2021

Hi @ErichStyger

Well, no luck with changing the stack value - I doubled the "Default" value to 0x2000 (8k) and I get the same behaviour.

I looked at the Call Graph for a previous build (which works without any issues) and the only difference I can see is the "ADC_IRQHandler" which is in the new project and not in the old as you can see here:

I'm a bit surprised the ADC Handler has a call depth of 8 levels - I'm using the SDK code which appears to be fairly inefficient - right @mjbcswitzerland ?

So, any suggestions on what to look at next? Maybe I'm not changing the right Property?

ErichStyger · ‎04-21-2021

Hi @myke_predko ,

can you share a picture of that ADC0 interrupt handler in the image info unfolded to see all the dependencies?

I'm wondering why it is not able to show the stack used unless it is implemented in assembly?

Erich

myke_predko · ‎04-21-2021

@ErichStyger

Here's the interrupt handler:

and here's the sensor value variable declare:

So, as coded, up to eight interrupt sources can be used and stored in an array. I just looked at the assembly code (below) and I don't see anything that I would consider extreme. I also followed the code through "ADC16_GetChannelConversionValue" and there is only one subroutine call in the method, which means that the depth of this interrupt service routine should be "3" and not the "8" produced by the Call Graph (I guess I don't understand how it works).

ADC0_IRQHandler:
0000205c: push {r3, r4, r7, lr}
0000205e: add r7, sp, #0
62 adcpollConversionDoneFlag = TRUE;
00002060: ldr r3, [pc, #32] ; (0x2084 <ADC0_IRQHandler+40>)
00002062: movs r2, #1
00002064: str r2, [r3, #0]
64 adcpollSensorValue[adcpollActive] = ADC16_GetChannelConversionValue(ADC0_ADC16_BASE
00002066: ldr r3, [pc, #32] ; (0x2088 <ADC0_IRQHandler+44>)
00002068: ldr r4, [r3, #0]
0000206a: movs r1, #0
0000206c: ldr r0, [pc, #28] ; (0x208c <ADC0_IRQHandler+48>)
0000206e: bl 0x2030 <ADC16_GetChannelConversionValue>
00002072: mov r3, r0
00002074: ldr r2, [pc, #24] ; (0x2090 <ADC0_IRQHandler+52>)
00002076: str.w r3, [r2, r4, lsl #2]
879 __ASM volatile ("dsb 0xF":::"memory");
0000207a: dsb sy
880 }
0000207e: nop 
71 }
00002080: nop 
00002082: pop {r3, r4, r7, pc}
00002084: mvns r4, r4
00002086: subs r7, r7, #7
00002088: mvns r0, r5
0000208a: subs r7, r7, #7
0000208c: add sp, #0
0000208e: ands r3, r0
00002090: mvns r4, r0
00002092: subs r7, r7, #7

Thoughts or places where you see issues?

myke

ErichStyger · ‎04-21-2021

Hi @myke_predko ,

the depth of this interrupt service routine should be "3" and not the "8" produced by the Call Graph (I guess I don't understand how it works).

It counts how many functions are called to the max and stacked up. Could you share the image as I did for that depth unfolded? Are there any printf() in it?

If it is as on my SDK example, the ADC16_GetChannelConversionValue adds up to the number of calls because of that assert in it.

Erich

myke_predko · ‎04-21-2021

@ErichStyger

Here is the Call Graph Fully Expanded.

No PRINTFs in it.

myke

ErichStyger · ‎04-21-2021

No PRINTFs in it.

Ouch! Even worse, you have file I/O in it (writing to a file) :-(.

Your assert must be pretty fancy

Erich

myke_predko · ‎04-21-2021

@ErichStyger

I haven't done anything with the assert - this is all the SDK code.

When I look at the assert macro expansion I get:

((channelGroup < (uint32_t)(2)) ? (void)0 : \
__assertion_failed("/home/myke/MCUXpressoIDE_Workspace/FRDMK22F_SmartDisplay_OLED_26/drivers/fsl_adc16.h" ":" "486" " : " "channelGroup < (uint32_t)FSL_FEATURE_ADC16_CONVERSION_CONTROL_COUNT"))

I'm trying to find the source to "__assertion_failed". Any ideas where I can find it?
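Editor's note: the expansion quoted above follows the usual shape of an assert macro: a conditional expression whose failing branch calls a reporting function with a "file:line : expression" string. The sketch below reproduces that shape with hypothetical names (MY_ASSERT, my_assertion_failed) so it can be built on a PC; it is not the library's implementation, and the real failure function may do considerably more (such as the file I/O seen in the call graph).

```c
#include <stdio.h>

static const char *last_failure = NULL; /* most recent failure message */

/* Hypothetical stand-in for the library's failure function. */
static void my_assertion_failed(const char *msg)
{
    last_failure = msg;
    fprintf(stderr, "assertion failed: %s\n", msg);
}

/* Two-step stringification so __LINE__ expands to its number first. */
#define MY_STR2(x) #x
#define MY_STR(x)  MY_STR2(x)

/* Same structure as the quoted expansion: (cond) ? (void)0 : report(...) */
#define MY_ASSERT(e) \
    ((e) ? (void)0 : my_assertion_failed(__FILE__ ":" MY_STR(__LINE__) " : " #e))

/* Returns 1 if the check passed, 0 if the failure path was taken. */
int channel_ok(unsigned channelGroup)
{
    last_failure = NULL;
    MY_ASSERT(channelGroup < 2u);
    return last_failure == NULL;
}
```

Because the failure branch is a real function call, a static call graph includes everything reachable from the reporting function in the depth of any routine that asserts, even if the assertion never fires at run time. With the standard assert.h convention, building with NDEBUG defined removes the check entirely; whether that applies to this project's configuration is an assumption to verify.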

jingpan · ‎04-26-2021

Hi,

You can change the library setting in the Quickstart Panel

You can find this in MCUXpresso IDE user's guide.

Regards,

Jing

myke_predko · ‎04-26-2021

Hi @jingpan

Thank you - you helped me identify where the problem is.

When I read the section of the manual you've pointed out I see:

This describes basically what I'm seeing, so you've identified the issue - see below.

I must confess, I never read through the semihost_hardfault.c comments - I honestly thought that its purpose was to catch invalid memory space accesses as when they do happen, the "fault" window becomes active and the "semihost_hardfault.c" file is displayed as:

So, looking at "HardFault_Handler" and not understanding its context (or looking beyond the few lines that came up when there was an access issue), I added a write to an LED to indicate that the Handler had been invoked when the debugger wasn't connected. I had tested it and demonstrated it worked when I forced the code to execute at an invalid location. I added the LED write a week+ ago and I didn't do my testing of the application with the debugger off until I had made a number of other changes, so I assumed it was all right and that it was the new changes that caused the problem.

If you've read through the entire thread here, you'll see that my plan was always to remove the debug "PRINTF" statements using a build variable so my question back to you is, is this appropriate or should I use something other than the default "Redlib (semihost-nf)" for "QuickSettings>>"->"Set Library Header Type"? What do you recommend? Right now, I'm debugging code on a FRDM-K22F board with added hardware but I will be using a custom board with a J-Link Plus programmer/debugger when the boards come in.

I also want to point out that the "Set Library Header" options are all greyed out - how do I enable them? Is this something that I should select when I'm creating the project or is there a way of changing an existing project?

You're amazing - thank you, thank you, thank you.

myke

ErichStyger · ‎04-22-2021

It is in fsl_assert.c

myke_predko · ‎04-22-2021

Hi @ErichStyger

I can't find "__assertion_failed" anywhere in the project and I definitely can't find "fsl_assert.c". When I click on it and hit F3, nothing happens.

Here is the project's "driver" folder where I'd expect it:


Any ideas where to look?

myke