FRDM-K22F monitoring maximal stack/heap usage of program

xmksa · ‎03-22-2021

I'm using the MCUXpresso IDE, and it can show me the current heap/stack usage when I pause the debugger at the right time. Is there a function that can monitor and show me the maximum heap/stack usage my program requires? If one function allocates memory and frees it, I won't see that at the end of my program; I will only see the current stack/heap usage.

ErichStyger · ‎03-23-2021

Hi @xmksa ,

You won't have more visibility into the heap usage, except monitoring the 'last allocated pointer' which is what the view in MCUXpresso IDE does.

As for the stack size (by function and accumulated), the MCUXpresso IDE has a nice 'Image Info' view which gives you that. See section 'RAM: heap and Stack' in https://mcuoneclipse.com/2019/08/17/tutorial-how-to-optimize-code-and-ram-size/

If using the standard library, it won't have that information available unless you recompile the library (for how to do this in general, see https://mcuoneclipse.com/2014/08/23/gnu-libs-with-debug-information-rebuilding-the-gnu-arm-libraries... , but be warned that this won't be easy).

If you prefer to do this on the command line, see https://mcuoneclipse.com/2015/08/21/gnu-static-stack-usage-analysis/ plus you can list the size per module with the GNU size utility (see https://mcuoneclipse.com/2015/08/21/gnu-static-stack-usage-analysis/ ).

I hope this helps,

Erich

PS: if using FreeRTOS, you have dedicated views for each task stack and the heap.

xmksa · ‎03-23-2021

When I use FreeRTOS in MCUXpresso I can't change the heap configFRTOS_MEMORY_SCHEME to 1 or 2. If I could change it to 1 and use the built-in HeapUsage(FreeRTOS) function in the IDE, I could see the maximum heap usage: since heap1 doesn't allow freeing memory, the usage shown at the end of my program would be the maximum.

I tried changing my heap allocations into stack allocations, then increasing the stack size and using the -fstack-usage flag to produce .su files, which contain the stack usage of each function. Next I determined the maximum memory usage by creating a map of function calls: if a function calls another function, the maximum memory usage is the usage of these two functions together.
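The call-map idea above can be sketched as a small host-side C program. This is only an illustration under assumptions: the function names, frame sizes and call relationships in the table are made up (in practice they would be copied from the .su files and the call tree), and the sketch assumes no recursion and no calls through function pointers. Note that when a function calls several others one after the other, only the deepest chain counts, because sibling calls do not occupy the stack at the same time.

```c
#include <stddef.h>

#define MAX_CALLEES 4

/* One entry per function: its own frame size (as reported in a .su file)
 * and the indices of the functions it calls. All values are hypothetical. */
typedef struct {
    const char *name;
    unsigned frame_bytes;
    int callees[MAX_CALLEES];   /* indices into call_graph, -1 terminates */
} func_info_t;

static const func_info_t call_graph[] = {
    /* 0 */ { "main",         64, {  1,  2, -1, -1 } },
    /* 1 */ { "read_sensor",  32, {  3, -1, -1, -1 } },
    /* 2 */ { "log_result",  128, { -1, -1, -1, -1 } },
    /* 3 */ { "spi_xfer",     48, { -1, -1, -1, -1 } },
};

/* Worst-case stack depth starting at function idx: its own frame plus the
 * deepest chain among its callees (a maximum, not a sum over all callees). */
unsigned worst_case_stack(int idx)
{
    unsigned deepest_callee = 0;

    for (size_t i = 0; i < MAX_CALLEES && call_graph[idx].callees[i] >= 0; i++) {
        unsigned depth = worst_case_stack(call_graph[idx].callees[i]);
        if (depth > deepest_callee) {
            deepest_callee = depth;
        }
    }
    return call_graph[idx].frame_bytes + deepest_callee;
}
```

With this table, worst_case_stack(0) follows main -> log_result (64 + 128 bytes) rather than main -> read_sensor -> spi_xfer (64 + 32 + 48 bytes). Interrupt handlers are not part of the call tree and would need a margin on top.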

@ErichStyger How do I turn on the "Image info" view? Also, what are the steps to change the heap scheme from 4 to 2? I tried several things but it doesn't work for me. I also just copied the contents of heap2.c into heap4.c, still with no effect.

myke_predko · ‎03-23-2021

@xmksa

I've just done a quick scan of the FreeRTOS documentation that I have (including "Mastering the FreeRTOS Real Time Kernel") and I'm not sure why you would want to go from the Heap 4 model to the Heap 1 or 2 models. I know I read a note saying that FreeRTOS should not be run in anything less than the Heap 4 model but I can't find the reference right now.

As I understand the models, Heap 1 through Heap 3 allocate memory for function calls within the task BUT range from no freeing of memory (Heap 1) to freeing the memory but not defragmenting freed memory leading to a heap that is a bunch of unusable pieces. The only way to ensure you avoid running out of heap space or fragmenting it to death while running in Heap models 1 through 3 is to never call a function (I think you can use local variables in the task as long as you don't create/destroy any) or use any objects (so C++ is out).

The original question was seeing maximum stack/heap space usage, which @ErichStyger explained how to monitor, but I'm questioning the decision of changing the Heap model, when it is clear you have to do something unnatural to force the change and it's going to result in execution operation that really isn't in your best interest.

When I'm checking my stack/heap usage in a task, I set a breakpoint at the last function call possible (and there may be several) in the task and look at the heap usage - then add 1k to it when I update the allocation to ensure that interrupt handlers don't blow the stack. It's a somewhat labour intensive process, but it does give peace of mind.

myke

ErichStyger · ‎03-23-2021

(I think you can use local variables in the task as long as you don't create/destroy any) or use any objects (so C++ is out).

C++ is using the standard library heap, not the FreeRTOS one. If it shall use a shared one, then the reentrant heap implementation (aka Scheme 6, https://mcuoneclipse.com/2020/11/15/steps-to-use-freertos-with-newlib-reentrant-memory-allocation/ ) shall be used.

Erich

ErichStyger · ‎03-23-2021

Heap Scheme 1 is an excellent choice if there is no need to have memory released again, which is the case for many applications (for example, they only create tasks but never delete them). Scheme 1 is also very efficient and always takes the same time.

Scheme 2 should not be used as it can lead to fragmented memory. It should only be used if blocks of the same size get allocated and deallocated (no 'random' sequence). Better use Scheme 4, which merges adjacent free blocks.
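To illustrate why merging matters, here is a minimal host-side sketch. It is not FreeRTOS code: the block list, the sizes and the function name largest_free are made up for the example, and block header overhead is ignored. It only compares the largest request that can be satisfied when adjacent free blocks are treated separately with the case where they are merged.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t size;   /* block size in bytes */
    bool   free;   /* true if the block has been released */
} block_t;

/* Largest request that could be satisfied from the block list, which is
 * ordered by address. With merge_adjacent == false every free block stands
 * alone; with true, neighbouring free blocks count as one larger block. */
size_t largest_free(const block_t *blocks, size_t n, bool merge_adjacent)
{
    size_t best = 0;
    size_t run = 0;

    for (size_t i = 0; i < n; i++) {
        if (!blocks[i].free) {
            run = 0;   /* an allocated block ends any run of free blocks */
            continue;
        }
        run = merge_adjacent ? run + blocks[i].size : blocks[i].size;
        if (run > best) {
            best = run;
        }
    }
    return best;
}

/* Hypothetical heap state after a few mixed-size alloc/free calls. */
const block_t example_heap[] = {
    { 64, true }, { 32, true }, { 128, false }, { 16, true }, { 48, true },
};
const size_t example_heap_len = sizeof example_heap / sizeof example_heap[0];
```

In this example 160 bytes are free in total, but without merging no request above 64 bytes can be served, while with merging a 96-byte request still fits.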

I believe the motivation to change the heap model was to see the amount of memory allocated (you won't see this directly in the debugger if using scheme 1 or 3). But each scheme has its own overhead, so the numbers are not really comparable. I would recommend the Percepio Tracealyzer as it tracks memory allocation nicely.
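If the goal is only the peak value, another option is to count it in software instead of switching schemes. With heap_4 or heap_5, FreeRTOS provides xPortGetMinimumEverFreeHeapSize(), which reports the lowest amount of free heap seen so far. For the standard library heap, a wrapper along the following lines could do the counting. This is a minimal sketch: the tracked_* names are made up, it counts requested bytes only (not allocator overhead), and it is not reentrant, so calls from several tasks would need a critical section or mutex around them.

```c
#include <stddef.h>
#include <stdlib.h>

/* Header stored in front of every block so the size is known on free.
 * The union pads the header so the user pointer keeps malloc's alignment. */
typedef union {
    size_t size;
    max_align_t align;
} track_hdr_t;

static size_t heap_current;   /* bytes currently allocated via the wrapper */
static size_t heap_peak;      /* highest value heap_current ever reached */

void *tracked_malloc(size_t size)
{
    track_hdr_t *hdr = malloc(sizeof *hdr + size);

    if (hdr == NULL) {
        return NULL;
    }
    hdr->size = size;
    heap_current += size;
    if (heap_current > heap_peak) {
        heap_peak = heap_current;
    }
    return hdr + 1;   /* user data starts right after the header */
}

void tracked_free(void *ptr)
{
    if (ptr == NULL) {
        return;
    }
    track_hdr_t *hdr = (track_hdr_t *)ptr - 1;
    heap_current -= hdr->size;
    free(hdr);
}

size_t tracked_heap_current(void) { return heap_current; }
size_t tracked_heap_peak(void)    { return heap_peak; }
```

A function that allocates and frees again leaves tracked_heap_current() unchanged, but its allocation still shows up in tracked_heap_peak().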

Erich

myke_predko · ‎03-24-2021

Hey @ErichStyger

Thank you for the replies.

I'll have to look at the FreeRTOS heap models again as I don't have a very in-depth understanding of them. This may be an area where FreeRTOS works differently than the other RTOSes I've worked with over the years - I'm very surprised that C++ objects are not taken from the current task's heap as this seems to be the standard method in other OSes.

In the past, I've always cautioned people against finding exactly what a task or thread requires in terms of memory because that too often leads to disaster because of ISRs, RTOS/debugger overhead, etc. that are very difficult to account for and will blow the task's stack and start doing writes that cause very difficult to track down execution problems.

This is why I put in the comment above that I make sure there is a (few hundred byte to 1k) buffer that is added to what's seen in the "Task List". It's a lot less work and mitigates the chance for unexpected problems.

When given the choice I will always choose the option that provides maximum execution security over optimizing memory use/execution speed.

myke

ErichStyger · ‎03-24-2021

Hi @myke_predko ,

I'm very surprised that C++ objects are not taken from the current task's heap as this seems to be the standard method in other OS's.

The memory allocation of C++ objects has no knowledge of the RTOS used and uses the standard library one. The non-referenced objects are created on the current task stack (not heap), as normal variables and structs are. But dynamic/referenced types, or any other objects needing dynamic memory allocation, get allocated through the standard library dynamic memory allocation (malloc and free). So using Scheme 3 would seem to be a good choice for such a scenario. However, because standard malloc and free are not reentrant, this causes a big problem, which gets solved with heap Scheme 6, which makes malloc and free reentrant.

To me understanding the memory allocation scheme used in a system is crucial to have it working correctly, along with keeping reentrancy in mind.

Erich

myke_predko · ‎03-24-2021

Hey @ErichStyger

I always learn something from you.

I guess my error in terms of how C++ objects are stored stems from two things:

My PC C++ programming experience where C++ objects are saved in the process heap.
C++ local object data is stored on the task stack.

In my mind, I was conflating the two.

I just deleted a couple of paragraphs on my thoughts on reentrancy but they can be netted to me (and everyone of my generation) being scarred by the original PC DOS and its lack of reentrancy. From that, I only allow one path to a resource in my RTOS applications and serialize/virtualize resource access as appropriate.

ErichStyger · ‎03-23-2021

Also what are the steps to change the heap scheme from 4 to 2?

I'm using the McuLib port (https://github.com/ErichStyger/McuOnEclipseLibrary ), and there it is done with an extra define:

#ifndef configUSE_HEAP_SCHEME
  #define configUSE_HEAP_SCHEME                   4 /* either 1 (only alloc), 2 (alloc/free), 3 (malloc), 4 (coalesc blocks), 5 (multiple blocks), 6 (newlib) */
#endif /* configUSE_HEAP_SCHEME */

For the port in the NXP SDK, you need to use exclude from build for the other heap files (https://mcuoneclipse.com/2014/07/22/exclude-source-files-from-build-in-eclipse/ ).

Erich

ErichStyger · ‎03-23-2021

@ErichStyger How do I turn on the "Image info" view?

Context menu on the .axf file, then use the Tools menu.

xmksa · ‎03-23-2021

Thank you, I found that option. Is there a way to generate a new .axf file for every new debugging session without deleting it manually?

I have a problem calculating the maximum stack usage of the main function when the sizes of my, let's say, char arrays are not fixed in advance:

char array[500];   /* fixed size */

int x = 100;

char array[5*x];   /* alternative: size computed at run time */

The second form gives me a different value for the function that reserves the space. Is there a way to make it report the same stack usage?

ErichStyger · ‎03-23-2021

The stack calculation is a *static* one. It cannot compute the size if using a VLA (variable-length array), because the array size is only known at run time.

Personally I recommend against using VLA (search for VLA on stackoverflow and you will get good information about it).
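A common alternative to a VLA is a buffer sized for the worst case at compile time plus a run-time bound check. The sketch below assumes 500 bytes is the largest size ever needed; BUF_MAX and process are names made up for the example. Because the frame size no longer depends on n, the .su file produced by -fstack-usage should list the function as static rather than dynamic.

```c
#include <stddef.h>
#include <string.h>

#define BUF_MAX 500u   /* assumed worst-case size, fixed at compile time */

/* Works on the first n bytes of a fixed-size local buffer.
 * Returns the number of bytes processed, or -1 if n exceeds the bound. */
long process(size_t n)
{
    char buf[BUF_MAX];            /* instead of: char buf[n]; */
    long used = 0;

    if (n > sizeof buf) {
        return -1;                /* reject instead of growing the stack */
    }
    memset(buf, 'x', n);          /* stand-in for the real work on buf */
    for (size_t i = 0; i < n; i++) {
        if (buf[i] == 'x') {
            used++;
        }
    }
    return used;
}
```

The cost is that the function always reserves BUF_MAX bytes, even for small n, which is usually the price for a predictable stack budget.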

Erich

ErichStyger · ‎03-23-2021

Is there a way to generate a new .axf file for every new debugging session without deleting it manually?

Why do you want to do this? The Image info is the same if you just re-link the application. But you could add a custom script as a pre-build step to touch a file or delete the .axf, then have the debugger run a build first, see https://mcuoneclipse.com/2012/10/30/speeding-up-the-debug-launch-in-codewarrior/

Here is how to set up the pre-build step for MCUXpresso IDE: https://mcuoneclipse.com/2021/03/24/touch-build-auto-update-of-firmware-date-and-time/

Erich

mjbcswitzerland · ‎03-22-2021

Hi

For heap use I think that you will need to look into the library used since I doubt that the IDE will be able to monitor such things.

Users of the uTasker project have a function that allows the state of stack and heap utilisation to be monitored at run time. Here is what the output looks like:

System memory use:
==================
Free heap = 0x6bf8 from 0xfc00
Unused stack = 0x00004721 (0x00005326)

uCalloc:
Size: 0x00003000
Max. objects: 50
Hole size limit: 0x1000
Allocated memory: 0x00000000 bytes with 0 holes
Allocs: 4 / De-Allocs: 4
Memory margin: 0x00002bc8
Max. used objects: 4

It uses stack margin monitoring, which shows the worst case (lowest safety margin) encountered during all operation. It also has a heap that is used for dynamic allocation of memory that won't need to be returned (zero overhead), from which its optional calloc() takes a defined amount for its own use. In the case above a memory stick was inserted and removed: the USB host stack allocated 4 chunks and gave them all back, so everything was returned without any holes, and the peak utilisation still left a safety margin of 0x2bc8 bytes.

Based on such information from systems that have been running for a certain amount of time, during which the highest loading has been tested, the actual peak utilisations and safety margins become very clear. This allows the memory utilisation to be optimised further to achieve the highest performance while remaining within safe utilisation limits.

In very limited systems this information is critical to be sure of adequate safety margins in worst case loading situations.
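Stack margin monitoring of this kind is commonly done by "painting" the stack. The following is a generic sketch of that technique, not the uTasker implementation: the function names and the 0xA5 fill value are assumptions for the example, and on a real target the region would be the memory between the bottom of the stack and the current stack pointer, whose symbols depend on the linker script.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAINT_BYTE 0xA5u   /* assumed fill pattern */

/* Fill a not-yet-used stack region with a known pattern. */
void stack_paint(uint8_t *region, size_t len)
{
    memset(region, PAINT_BYTE, len);
}

/* Count pattern bytes still intact at the low-address end of the region.
 * On a descending stack (as on Cortex-M) that end is reached last, so the
 * count is the smallest margin seen since stack_paint() was called. */
size_t stack_unused(const uint8_t *region, size_t len)
{
    size_t n = 0;

    while (n < len && region[n] == PAINT_BYTE) {
        n++;
    }
    return n;
}
```

The result is an estimate: stack space that was reserved but never written, or data that happens to equal the fill pattern, makes the margin look slightly larger than it was.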

Regards

Mark
[uTasker project developer for Kinetis and i.MX RT]