Newlib + FreeRTOS thread safety

v_canoz
Contributor III

Hello,

We are using FreeRTOS + Newlib + C++ on several NXP MCUs. After a few random hard faults in a new project, we've discovered that the Newlib version provided by NXP is not thread safe!

We've tried Dave Nadler's solution shipped with MCUXpresso v11.2.1, using heap_useNewlib.c for heap management. We also set configUSE_NEWLIB_REENTRANT to 1 in the FreeRTOSConfig.h file.
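For readers following along, the setting mentioned above would look roughly like the fragment below. It is only a fragment, not a complete FreeRTOSConfig.h. With the option set to 1, FreeRTOS allocates a newlib struct _reent per task and switches _impure_ptr on each context switch. The dynamic allocation line is an assumption about the rest of the project, not something stated in the post.

```c
/* FreeRTOSConfig.h (fragment only) */

/* Give every task its own newlib reentrancy structure (struct _reent). */
#define configUSE_NEWLIB_REENTRANT        1

/* Assumption: dynamic allocation is enabled, since heap_useNewlib.c
 * routes the FreeRTOS heap calls to newlib's malloc/free. */
#define configSUPPORT_DYNAMIC_ALLOCATION  1
```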

Despite this, we still have random hard faults, for example in the __ssprint_r newlib function.

Can NXP provide a thread safe example of FreeRTOS used with Newlib? Do you think __ssprint_r has a bug, or is it a thread safety related problem?

8 Replies

v_canoz
Contributor III

Hello,

Thanks for your answers.

As a reminder, we are using heap_useNewlib.c for heap management. We also set configUSE_NEWLIB_REENTRANT to 1 in the FreeRTOSConfig.h file. The Newlib binaries used are Newlib-none (NOT nano).

We wrote a simple test bench to check whether vsnprintf is thread safe. We tried to read the newlib sources from here, but since we don't know what flags NXP uses, we don't know exactly which implementation MCUXpresso uses. (By the way, can NXP provide the Newlib sources with their own settings? This should be open source, shouldn't it?)

The test bench is available here. We have a thread-safe task that we instantiate 20 times with different priority levels in order to force context switches at runtime. In the task loops, we just run vsnprintf with some integers into a buffer. This test may not represent a real use case, but I sleep better when I know my programs do not contain big random flaws.

This test bench almost instantly generates a hard fault. However, if I replace vsnprintf with my custom-made int-to-string function, it works flawlessly!

So my conclusion is that vsnprintf is NOT thread safe at all!
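For anyone who wants to try something similar, here is a minimal host-side sketch of the kind of test described above. It is not the linked test bench: it assumes POSIX threads instead of FreeRTOS tasks, does not set task priorities, and the names run_stress, worker, format_into and uint_to_string are made up for illustration. Each worker formats integers with vsnprintf into a stack-local buffer and compares the result with a hand-written conversion.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WORKER_COUNT 20     /* same count as the tasks in the post */
#define ITERATIONS   2000u  /* hypothetical loop count per worker */

/* Reference conversion that avoids the printf family entirely. */
static void uint_to_string(unsigned value, char *out)
{
    char tmp[16];
    int i = 0;
    do {
        tmp[i++] = (char)('0' + value % 10u);
        value /= 10u;
    } while (value != 0u);
    while (i > 0) {
        *out++ = tmp[--i];   /* digits were produced in reverse order */
    }
    *out = '\0';
}

/* Thin variadic wrapper so that vsnprintf itself is exercised. */
static int format_into(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int written;
    assert(len > 0);
    va_start(args, fmt);
    written = vsnprintf(buf, len, fmt, args);
    va_end(args);
    return written;
}

/* Worker body: returns NULL on success, non-NULL on the first mismatch. */
static void *worker(void *arg)
{
    unsigned id = (unsigned)(uintptr_t)arg;
    char actual[16];
    char expected[16];
    for (unsigned i = 0; i < ITERATIONS; ++i) {
        unsigned value = id * 100000u + i;
        format_into(actual, sizeof actual, "%u", value);
        uint_to_string(value, expected);
        if (strcmp(actual, expected) != 0) {
            return (void *)(uintptr_t)1;
        }
    }
    return NULL;
}

/* Starts all workers, joins them, and returns how many failed. */
int run_stress(void)
{
    pthread_t threads[WORKER_COUNT];
    size_t started = 0;
    int failures = 0;
    for (; started < WORKER_COUNT; ++started) {
        if (pthread_create(&threads[started], NULL, worker,
                           (void *)(uintptr_t)started) != 0) {
            failures++;      /* could not start this worker */
            break;
        }
    }
    for (size_t i = 0; i < started; ++i) {
        void *result = NULL;
        if (pthread_join(threads[i], &result) != 0 || result != NULL) {
            failures++;
        }
    }
    return failures;
}
```

Calling run_stress() from main() and checking for a zero return value is enough to run it. On a target, the same worker body could be handed to FreeRTOS tasks, and a corrupted string or a fault would then point at the formatting path rather than at the application code.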

I would be so grateful if you guys (@myke_predko and @ErichStyger) could corroborate our findings. Unless I am mistaken, I really think one should never use the printf family of functions in MCUXpresso with FreeRTOS. But now I wonder: what other newlib functions are broken too?

Many thanks,
Victor

myke_predko
Senior Contributor III

Hi @v_canoz 

Honestly, I don't have the time to create a new application to test NewLib/printf reentrancy in FreeRTOS; it sounds like you've done a good job on your own.

However, if I can make a few comments:

  1. Based on what you've written here, it sounds like you've found a bug in the float-to-string functionality. I suggest that you join the Newlib mailing list and report the bug.
  2. As I said before, it's very difficult to make IO functions reentrant. I always recommend that the developer provide either serialization or virtualization code for access to an IO method if it is going to be accessed by more than one task. The ideal is to not have more than one task call an IO method and the hardware it accesses (then you don't have to worry about serializing or virtualizing the IO method).
  3. I use printfs constantly in my FreeRTOS applications without any problems. But I follow a fairly rigorous set of rules for their use, namely a printf at the start of each task so that I can see the task has started, a printf when an execution error is detected, and printfs fairly liberally in the application execution task (which performs no IO).

ALL printf statements are encased in "#if (2 != SDK_DEBUGCONSOLE)"/"#endif" so that when I want to release the code I can disable all of them by changing the DEBUGCONSOLE build variable.   
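A rough sketch of what serialized, compile-out-able printing could look like is below. It is written against POSIX threads so that it runs on a host; on FreeRTOS the lock would instead be a mutex created with xSemaphoreCreateMutex() and taken/given with xSemaphoreTake()/xSemaphoreGive(). The helper name debug_print and the fallback value for SDK_DEBUGCONSOLE are assumptions for illustration, not part of the SDK.

```c
#include <assert.h>
#include <pthread.h>
#include <stdarg.h>
#include <stdio.h>

#ifndef SDK_DEBUGCONSOLE
#define SDK_DEBUGCONSOLE 1   /* assumed fallback so the host sketch prints */
#endif

/* One lock shared by every caller of debug_print(). */
static pthread_mutex_t print_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: serialized printf that returns the character count,
 * or 0 when printing has been compiled out. */
int debug_print(const char *fmt, ...)
{
#if (2 != SDK_DEBUGCONSOLE)
    va_list args;
    int written;

    assert(fmt != NULL);
    pthread_mutex_lock(&print_lock);    /* only one caller formats at a time */
    va_start(args, fmt);
    written = vprintf(fmt, args);
    va_end(args);
    pthread_mutex_unlock(&print_lock);
    return written;
#else
    (void)fmt;                          /* printing disabled for release builds */
    return 0;
#endif
}
```

A task would call debug_print("task %d started\n", 3) instead of printf directly. Because the helper takes a lock, it is not suitable for interrupt service routines.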

My placement of printfs is designed to not affect the timing of IO operations (unless there is an error, at which point I don't care and then causing an execution fault is not necessarily a bad thing) so that I can run with or without them and not worry how the application performs.  I've been using NewLib for 7+ years, first with MQX and then with FreeRTOS without any issues (although I haven't been using it with floats).  

I detailed how I use printfs in FreeRTOS because I'm not sure how you are putting printfs in your code. From what you've written, it sounds like you put them everywhere, which means that you're going to have a hodgepodge of messages piling up on the screen that can be very difficult to decipher when you have a problem to resolve.

I would suggest that you be somewhat more strategic in how you use printfs in your code and reentrancy should not be an issue (although you still have the float issue to deal with).  

Good luck,

myke

converse
Senior Contributor V

As detailed in the SoftwareContentRegister.txt file provided in LPCXpresso, NXP take their GCC directly from the ARM Embedded Toolchain project

https://developer.arm.com/tools-and-software/open-source-software/developer-tools/gnu-toolchain/gnu-...

This project provides full source code downloads, build instructions (including all of the settings used). You just need to match up the version you are using with the appropriate release. 

This is what it says in v11.1.1 (the version I am using)

GNU Tools for ARM Embedded Processors
A GNU toolchain built by ARM
Licenses: GPL-2.0, GPL-3.0, LGPL-2.0, LGPL-2.1, LGPL-3.0, GPL-2.0-with-GCC-exception, BSD-3-Clause, and further licences (see below)
Further Licenses: The Newlib and Newlib-nano packages within this component are subject
to a collection of licenses, listed in the License File. The Expat package is
subject to a license whose text is given in the License File.
License File: ./ide/plugins/com.nxp.mcuxpresso.tools.*/tools/share/doc/gcc-arm-none-eabi/license.txt
Distribution Type: Source & Binary
Version: 8-2019-q3-update
plus gdb 7.12 binaries taken from 6_3-2017q2 (and labelled "-712")
Location: ./ide/plugins/com.nxp.mcuxpresso.tools.*/tools
Website: https://developer.arm.com/open-source/gnu-toolchain/gnu-rm
Source URL: https://developer.arm.com/open-source/gnu-toolchain/gnu-rm/downloads

 

myke_predko
Senior Contributor III

@v_canoz 

Could you provide more context as to your problems?  What happens if you #ifdef out the newlib method calls?  

I wanted to respond to the comments made by @ErichStyger: the newlib __ssprint_r code is not reentrant (or "thread safe"). I took a quick look at the vfprintf.c source code and there is no check to see if the method is already running. I'm not surprised at this, as making IO library code reentrant is extremely difficult and most authors leave it as an exercise for the application developers (i.e. you).

I should point out that it's extremely unlikely that you're going to have reentrancy problems with printf statements unless you've littered them *everywhere*, including interrupt service routines (which you should never do).

Having said that, I do agree with @ErichStyger that your problems are most likely of your own making and a good place to look is your stack size.  Erich quoted that you should have 50 bytes extra for each task and I would consider that the bare minimum, especially if interrupts are being used in your application.  For very high reliability applications, I have calculated the worst case for the task (context registers + local variables (for all called methods)  + number of method calls * size of data), but when it comes right down to it, I specify a lot bigger stack than I need (who cares if I use up the device's SRAM) and verify it with the process outlined in Erich's blog post: Tutorial: Using Runtime Statistics with Amazon FreeRTOS V10 
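The worst-case estimate described above can be sketched as simple arithmetic. The function names and every number used with them are hypothetical; the only FreeRTOS-specific point relied on is that xTaskCreate() takes its stack depth in words rather than bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Worst-case estimate following the formula in the post, plus a spare margin:
 * context registers + locals of all called functions + calls * per-call data. */
size_t worst_case_stack_bytes(size_t context_bytes, size_t locals_bytes,
                              size_t call_count, size_t bytes_per_call,
                              size_t margin_bytes)
{
    return context_bytes + locals_bytes
         + call_count * bytes_per_call + margin_bytes;
}

/* Convert a byte count to stack words, rounding up. */
size_t stack_bytes_to_words(size_t bytes, size_t word_size)
{
    assert(word_size > 0);
    return (bytes + word_size - 1) / word_size;
}
```

With made-up inputs of 68 bytes of context, 200 bytes of locals, 6 calls at 16 bytes each and a 50 byte margin, this gives 414 bytes, or 104 four-byte words. On the target, uxTaskGetStackHighWaterMark() (enabled with INCLUDE_uxTaskGetStackHighWaterMark set to 1) reports the smallest amount of free stack a task has had, which is a useful check against such an estimate.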

Good luck,

myke

v_canoz
Contributor III

[edit] moved post, can be deleted

ErichStyger
Senior Contributor V

I have been using Dave's wrappers for more than 3 years (see https://mcuoneclipse.com/2017/07/02/using-freertos-with-newlib-and-newlib-nano/) and have not seen any reentrancy issue with it.

If you get hard faults, it is probably more because of stack overflows or inadequate buffers: I recommend you turn on Code Analysis (see https://mcuoneclipse.com/2013/01/06/free-static-code-analysis-with-eclipse/) which is able to catch some application bugs plus turn on stack analysis https://mcuoneclipse.com/2015/08/21/gnu-static-stack-usage-analysis/.

Have a look as well at the FreeRTOS task view (stack usage: I always keep around 50 bytes spare space)

I hope this helps,

Erich

v_canoz
Contributor III

Hello Erich,

Thank you for your answer. It's reassuring to know that you have been using it for 3 years.

I'm narrowing down the problem, it seems to be related to the conversion of floats to strings.

I've pasted code here: https://godbolt.org/z/zKqaox that is sufficient to induce a hard fault. In particular, std::to_string(some_float) is the culprit.

Is it a known issue?

Many thanks,
Victor

v_canoz
Contributor III

[edit] I still have random crashes in __ssprint_r even if I am using integers (no floats). I've pasted the disassembly below. The last instruction fetched before the fault is the final one listed (0x17838). I'm using newlib-nohost and the optimisation flags are set to -O3.

__ssprint_r:
00017790: ldr r3, [r2, #8]
00017792: stmdb sp!, {r4, r5, r6, r7, r8, r9, r10, r11, lr}
00017796: mov r9, r2
00017798: sub sp, #12
0001779a: cmp r3, #0
0001779c: beq.n 0x1788c <__ssprint_r+252>
0001779e: ldr r7, [r2, #0]
000177a0: mov r8, r0
000177a2: mov r4, r1
000177a4: ldr r0, [r1, #0]
000177a6: adds r7, #8
000177a8: ldr r5, [r1, #8]
000177aa: b.n 0x17844 <__ssprint_r+180>
000177ac: ldrh.w r12, [r4, #12]
000177b0: tst.w r12, #1152 ; 0x480
000177b4: beq.n 0x17820 <__ssprint_r+144>
000177b6: ldrd r1, r2, [r4, #16]
000177ba: adds.w r2, r2, r2, lsl #1
000177be: sub.w r5, r0, r1
000177c2: it mi
000177c4: addmi r2, #1
000177c6: adds r0, r5, r6
000177c8: mov.w r11, r2, asr #1
000177cc: adds r0, #1
000177ce: cmp r0, r11
000177d0: mov r2, r11
000177d2: bls.n 0x177d8 <__ssprint_r+72>
000177d4: mov r11, r0
000177d6: mov r2, r0
000177d8: tst.w r12, #1024 ; 0x400
000177dc: str r3, [sp, #4]
000177de: beq.n 0x17858 <__ssprint_r+200>
000177e0: mov r1, r2
000177e2: mov r0, r8
000177e4: bl 0x15fa4 <_malloc_r>
000177e8: ldr r3, [sp, #4]
000177ea: mov r10, r0
000177ec: cmp r0, #0
000177ee: beq.n 0x1786e <__ssprint_r+222>
000177f0: mov r2, r5
000177f2: ldr r1, [r4, #16]
000177f4: str r3, [sp, #4]
000177f6: bl 0x16540 <memcpy>
000177fa: ldrh r2, [r4, #12]
000177fc: ldr r3, [sp, #4]
000177fe: bic.w r2, r2, #1152 ; 0x480
00017802: orr.w r2, r2, #128 ; 0x80
00017806: strh r2, [r4, #12]
00017808: add.w r0, r10, r5
0001780c: sub.w r2, r11, r5
00017810: str.w r10, [r4, #16]
00017814: mov r5, r6
00017816: mov r10, r6
00017818: str r0, [r4, #0]
0001781a: str r2, [r4, #8]
0001781c: str.w r11, [r4, #20]
00017820: mov r1, r3
00017822: mov r2, r10
00017824: bl 0x1662c <memmove>
00017828: ldr r0, [r4, #8]
0001782a: ldr.w r1, [r9, #8]
0001782e: ldr r3, [r4, #0]
00017830: subs r5, r0, r5
00017832: subs r6, r1, r6
00017834: add.w r0, r3, r10
00017838: str r5, [r4, #8]

The more tasks I run in parallel, the more likely it is to produce a hardfault quickly. 
