Bug Report: In MQX 4.1 change of size of bool can cause stack corruption

cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 

Bug Report: In MQX 4.1 change of size of bool can cause stack corruption

1,794 Views
chrissolomon
Contributor III

HI,

I have been tracking down an issue causing unpredictable crashes - bus faults, mem manage faults, hard faults & usage faults.

After 3 days debugging with profiling tools and debuggers we have finally tracked the problem down:

In MQX 4.0.x a boolean type was an alias of an unsigned long  (32bits).

In MQX 4.1.x a bool is a C99 standard type, and is guaranteed to be 8 bits on all platforms.

The specific issue that I have found is that in _io_fstatus a bool (result) is created on the stack, and then a pointer to this bool is passed into the IOCTL for the device in question.

In _io_cdc_serial_ioctl the pointer is cast to an _mqx_int * (32 bit) and then the address is set to true or false, which wipes 3 bytes of stack above the bool.

The fix is simple in this case (changes to io_fstat.c):

@@ -50,7 +50,7 @@

    )

{ /* Body */

    IO_DEVICE_STRUCT_PTR   dev_ptr;

-   bool                result;

+   uint32_t               result;

#if MQX_CHECK_ERRORS

    if (file_ptr == NULL) {

@@ -64,7 +64,7 @@

       dev_ptr = file_ptr->DEV_PTR;

       if (dev_ptr->IO_IOCTL != NULL) { 

          (*dev_ptr->IO_IOCTL)(file_ptr, IO_IOCTL_CHAR_AVAIL, &result);

-         return(result);

+         return (bool)(result);

       } /* Endif */

    } /* Endif */

    return (FALSE);

I haven't completed my review of all the other places where bools are used, but this could be a problem in other places as well.

Chris

Tags (2)
0 Kudos
Reply
7 Replies

968 Views
matthewkendall
Contributor V

I am wondering why you fixed _io_fstatus (to go back to using a uint32_t) rather than fixing _io_cdc_serial_ioctl (to de-reference the pointer as a bool)?

I found an instance of the opposite when I upgraded to MQX 4.1. IO_IOCTL_CHAR_AVAIL (associated with serial ports) previously de-referenced the passed pointer as a 32-bit type and now de-references it as a bool, with the result that if you pass in a pointer to a 32-bit int only the bottom 8 bits will be modified. If the other bits happen to be non zero you get the wrong result. I chose to fix this by modifying the code that calls ioctl() to pass a pointer to a bool in this case.

The larger issue here is that the documentation for the use of ioctl() is poor. For each IOCTL the documentation should specify exactly how the third argument (prototyped simply as pointer-to-void) is treated. But for example MQXIOUG section 5.7 (Serial Device Families - I/O Control Commands) just lists the IOCTL commands in a table and you are left to guess how the third parameter is treated. Until now it seems it was almost always treated as pointer to 32-bit int, but that is not actually documented. Now it seems that some uses that return boolean results treat it as pointer to bool, but again I don't see that documented anywhere. You are pretty much left to guess, or examine the source code for the driver in question.

0 Kudos
Reply

968 Views
chrissolomon
Contributor III

Hi,

I was trying to avoid other problems - I think you're right, changing the _io_cdc_serial_ioctl would have been the correct fix, strictly speaking, but if other devices are still treating the parameter as a uint32_t * then they would produce the same issue.


This way, if the device is fixed it is fine, and if it isn't it should still work, although to be sure it works you would have to initialize result to zero. (Which I missed, so thanks for making me re-examine the fix.)

Chris

0 Kudos
Reply

968 Views
DavidS
NXP Employee
NXP Employee

Hi Guys,

Did either of you submit this as error?

If not I will.  Just let me know.

Regards,

David

0 Kudos
Reply

968 Views
matthewkendall
Contributor V

Does posting to this forum not count as submitting it? I assumed it did. I have not submitted it anywhere else.

I would be surprised if the two issue that Chris and I found were the only ones. I would expect there to be other instances of problems caused by the change to sizeof(bool), particularly with ioctl() since the treatment of the parameter is poorly documented.

0 Kudos
Reply

968 Views
DavidS
NXP Employee
NXP Employee

Hi Matthew,

Good suggestion for update to Community site would be to have a "submit errata" button.

But until then...I'll submit this to ensure MQX Development team scrubs the total code base.

Thank you Matthew and Chris.

Regards,

David

0 Kudos
Reply

968 Views
RadekS
NXP Employee
NXP Employee

Thank you for your bug report. You are right that there is issue minimally in _io_cdc_serial_ioctl.

You are right that some chapters of IO User Guide don’t help you in this case.

One of our MQX designers currently works on checking all IOCTL commands and documentation update.

We didn’t found any other IOCTL command with this issue till now – but we are just on beginning.


Have a great day,
RadekS

-----------------------------------------------------------------------------------------------------------------------
Note: If this post answers your question, please click the Correct Answer button. Thank you!
-----------------------------------------------------------------------------------------------------------------------

0 Kudos
Reply

968 Views
RadekS
NXP Employee
NXP Employee

We identified another IOCTL command with similar problem in bool type:

NANDFLASH_IOCTL_WRITE_PROTECT

Line:

if (NANDFLASHERR_NO_ERROR == (*handle_ptr->WRITE_PROTECT)(handle_ptr, *(_mqx_uint_ptr)param_ptr)) {

should be replaced by:

if (NANDFLASHERR_NO_ERROR == (*handle_ptr->WRITE_PROTECT)(handle_ptr, *(bool*)param_ptr)) {

It will be fixed in next MQX release.

0 Kudos
Reply