Performing a core/stack dump on a Cortex M4 MCU

larsl
Contributor II

Hi!

My company is working with an embedded system (bare-metal, no OS) that is deployed at a very remote location. About once a month, the system shows signs of starvation: the interrupts run, but the background loop does not, so something has gone wrong. Since hooking up a debugger and running a debug session of more than 30 days really isn't an option, we've decided to write a core dump routine that writes the RAM contents to flash for later retrieval.

We already have support for detecting the starvation (currently we just reset the MCU), and we have support for writing to flash. So the real questions are: how should we go about dumping the RAM, and what is the best way to interpret the dump? We've never written any debug routines like this before, and therefore feel quite lost about where to start...

Kind regards

Lars

1 Solution
larsl
Contributor II

OK, I pretty much solved it, so I'll leave this here for anyone who might be interested.

1) Trigger an NMI by setting the NMIPENDSET bit (bit 31) in the SCB_ICSR register.

2) In the NMI handler, write an assembly chunk which does this:

  • Push all registers of interest onto the stack.
  • Find out whether the SP in use came from MSP or PSP, and store it somewhere appropriate (this is the only thing you need in order to restore your dump).
  • Write RAM to flash.
  • Breakpoint! (This will be your restore point as well.)
  • Pop the saved registers from the stack before returning from the NMI handler.
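The first two points above can be sketched in C. This is only an illustration under stated assumptions, not the code from the post: `pend_nmi`, `frame_pointer` and `stacked_frame_t` are hypothetical names, and the ICSR address is passed in as a pointer so the logic can also be tried on a host. The constants (ICSR at 0xE000ED04, NMIPENDSET in bit 31, EXC_RETURN bit 2 selecting MSP or PSP) are the ARMv7-M architectural values.

```c
#include <stdint.h>

/* ARMv7-M architectural constants */
#define SCB_ICSR_ADDR     0xE000ED04u   /* Interrupt Control and State Register */
#define ICSR_NMIPENDSET   (1u << 31)    /* writing 1 pends the NMI */
#define EXC_RETURN_SPSEL  (1u << 2)     /* 0: frame on MSP, 1: frame on PSP */

/* Registers the core pushes by itself on exception entry (basic frame).
 * R4-R11 are not stacked automatically and must be saved by the handler.
 * If an FPU context is active (EXC_RETURN bit 4 clear) the frame is longer. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} stacked_frame_t;

/* Pend the NMI. On target this would be called as
 * pend_nmi((volatile uint32_t *)SCB_ICSR_ADDR). A plain write is enough,
 * because writing 0 to the other ICSR bits has no effect. */
static void pend_nmi(volatile uint32_t *icsr)
{
    *icsr = ICSR_NMIPENDSET;
}

/* Given the EXC_RETURN value found in LR on handler entry, pick the stack
 * pointer that holds the exception frame of the interrupted code. */
static uint32_t frame_pointer(uint32_t exc_return, uint32_t msp, uint32_t psp)
{
    return (exc_return & EXC_RETURN_SPSEL) ? psp : msp;
}
```

In a real handler, a few lines of assembly would read LR, MSP and PSP on entry (before any C prologue touches the stack) and hand them to a helper like `frame_pointer`.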

Now, to restore:

1) Retrieve the dump in whatever manner suits you.

2) To restore the dump, trigger an NMI from GDB while debugging the code; you will hit the breakpoint in the NMI assembly chunk.

3) Load RAM dump with GDB's restore command.

4) Set the stack pointer via GDB.

5) Step out of the NMI handler; you should now have restored the dumped session.

Remember, though, that this process is not foolproof. You will have a hard time restoring every register needed to put the MCU into exactly the same state as before the dump, since many registers are read-only and restoring them is not straightforward.

trytohelp
NXP Employee

Hi Lars,

I think we need more information.

  - Which version of the development tool are you using?

  - Do you have a Tower board or another eval board, or is it your own board?

  - Which debug interface are you using?

CW version used:

  Under CodeWarrior IDE (classic)

      Start the IDE and click on Help | About Freescale CodeWarrior.

      Click on Installed Products 

      Provide us all info displayed.

      Or you can save them in a txt file.

  Under Eclipse IDE

    1-      Start Eclipse and click on Help | Freescale Licenses

      The Status column gives the status of the license.

      Under Product, select it and click on details.

      A new dialog shows up giving license details.

      Provide us all info displayed

    2-      Start Eclipse and click on Help | About CodeWarrior Development Studio

      Under Installed Products, you will see the version used.

If Installed Products is not available in an older version, you should find the information in the welcome.txt file under the installation folder.


Have a great day,
Pascal Irrle

larsl
Contributor II

Hi Pascal,

We use Eclipse, and have our own custom-made PCBs. We are running MK10FX512 MCUs, which are hooked up to Segger J-Link debuggers at the office. However, I don't really see the need for more information on our tools; I am looking more for a programmatic approach to doing this. What I've come up with so far is:

1) Raise an NMI exception; the processor automatically pushes part of the core register set (R0-R3, R12, LR, PC and xPSR) onto the stack.

2) After retrieving all core registers, write all of RAM to flash, with the registers stored at a specific location.

3) After the system has frozen, we can attach a debugger and collect the data stored in flash.

4) Since registers were stored in a specific spot, we can pull those out first.

5) Retrieve RAM stored in Flash, and construct an SREC.

6) At the office, with a devboard, load SREC onto unit.

7) When RAM has been loaded, set processor registers from dump as well.

8) Debug!
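Steps 2 to 4 suggest a dump image with the registers at a fixed location in front of the RAM contents. A minimal sketch of such a layout follows. Everything named here (`dump_header_t`, `DUMP_MAGIC`, `flash_write_fn`, `dump_ram`) is hypothetical and not taken from the thread, and `fake_flash_write` is only a RAM-backed stand-in so the logic can be tried on a host. A real flash driver would also need erased sectors and writes aligned to the device's programming unit.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DUMP_MAGIC 0x504D5544u  /* hypothetical marker: "DUMP" as little-endian bytes */

/* Hypothetical fixed-size header stored in front of the RAM image. */
typedef struct {
    uint32_t magic;       /* DUMP_MAGIC when a dump is present */
    uint32_t sp;          /* stack pointer of the interrupted code */
    uint32_t exc_return;  /* LR on NMI entry, tells MSP from PSP */
    uint32_t ram_start;   /* target address of the first dumped byte */
    uint32_t ram_len;     /* number of RAM bytes following the header */
    uint32_t reserved;    /* pads the header to 24 bytes */
} dump_header_t;

/* Hypothetical driver hook: program len bytes at flash offset off, 0 on success. */
typedef int (*flash_write_fn)(uint32_t off, const void *src, size_t len);

/* Write the header first, then the RAM image in chunks of at most chunk bytes.
 * On target, ram would be (const uint8_t *)hdr->ram_start. */
static int dump_ram(flash_write_fn wr, const dump_header_t *hdr,
                    const uint8_t *ram, size_t chunk)
{
    if (chunk == 0) return -1;
    if (wr(0, hdr, sizeof *hdr) != 0) return -1;
    uint32_t off = (uint32_t)sizeof *hdr;
    size_t left = hdr->ram_len;
    while (left > 0) {
        size_t n = left < chunk ? left : chunk;
        if (wr(off, ram, n) != 0) return -1;
        off += (uint32_t)n;
        ram += n;
        left -= n;
    }
    return 0;
}

/* Host-side stand-in for a flash driver, for experimenting only. */
static uint8_t fake_flash[256];

static int fake_flash_write(uint32_t off, const void *src, size_t len)
{
    if (off > sizeof fake_flash || len > sizeof fake_flash - off) return -1;
    memcpy(fake_flash + off, src, len);
    return 0;
}
```

Because the header has a fixed size and comes first, the retrieval side can read the saved stack pointer and EXC_RETURN value before touching the RAM image.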

Does this seem like a reasonable approach? I feel that the RAM and the core registers are all that is needed to do this, but maybe I'm missing something?
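For step 5, each chunk of retrieved RAM can be turned into an S3 record (the S-record type with a 32-bit address). The sketch below is an assumption about how that could be done, not code from the thread; `srec_s3` is a hypothetical helper. The checksum is the ones' complement of the low byte of the sum of the count, address and data bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format one S3 record for len data bytes at addr into out.
 * Returns the string length, or -1 if len is too large or out is too small. */
static int srec_s3(char *out, size_t out_size, uint32_t addr,
                   const uint8_t *data, size_t len)
{
    /* The count byte covers 4 address bytes, the data and 1 checksum byte,
     * so at most 250 data bytes fit. Output needs 15 + 2*len chars with NUL. */
    if (len > 250 || out_size < 15 + 2 * len) return -1;

    unsigned count = (unsigned)len + 5;
    unsigned sum = count + (addr >> 24) + ((addr >> 16) & 0xFF)
                 + ((addr >> 8) & 0xFF) + (addr & 0xFF);

    int pos = snprintf(out, out_size, "S3%02X%08lX", count, (unsigned long)addr);
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
        pos += snprintf(out + pos, out_size - (size_t)pos, "%02X",
                        (unsigned)data[i]);
    }
    /* Checksum: ones' complement of the low byte of the running sum. */
    pos += snprintf(out + pos, out_size - (size_t)pos, "%02X", (~sum) & 0xFFu);
    return pos;
}
```

For example, the four bytes 01 02 03 04 at the illustrative address 0x20000000 give `S3092000000001020304CC`. A complete file would normally add an S0 header record and an S7 termination record around the data records.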

trytohelp
NXP Employee

Hi Lars,

Thanks for the info.

Good to know you've solved the problem.

Following your feedback (September 30th), I was contacted and moved the question to the Kinetis Team.

According to the database, the status of this issue still appears as Open.

I will forward them your solution.

Thanks a lot for your expertise and feedback.

Have a great day,
Pascal Irrle
