Kinetis K10 restarts on its own

1,271 Views
peterkrause
Contributor I

Hello.

I'm currently developing an application using a Kinetis MK10DX256VLH7 (CodeWarrior 10.6). I have a big problem with spontaneous restarts of the MCU. Sometimes it runs for many hours before it restarts, and sometimes it only runs for a few minutes or less. I can't trigger the restart deliberately, so I can't debug it.

What can I do to find the restart source or problem?

Thanks.

0 Kudos
9 Replies
1,010 Views
mjbcswitzerland
Specialist V

Hello Peter

It is important to know whether you have the watchdog enabled or not. If enabled it is probably due to a watchdog reset, which could have two reasons:

1. The code is failing and not retriggering the watchdog
2. The watchdog servicing has an error and causes a random reset

Note that you can read the last cause of a reset from the chip to see what the cause was (including due to undervoltage).

In case 1 it is useful to disable the watchdog, connect the debugger and let the processor run until it fails. Typically the code will be spinning in a loop (or an interrupt) and this is then visible by pausing with the debugger.

In case 2 a typical reason is that the watchdog retrigger sequence is not protected against interrupts: an interrupt arriving during the retrigger sequence will cause the watchdog to fire. The solution is to disable interrupts before performing the watchdog retrigger and then re-enable them afterwards.
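As a rough illustration of case 2, here is a minimal sketch of a protected retrigger. It assumes the K-series WDOG, where the refresh is two 16-bit writes (0xA602 then 0xB480) to WDOG_REFRESH that must land within 20 bus clock cycles of each other. The names `wdog_refresh`, `irq_save_and_disable`, `irq_restore` and the build macro `TARGET_CORTEX_M` are made up for this example, and the inline assembly assumes a GCC toolchain, so adapt it to your own project.

```c
#include <stdint.h>

/* Refresh keys for the K-series WDOG_REFRESH register. */
#define WDOG_REFRESH_KEY1 0xA602u
#define WDOG_REFRESH_KEY2 0xB480u

/* Hypothetical helper: mask interrupts and return the previous PRIMASK.
 * TARGET_CORTEX_M is an assumed project-defined macro; without it the
 * helpers compile to no-ops so the logic can be checked on a PC. */
static inline uint32_t irq_save_and_disable(void)
{
#if defined(TARGET_CORTEX_M)
    uint32_t primask;
    __asm volatile ("mrs %0, primask\n\tcpsid i" : "=r" (primask) : : "memory");
    return primask;
#else
    return 0u;
#endif
}

/* Hypothetical helper: restore the PRIMASK value saved above. */
static inline void irq_restore(uint32_t primask)
{
#if defined(TARGET_CORTEX_M)
    __asm volatile ("msr primask, %0" : : "r" (primask) : "memory");
#else
    (void)primask;
#endif
}

/* Write both refresh keys back to back with interrupts masked, so that
 * no interrupt can stretch the gap between the two writes. */
static inline void wdog_refresh(volatile uint16_t *refresh_reg)
{
    uint32_t state = irq_save_and_disable();

    *refresh_reg = WDOG_REFRESH_KEY1;
    *refresh_reg = WDOG_REFRESH_KEY2;

    irq_restore(state);
}
```

On the target you would pass the address of the refresh register from your device header. Restoring the saved PRIMASK, rather than unconditionally enabling interrupts, keeps the routine safe to call from code that already runs with interrupts masked.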

Regards

Mark

Kinetis: µTasker Kinetis support

K20: µTasker Kinetis TWR-K20D72M support  / µTasker Kinetis FRDM-K20D50M support  / µTasker Kinetis TWR-K20D50M support  / µTasker Teensy3.1 support

For the complete "out-of-the-box" Kinetis experience and faster time to market

1,010 Views
davidgraham
Contributor II

Thanks... #2 was a problem for me.

0 Kudos
1,010 Views
peterkrause
Contributor I

Hello Mark,

the watchdog is not implemented at the moment, I'm currently not using it.

You mentioned that I can read back the last reset source after a reset. How can I do that?

Regards

Peter

0 Kudos
1,010 Views
mjbcswitzerland
Specialist V

Peter

Read RCM_SRS0 or MC_SRSH (depending on which register your chip has - probably the first one) to see the reset cause.
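A small decoding helper can make the register values easier to read in a start-up log. The sketch below assumes a part with the RCM module and uses the flag positions as I understand them from the K-series reference manual (POR, PIN, WDOG, LOL, LOC, LVD, WAKEUP in RCM_SRS0; SACKERR, EZPT, MDM_AP, SW, LOCKUP, JTAG in RCM_SRS1). Please check them against your device header. The function and type names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

typedef struct {
    uint8_t     mask;
    const char *name;
} reset_flag_t;

/* Assumed bit layout of RCM_SRS0; verify against the reference manual. */
static const reset_flag_t srs0_flags[] = {
    { 0x80u, "POR" }, { 0x40u, "PIN" }, { 0x20u, "WDOG" },
    { 0x08u, "LOL" }, { 0x04u, "LOC" }, { 0x02u, "LVD" },
    { 0x01u, "WAKEUP" },
};

/* Assumed bit layout of RCM_SRS1; verify against the reference manual. */
static const reset_flag_t srs1_flags[] = {
    { 0x20u, "SACKERR" }, { 0x10u, "EZPT" }, { 0x08u, "MDM_AP" },
    { 0x04u, "SW" },      { 0x02u, "LOCKUP" }, { 0x01u, "JTAG" },
};

/* Append the name of every flag set in 'value' to buf, separated by
 * spaces. Returns the updated number of characters used. */
static size_t append_flags(uint8_t value, const reset_flag_t *flags, size_t n,
                           char *buf, size_t len, size_t used)
{
    for (size_t i = 0; i < n; i++) {
        if ((value & flags[i].mask) != 0u && used < len) {
            int w = snprintf(buf + used, len - used, "%s%s",
                             (used != 0u) ? " " : "", flags[i].name);
            if (w > 0) {
                used += (size_t)w;
            }
        }
    }
    return used;
}

/* Build a readable list of reset causes from the two register values. */
static void reset_cause_describe(uint8_t srs0, uint8_t srs1,
                                 char *buf, size_t len)
{
    if (len == 0u) {
        return;
    }
    buf[0] = '\0';
    size_t used = append_flags(srs0, srs0_flags,
                               sizeof srs0_flags / sizeof srs0_flags[0],
                               buf, len, 0u);
    (void)append_flags(srs1, srs1_flags,
                       sizeof srs1_flags / sizeof srs1_flags[0],
                       buf, len, used);
}
```

With this layout, a value of 0x02 in RCM_SRS1 decodes to "LOCKUP", which is the core lock-up case, and 0x82 in RCM_SRS0 decodes to "POR LVD". Read both registers as early as possible after reset and pass the values in.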

Regards

Mark

0 Kudos
1,010 Views
peterkrause
Contributor I

Hello Mark,

that looks very good. I hope that the register will give me an idea what's going wrong.

Thanks. I will let you know if I find the problem.

Regards

Peter

0 Kudos
1,010 Views
peterkrause
Contributor I

OK, I now send the contents of the RCM_SRS0 and RCM_SRS1 registers out via UART at device start-up. It's very difficult to get a valid result because sometimes the MCU runs for a very long time. Nevertheless, I managed to capture three resets in the last hour. In all cases it was a Core Lockup Reset.

What is the best way to figure out the source of the lockup?

Regards

Peter

0 Kudos
1,011 Views
mjbcswitzerland
Specialist V

Peter

A lock-up occurs when either an NMI or hard fault takes place but then results in another error that can't be recovered from.

This probably means that you don't have a proper hard fault handler installed so I would add this first. It can be any routine but I prefer to just use:

void __interrupt fnHardFault(void) // __interrupt has no meaning apart from making it clear that it is an interrupt handler
{
}

You can then set a break point in the interrupt handler and wait for it to be called (it will usually be due to code accessing non-existent memory or trying to write to a NULL pointer, or general run-away code).

When the break point is hit switch the debugger to disassemble mode (instruction stepping) and simply step back out of the interrupt and you will see the operation that caused it to fire. Then you can usually see what went wrong.
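If stepping out of the handler is awkward (for example because the fault is rare and no debugger is attached when it happens), an alternative is to copy the exception stack frame somewhere it can be inspected later. On a Cortex-M the core pushes R0-R3, R12, LR, PC and xPSR onto the active stack on exception entry, so the stacked PC points at or near the faulting instruction. This is only a sketch: `fault_frame_t`, `fault_frame_decode`, `hard_fault_capture`, `g_fault_frame` and the build macro `TARGET_CORTEX_M` are hypothetical names, and the assembly wrapper assumes a GCC toolchain.

```c
#include <stdint.h>

/* Registers pushed by the core on exception entry, in stack order. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} fault_frame_t;

/* Last captured frame; inspect it with the debugger or log it. */
volatile fault_frame_t g_fault_frame;

/* Copy the eight stacked words into a named structure. */
static void fault_frame_decode(const uint32_t *stacked,
                               volatile fault_frame_t *out)
{
    out->r0  = stacked[0];
    out->r1  = stacked[1];
    out->r2  = stacked[2];
    out->r3  = stacked[3];
    out->r12 = stacked[4];
    out->lr  = stacked[5];  /* return address of the interrupted function */
    out->pc  = stacked[6];  /* address at or near the faulting instruction */
    out->psr = stacked[7];
}

#if defined(TARGET_CORTEX_M)   /* assumed project-defined macro */
/* Called from the wrapper below with the stack pointer that was in use
 * when the fault was taken. */
__attribute__((used)) void hard_fault_capture(const uint32_t *stacked)
{
    fault_frame_decode(stacked, &g_fault_frame);
    for (;;) {
        /* Set a break point here and look at g_fault_frame.pc. */
    }
}

/* Bit 2 of EXC_RETURN (in LR) tells whether MSP or PSP holds the frame. */
__attribute__((naked)) void fnHardFault(void)
{
    __asm volatile (
        "tst   lr, #4             \n"
        "ite   eq                 \n"
        "mrseq r0, msp            \n"
        "mrsne r0, psp            \n"
        "b     hard_fault_capture \n");
}
#endif
```

Looking up `g_fault_frame.pc` in the map file or disassembly then gives roughly the same information as stepping out of the interrupt by hand.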

In the worst case it will be random code, which means that the fault was the result of a previous fault (in that case trace may be useful for looking back to the initial fault).
If you happen to see what looks like valid code, it can be an indication that you have the Flash clock set too high (max. 25MHz), because this can make program operation unreliable. (In fact I have also seen Flash running slightly below the maximum Flash speed randomly causing faults to occur, but I understand that this was due to poor power supply design in that particular case.)
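The Flash clock can be sanity-checked from the divider setting. As far as I know, on these parts it is MCGOUTCLK divided by (OUTDIV4 + 1), with OUTDIV4 in bits 19..16 of SIM_CLKDIV1. The sketch below assumes that layout, and the function names are invented for the example.

```c
#include <stdint.h>

/* Assumed position of the OUTDIV4 field in SIM_CLKDIV1. */
#define SIM_CLKDIV1_OUTDIV4_SHIFT 16u
#define SIM_CLKDIV1_OUTDIV4_MASK  (0xFu << SIM_CLKDIV1_OUTDIV4_SHIFT)

/* Maximum Flash clock quoted in this thread. */
#define FLASH_CLOCK_MAX_HZ 25000000u

/* Flash clock = MCGOUTCLK / (OUTDIV4 + 1). */
static uint32_t flash_clock_hz(uint32_t mcgoutclk_hz, uint32_t sim_clkdiv1)
{
    uint32_t outdiv4 = (sim_clkdiv1 & SIM_CLKDIV1_OUTDIV4_MASK)
                       >> SIM_CLKDIV1_OUTDIV4_SHIFT;
    return mcgoutclk_hz / (outdiv4 + 1u);
}

/* Non-zero when the resulting Flash clock is within the limit. */
static int flash_clock_ok(uint32_t mcgoutclk_hz, uint32_t sim_clkdiv1)
{
    return flash_clock_hz(mcgoutclk_hz, sim_clkdiv1) <= FLASH_CLOCK_MAX_HZ;
}
```

For example, with a hypothetical 72 MHz MCGOUTCLK, OUTDIV4 = 2 gives 24 MHz, which is inside the limit, while OUTDIV4 = 1 gives 36 MHz, which is not.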

Finally you should look at the watchdog concept since it is usually the first part of a design and not an after-thought in a project.

Regards

Mark

0 Kudos
1,010 Views
peterkrause
Contributor I

Hello Mark,

thanks for your detailed information.

I hadn't implemented a special hard fault handler yet. Currently it is the default handler with a single instruction: __asm("bkpt");
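One thing worth noting about a handler that contains only a BKPT: on ARMv7-M, a BKPT executed while halting debug is not enabled (and the debug monitor is not in use) is itself escalated to a hard fault, and a fault raised inside the hard fault handler puts the core into lock-up. That would fit a Core Lockup Reset when the board runs without the debugger. A possible guard, sketched with made-up names and assuming the standard DHCSR register at 0xE000EDF0 with C_DEBUGEN in bit 0:

```c
#include <stdint.h>

/* Debug Halting Control and Status Register (ARMv7-M). */
#define DHCSR_ADDR      0xE000EDF0u
#define DHCSR_C_DEBUGEN 0x00000001u

/* Non-zero when halting debug is enabled, i.e. a debugger is connected
 * and able to catch a BKPT. */
static int debugger_attached(uint32_t dhcsr)
{
    return (dhcsr & DHCSR_C_DEBUGEN) != 0u;
}

#if defined(TARGET_CORTEX_M)   /* assumed project-defined macro */
/* Hypothetical replacement for the default handler: only break into the
 * debugger when one is actually present, otherwise park the core. */
void default_fault_handler(void)
{
    if (debugger_attached(*(volatile uint32_t *)DHCSR_ADDR)) {
        __asm volatile ("bkpt #0");
    }
    for (;;) {
        /* Stay here so the state can be inspected or a watchdog fires. */
    }
}
#endif
```

The inline assembly assumes a GCC toolchain. Without a debugger the handler simply spins, so the reset cause no longer shows up as a lock-up.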

As you recommended, I will implement a custom handler to see what's going wrong.

I also re-checked the flash clock. Flash is running at 24 MHz.

Regards

Peter

0 Kudos
1,010 Views
peterkrause
Contributor I

Hello Mark,

it seems that I have found the bug. It was "deeply hidden" in one of our algorithm modules. Under very special circumstances some wrong indices are calculated. The result of this error is an access to a non-existent memory location. This faulty scenario did not occur very often, so I'm lucky that I found the bug.

Thanks for the help. I'm sure this was not the last bug of this type, but next time I'll be better prepared to find it.

Regards

Peter

0 Kudos