LPC2368 Watchdog triggered on heat

Solved

junointegration
Contributor II

Hi,

We have a product using the LPC2368 for years now. The software/firmware has two parts: Boot code and Main code. Both stored in the same Internal Flash, but Main code is in upper address boundary. The Boot code can verify if the Main code is valid; and can jump to the Main code. Interface to this product is done via USB.
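For readers unfamiliar with this kind of boot/main split, the following is a minimal sketch of how a boot stage might validate the main image before jumping to it. The post does not say which check the Boot code actually uses; the CRC-32 here, and the names `crc32_calc` and `main_image_is_valid`, are assumptions chosen purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 (reflected form, polynomial 0xEDB88320).
 * Assumption: the real Boot code may use a different check entirely. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Shift right; XOR in the polynomial when the dropped bit was 1. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* Hypothetical validity check: returns 1 when the bytes read back from the
 * image match the CRC stored alongside it, 0 otherwise. */
int main_image_is_valid(const uint8_t *image, size_t len, uint32_t stored_crc)
{
    return crc32_calc(image, len) == stored_crc;
}
```

With a check of this kind, a single wrongly read byte is enough to make the image look invalid, even if the flash contents themselves are intact.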

We have a few boards on which the Watchdog is triggered when heat is applied to the board (65 to 80 degrees Celsius). The LPC2368 is rated to operate in this temperature range.

We've been trying to track the point BEFORE the Watchdog is triggered. However, it seems to be triggered in several random places. We're using one of the UARTs for debugging/tracing. On a few occasions when the Watchdog is triggered and the Boot code runs, it even reports that the Main code is INVALID or CORRUPTED.

Some questions:

1. Is there any other way (besides using UART tracing) to detect at what point the software hung before the Watchdog is triggered?

2. What other peripherals or components (XTAL, etc.) can cause a software hang-up? As noted, it seems to hang in random places.

3. How can the Boot code report that the Main code is INVALID or CORRUPTED? Since both are sitting on the same Internal Flash.
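On question 1, one common alternative to UART tracing is a "breadcrumb" kept in RAM that survives the reset. The sketch below is only an illustration under stated assumptions: the names `breadcrumb_t`, `breadcrumb_mark`, `breadcrumb_take` and the magic value are hypothetical, and on the target the record would have to sit in a RAM region that the startup code does not clear.

```c
#include <assert.h>
#include <stdint.h>

#define BREADCRUMB_MAGIC 0xB4EADC4Bu  /* arbitrary marker value (assumption) */

/* On the target this record would live in RAM that is not zeroed at startup
 * (for example a dedicated no-init linker section). Here it is an ordinary
 * struct so the logic can be exercised on a PC. */
typedef struct {
    uint32_t magic;
    uint32_t checkpoint;
} breadcrumb_t;

/* Main code calls this at interesting points with a unique checkpoint id. */
void breadcrumb_mark(volatile breadcrumb_t *bc, uint32_t checkpoint_id)
{
    assert(bc != 0);
    bc->checkpoint = checkpoint_id;
    bc->magic = BREADCRUMB_MAGIC;
}

/* Boot code calls this after a reset. Returns 1 and stores the last
 * checkpoint if a valid record exists, 0 otherwise. The record is cleared
 * so that it is reported only once. */
int breadcrumb_take(volatile breadcrumb_t *bc, uint32_t *checkpoint_out)
{
    assert(bc != 0 && checkpoint_out != 0);
    if (bc->magic != BREADCRUMB_MAGIC) {
        return 0;
    }
    *checkpoint_out = bc->checkpoint;
    bc->magic = 0u;
    return 1;
}
```

After a reset, the Boot code could print the recovered checkpoint over the existing debug UART, ideally only when the chip's reset-source flags (see the LPC23xx user manual) indicate a watchdog reset.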

Thanks for any feedback!

1 Solution
xiangjun_rong
NXP TechSupport

Hi, Jun,

I agree with Frank that flash access may require more delay cycles as the temperature rises. Please set MAMTIM to 7 and give it a try.

As the temperature rises, the delay allowed for reading flash is no longer sufficient, so the binary code that is read back is wrong. The CPU then hangs, and the watchdog fires and resets the chip.
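To make the effect of MAMTIM concrete, here is a small sketch that converts a MAMTIM setting into the time the flash is given to deliver data. The thread never states the actual core clock, so the 72 MHz figure used below is an assumption, and `flash_access_window_ps` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>

/* Time allowed for one flash access, in picoseconds, for a given core clock
 * (CCLK, in Hz) and a number of CCLK cycles per fetch (the MAMTIM value).
 * The result is rounded down. */
uint64_t flash_access_window_ps(uint32_t cclk_hz, uint32_t mamtim_cycles)
{
    assert(cclk_hz > 0u);
    return ((uint64_t)mamtim_cycles * 1000000000000ull) / cclk_hz;
}
```

At an assumed CCLK of 72 MHz, `flash_access_window_ps(72000000u, 4u)` gives about 55.6 ns per access, while a MAMTIM of 7 gives about 97.2 ns, which is the extra margin this suggestion buys.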

Hope it can help you

BR

XiangJun Rong

 


7 Replies
frank_m
Senior Contributor III

Hello, assuming you are in a kind of moderator role: the post from contributor "jimiizx" in this thread looks suspiciously like spam, a.k.a. unsolicited advertisement.

Please have a look at it.

junointegration
Contributor II

I've actually disabled MAM for now and the Watchdog has not been triggered so far (when applying heat to 75+ degrees Celsius)! Some testing is still required on our side to understand the implications of this, though.

Some additional questions:

1. The datasheet seems to indicate that MAM is disabled by default. But when the software runs after a reset, the CPU indicates it is set to 2 (fully enabled). Does this mean that some initial batches of the LPC2368 have it set to 0 (off), and newer batches have it set to 2 (fully enabled)?

2. We actually have another board on which the Watchdog is triggered intermittently even without heat. Does this mean that there's an internal fault in the MAM of the CPU? Or is there a fault in the peripherals connected to the CPU?

3. Do you have a rough idea what the performance penalty is with MAM disabled? Will it be twice as slow as before?

Thanks again @xiangjun_rong and @frank_m for the responses! Much appreciated!

 

 

junointegration
Contributor II

We have been soak testing the suspected 'faulty' hardware for weeks now. Everything has worked with no Watchdog resets, apart from one incident we saw yesterday on one unit.

Prior to disabling MAM, these suspected 'faulty' units were triggering Watchdog resets intermittently (one unit multiple times a day; some units once every 2-3 days). It's likely that the latest Watchdog reset incident is due to a faulty CPU or some other hardware fault, but that is difficult to determine. The hardware has previously undergone several heat tests above 85 degrees Celsius.

With regard to speed, disabling the MAM made the LPC2368 3x slower. However, the actual application does NOT seem to be affected at all (as it depends on timers ranging from 50 ms to 1 second). The LPC2368 is still fast enough for our purpose with MAM disabled.

I have revisited the LPC2368 errata sheet, and did notice that MAM is mentioned in 3.12.

https://www.nxp.com/docs/en/errata/ES_LPC2364_66_68.pdf

However, the implications of an incorrect MAM setting are quite vague: the errata sheet does not mention Watchdog resets or corruption of code.

Anyway, thanks again @xiangjun_rong  and @frank_m  for your help!

 

 

xiangjun_rong
NXP TechSupport

Hi,

Q1. The datasheet seems to indicate that MAM is disabled by default. But when the software runs after a reset, the CPU indicates it is set to 2 (fully enabled). Does this mean that some initial batches of the LPC2368 have it set to 0 (off), and newer batches have it set to 2 (fully enabled)?

>>>>>If the MAMTIM register does not hold the default value 0x07, I suppose it must have been modified by the code somewhere.

Q2. We actually have another board on which the Watchdog is triggered intermittently even without heat. Does this mean that there's an internal fault in the MAM of the CPU? Or is there a fault in the peripherals connected to the CPU?

>>>>Yes, the MAMTIM value must match your system clock frequency.

7.9 MAM usage notes

When changing MAM timing, the MAM must first be turned off by writing a zero to
MAMCR. A new value may then be written to MAMTIM. Finally, the MAM may be turned
on again by writing a value (1 or 2) corresponding to the desired operating mode to
MAMCR.
For a system clock slower than 20 MHz, MAMTIM can be 001. For a system clock
between 20 MHz and 40 MHz, flash access time is suggested to be 2 CCLKs, while in
systems with a system clock faster than 40 MHz, 3 CCLKs are proposed. For system
clocks of 60 MHz and above, 4CCLK’s are needed.
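The usage note above can be sketched in C as follows. This is a minimal illustration, not NXP reference code: the register block is passed in as a plain struct so the sequence can be tried on a PC, the names `mam_regs_t`, `mamtim_for_cclk` and `mam_configure` are hypothetical, and the `extra_cycles` margin is an added assumption for boards that need more headroom at high temperature. On the target, the real MAMCR and MAMTIM addresses should be taken from the user manual or the vendor header.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the two memory-mapped MAM registers (layout is illustrative). */
typedef struct {
    volatile uint32_t MAMCR;   /* MAM mode: 0 = off, 1 = partial, 2 = full */
    volatile uint32_t MAMTIM;  /* flash access time in CCLK cycles (1..7)  */
} mam_regs_t;

/* Suggested fetch cycles for a given CCLK, following the quoted usage note. */
uint32_t mamtim_for_cclk(uint32_t cclk_hz)
{
    if (cclk_hz < 20000000u)  return 1u;  /* slower than 20 MHz  */
    if (cclk_hz <= 40000000u) return 2u;  /* 20 MHz to 40 MHz    */
    if (cclk_hz < 60000000u)  return 3u;  /* faster than 40 MHz  */
    return 4u;                            /* 60 MHz and above    */
}

/* Change the MAM timing in the order the note requires:
 * MAM off, write MAMTIM, then MAM back on in the requested mode. */
void mam_configure(mam_regs_t *mam, uint32_t cclk_hz, uint32_t mode,
                   uint32_t extra_cycles)
{
    assert(mam != 0 && mode <= 2u);
    uint32_t cycles = mamtim_for_cclk(cclk_hz) + extra_cycles;
    if (cycles > 7u) {
        cycles = 7u;          /* 7 is the largest MAMTIM value */
    }
    mam->MAMCR = 0u;          /* 1. turn the MAM off           */
    mam->MAMTIM = cycles;     /* 2. write the new timing       */
    mam->MAMCR = mode;        /* 3. turn the MAM back on       */
}
```

For example, `mam_configure(&regs, 72000000u, 2u, 1u)` would leave MAMTIM at 5 and MAMCR at 2, one cycle more relaxed than the note's suggestion for that clock.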

 

Q3. Do you have a rough idea what the performance penalty is with MAM disabled? Will it be twice as slow as before?

>>>>>If the MAM is disabled, I think the default delay of 7 clock cycles will apply. Regarding the performance penalty, I think each flash read will take 7 clock cycles; once an instruction is in the instruction pipeline, it will execute fast.

Hope it can help you

BR

XiangJun rong

frank_m
Senior Contributor III

> 3. Do you have a rough idea what the performance penalty is with MAM disabled? Will it be twice as slow as before?

AFAIK (as a software guy), this is highly individual, i.e. it might differ between individual MCUs. This is why MCU vendors usually guarantee worst-case values, and often list Min/Typ/Max figures.

For a properly working product, you would need to expect the worst case ...

Cannot comment on the LPC2368, I never worked with that MCU.

frank_m
Senior Contributor III

Many parameters depend on temperature, and timing gets worse at higher temperatures.

I think your watchdog triggers because Flash fetches fail. Your Flash setup (core clock speed, wait states) is appropriate for normal temperatures, but most probably fails at higher temperatures.

This would also explain the seemingly random location.

You could try more relaxed (i.e. more generous) wait state settings, or a reduced core clock. Alternatively, shut down the device at critical temperatures.

I remember a project I once worked on, that also failed at similar temperatures. While the MCU was still fine, all the optocoupler signals it heavily relied on had basically "disappeared".