Tracing exceptions on MPC5604P

tommiddelweerd
Contributor II

Hi there,

I am seeing strange behavior on some of our PCBs that use the MPC5604P.

On a couple of boards I get a program interrupt exception (IVOR6) caused by an illegal instruction (the PIL bit is set in ESR), but on most boards the exact same software runs just fine.

Is there a way to track down the root cause of this exception, or do you know what could be the reason for it?

I'm using the PE Micro Multilink Debugger.

 

Thank you,

 

Tom

1 Solution
tommiddelweerd
Contributor II

Just to let you know, we found the origin of the issue. It was an error in the initialization of the internal flash.

The flash read wait state cycles (RWSC) were set to 1 instead of 2. For our system clock of 64 MHz, at least 2 wait state cycles are recommended.
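A minimal sketch of how such a fix could be guarded in code, assuming a generic read-modify-write on a copy of the flash configuration register. The function names, and the idea of passing the field position as parameters, are illustrative assumptions and not taken from the thread or from NXP's headers. The bit position and width of RWSC are deliberately left to the caller and must come from the MPC5604P reference manual. The only value taken from the thread is the minimum of 2 wait states at 64 MHz.

```c
#include <assert.h>
#include <stdint.h>

/* From the thread: at a 64 MHz system clock, at least 2 read wait states. */
#define MIN_RWSC_AT_64MHZ 2u

/* Build a mask for a bit field. shift and width are ASSUMPTIONS supplied by
 * the caller; take the real RWSC position from the reference manual. */
static uint32_t field_mask(unsigned shift, unsigned width)
{
    assert(width >= 1u && width <= 32u && shift <= 32u - width);
    uint32_t ones = (width == 32u) ? 0xFFFFFFFFu
                                   : (((uint32_t)1 << width) - 1u);
    return ones << shift;
}

/* Return reg with the wait-state field replaced; all other bits are kept. */
uint32_t with_wait_states(uint32_t reg, unsigned shift, unsigned width,
                          uint32_t wait_states)
{
    uint32_t mask = field_mask(shift, width);
    assert(wait_states <= (mask >> shift)); /* value must fit in the field */
    return (reg & ~mask) | (wait_states << shift);
}

/* Extract the wait-state field from a register value. */
uint32_t get_wait_states(uint32_t reg, unsigned shift, unsigned width)
{
    return (reg & field_mask(shift, width)) >> shift;
}

/* Check the configured value against the 64 MHz minimum from the thread. */
int rwsc_meets_64mhz_minimum(uint32_t rwsc)
{
    return rwsc >= MIN_RWSC_AT_64MHZ;
}
```

On the target, the new value would be written back to the real register during flash initialization, before the system clock is raised to 64 MHz. Reading the field back and checking it at startup would have flagged the setting of 1 on every board, not only on the ones that happened to fail.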


Thanks for the help anyway.


5 Replies
davidtosenovjan
NXP TechSupport

The e200 core raises an Illegal Instruction program exception on attempted execution of the following instructions:

  • Instruction from the illegal instruction class
  • mtspr and mfspr instructions with an undefined SPR specified
  • mtdcr and mfdcr instructions with an undefined DCR specified

 

From the user's point of view it is almost the same as the “Unimplemented Operation” exception: there is an attempt to execute an instruction that does not exist.

“Illegal instruction” means the instruction is partially recognized, as it is either defined by another e200 platform or reserved for future extensions.

In both cases it is caused by fetching an invalid opcode.

 

Typically I would see two possible causes:

- either there is an attempt to execute code from some invalid area due to an incorrect branch, or

- the code is compiled for a slightly different core subtype than it should be (for instance e200z1 instead of e200z0), or the assembly code used was written for a different core subtype

 

In any case, you need to inspect the content of SRR0 to find the address that caused the exception and check where it points to.
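The check described above can be sketched as a small helper that an IVOR6 handler could call. This is an illustration under stated assumptions, not NXP code: the type and function names are hypothetical, and the ESR masks follow the usual Book E layout (PIL, PPR, PTR), which should be verified against the e200z0 core reference manual. On the target, the three input values would be read with mfspr from SRR0 (SPR 26), SRR1 (SPR 27) and ESR (SPR 62). Here they are plain parameters so the logic can be tested on a host.

```c
#include <stdint.h>

/* ASSUMPTION: ESR bit masks per the usual Book E layout; verify for e200z0. */
#define ESR_PIL 0x08000000u /* illegal instruction */
#define ESR_PPR 0x04000000u /* privileged instruction */
#define ESR_PTR 0x02000000u /* trap */

typedef enum {
    PROG_EXC_ILLEGAL,
    PROG_EXC_PRIVILEGED,
    PROG_EXC_TRAP,
    PROG_EXC_OTHER
} prog_exc_cause_t;

/* Snapshot of the exception state, kept for later inspection. */
typedef struct {
    uint32_t srr0;          /* address of the instruction that caused it */
    uint32_t srr1;          /* saved MSR */
    uint32_t esr;           /* raw exception syndrome */
    prog_exc_cause_t cause; /* decoded from esr */
} prog_exc_record_t;

/* Decode which program exception condition the ESR reports. */
prog_exc_cause_t classify_program_exception(uint32_t esr)
{
    if (esr & ESR_PIL) return PROG_EXC_ILLEGAL;
    if (esr & ESR_PPR) return PROG_EXC_PRIVILEGED;
    if (esr & ESR_PTR) return PROG_EXC_TRAP;
    return PROG_EXC_OTHER;
}

/* Bundle the register values into one record (e.g. to store in RAM). */
prog_exc_record_t capture_program_exception(uint32_t srr0, uint32_t srr1,
                                            uint32_t esr)
{
    prog_exc_record_t rec;
    rec.srr0 = srr0;
    rec.srr1 = srr1;
    rec.esr = esr;
    rec.cause = classify_program_exception(esr);
    return rec;
}
```

Storing such a record in a RAM area that survives a reset makes it possible to compare the faulting address across several failing boards without a debugger attached.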

tommiddelweerd
Contributor II

Hi David,

The content of SRR0 is 0x00014620.

The disassembly around this address is:

0x0001461c: e_lmw r30,40(rsp)
0x00014620: se_lwz rsp,0(rsp)
0x00014622: se_lwz r0,4(rsp)
0x00014624: se_mtlr r0

This seems to be a valid instruction, also supported by the e200z0 core. At least I think so, although I couldn't find a document with the specific instruction set for this core. Do you know where I can find such a reference document?

When I try to debug the issue I can step through the code, even passing the same address/instruction a couple of times without any problem, and then at some point it still throws the illegal instruction exception referring to address 0x00014620.

I am almost starting to believe that my debugger is misleading me on this one, although I turned off optimization.

The code is compiled for Zen (-proc Zen) with the CodeWarrior 10.6 IDE. No assembly code is used.

Thank you for your help.

Regards,

Tom

davidtosenovjan
NXP TechSupport

It would be necessary to debug the real code (you showed the disassembled code, but not what is actually flashed in the MCU). You said it happens on a couple of boards, so the issue is repeatable. Is there anything special about these boards, such as different MCU revisions or something like that?
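One way to look at what the core actually reads from flash, as opposed to what the debugger disassembles, is to compare the flash contents against a known-good copy repeatedly while the application runs at full clock speed. The sketch below is only a diagnostic idea under assumptions: the function name and parameters are hypothetical, the reference image is assumed to be available (for example in RAM), and data reads may not fail in exactly the same way as instruction fetches.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a flash region `passes` times and count words that differ from the
 * expected image. The index of the first mismatching word is stored in
 * *first_bad_index (if non-NULL); it is left untouched when nothing differs. */
size_t count_read_mismatches(const volatile uint32_t *flash,
                             const uint32_t *expected,
                             size_t words, unsigned passes,
                             size_t *first_bad_index)
{
    size_t mismatches = 0;

    for (unsigned pass = 0; pass < passes; ++pass) {
        for (size_t i = 0; i < words; ++i) {
            /* volatile forces a real read of the flash word on every pass */
            if (flash[i] != expected[i]) {
                if (mismatches == 0 && first_bad_index != NULL) {
                    *first_bad_index = i;
                }
                ++mismatches;
            }
        }
    }
    return mismatches;
}
```

A count that is zero on good boards but nonzero, or varying between runs, on failing boards would point toward marginal flash reads rather than a wrong image. Ideally such a check would execute from RAM so that the checker itself does not depend on the reads under test.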

tommiddelweerd
Contributor II

Hi David,

The failing MCUs have different revisions (QTH1608K, QRU1606C, QTH1608L, QRZ1607L, ...), but there seems to be no general relation between the revisions and the problem.

I also checked the unique serial number that's included in the Test Block of the flash (at address 0x403C10), but I cannot find a relation there either (although I didn't find any description on how to interpret these bits):

Without error:

1C12451E C0000034 0044AC4C 00000000 

1C12451E C0000034 00405C1C 00000000

With error:

1C12451E C0000034 00449850 00000000

1C12451E C0000034 00485428 00000000

I also did a couple of other checks:

  • DMA is disabled
  • Frequency of external 24MHz crystal is OK
  • System PLL and secondary PLL are OK 
  • RAM check OK
  • Verified Code Flash
  • Blank check before programming

Then I encountered strange behavior. When I cool down the MCU with cooling spray (freeze), the application runs fine, and when I let it warm up again or use a hot air gun to heat the MCU (ca. 50 °C), the exception is triggered.


I have heard of similar issues where the on-chip flash memory (NVM) was corrupted by temperature-dependent bit flips. Those issues were caused by too high clock speeds when programming the MCU or by out-of-range flash core supply voltages.

But I can't resolve the issue on the failing MCUs, even when reprogramming them at low speed and with the correct power supply (the internal voltage regulator is used to generate the 1.2 V core and flash voltage from the 3.3 V main power supply).

So I am starting to think this might be more of a hardware issue.

Regards
