
MPC5566 bus error occurring only for flash LAS block 0

Question asked by Andrew Henderson on Dec 13, 2019
Latest reply on Dec 23, 2019 by David Tosenovjan

Hello all. I inherited an MPC5566 application that uses the two 16K flash LAS blocks (at 0x0000 and 0x1C000) for a ping-pong buffering scheme. Erases and writes of these ping-pong buffers do not occur often, so we expect the flash blocks to last a long time before reaching their endurance limit. We've had a few isolated reports from customers of boards refusing to boot after running for periods ranging from weeks (~2000 writes) to years (~60000 writes). The RMA'd boards that I have examined all show the same symptom: LAS block 0 (the "ping" buffer) gives a bus error on access. When I use a Lauterbach JTAG debugger to examine the flash memory of a returned board, all of the bytes in block 0 are shown as "??". The Lauterbach Trace32 application refuses to dump bytes from this block and reports a bus error. I am able to dump bytes from all other flash blocks and have verified via CRC checks that their contents are as expected.

 

I've tried to reproduce the error condition here by locking/unlocking the flash and by creating ECC error conditions per application note AN5200 ("Error Correcting Codes Implemented on MPC55xx and MPC56xx Devices"). In both cases, however, I can still read the flash data in the block over JTAG without receiving a bus error. The bad boards coming in from the field give me a bus error on any data access in block 0. If I use Trace32 to erase the corrupt/bad block 0, the bus errors go away and the system behavior returns to normal (the firmware detects the "0xFF" pattern of the empty block and repairs the missing "ping" buffer using the data from the "pong" buffer).
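To make the recovery behavior described above concrete, here is a minimal host-side sketch of the blank check and repair decision. It is not the actual firmware: the names `block_is_blank`, `choose_repair`, and `repair_action_t` are hypothetical, and the buffers are plain arrays standing in for the two flash blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical repair decisions for a two-copy (ping/pong) scheme. */
typedef enum {
    REPAIR_NONE,            /* both copies hold data               */
    REPAIR_PING_FROM_PONG,  /* ping is erased, pong holds data     */
    REPAIR_PONG_FROM_PING,  /* pong is erased, ping holds data     */
    REPAIR_IMPOSSIBLE       /* both copies are erased              */
} repair_action_t;

/* Returns 1 if every 32-bit word reads as erased flash (all bits set). */
static int block_is_blank(const uint32_t *block, size_t words)
{
    assert(block != NULL);
    for (size_t i = 0; i < words; i++) {
        if (block[i] != 0xFFFFFFFFu) {
            return 0;
        }
    }
    return 1;
}

/* Decides which copy, if any, should be rebuilt from the other one. */
static repair_action_t choose_repair(const uint32_t *ping,
                                     const uint32_t *pong,
                                     size_t words)
{
    int ping_blank = block_is_blank(ping, words);
    int pong_blank = block_is_blank(pong, words);

    if (ping_blank && pong_blank) {
        return REPAIR_IMPOSSIBLE;
    }
    if (ping_blank) {
        return REPAIR_PING_FROM_PONG;
    }
    if (pong_blank) {
        return REPAIR_PONG_FROM_PING;
    }
    return REPAIR_NONE;
}
```

Note that a check like this depends on every read of the block completing. On the returned boards any access to block 0 faults, so a blank check written this way would presumably never get as far as returning a result, which would be consistent with the repair path only working after the block has been erased.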

 

Even when accesses to block 0 give bus errors, the MPC5566 Boot Assist Module (BAM) is still able to launch our bootloader (located in LAS block 1 at 0x4000) as it normally does, so the bad block 0 does not appear to contain anything that the BAM mistakes for an RCHW. The BAM recognizes that block 0 does not contain a valid RCHW and moves on to booting from block 1 (which has a valid RCHW at its start).
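For readers unfamiliar with the RCHW search, the following is a rough host-side sketch of the idea. It assumes, from my reading of the MPC5566 reference manual (please verify against your copy), that the RCHW is a 16-bit halfword whose low byte is the boot ID 0x5A and that the BAM checks the start of each low-address-space block in order. The function names and the candidate address table are illustrative, not NXP code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed RCHW layout: boot ID 0x5A in the least significant byte. */
#define RCHW_BOOT_ID_MASK 0x00FFu
#define RCHW_BOOT_ID      0x005Au

/* Assumed candidate RCHW locations (start of each LAS block). */
static const uint32_t rchw_candidates[] = {
    0x00000000u, 0x00004000u, 0x00010000u,
    0x0001C000u, 0x00020000u, 0x00030000u
};

/* Returns 1 if the halfword carries the expected boot ID. */
static int rchw_is_valid(uint16_t rchw)
{
    return (rchw & RCHW_BOOT_ID_MASK) == RCHW_BOOT_ID;
}

/*
 * Given the halfwords read from each candidate location, returns the
 * index of the first one with a valid boot ID, or -1 if none is valid.
 */
static int first_bootable(const uint16_t *rchw_values, size_t count)
{
    assert(rchw_values != NULL);
    for (size_t i = 0; i < count; i++) {
        if (rchw_is_valid(rchw_values[i])) {
            return (int)i;
        }
    }
    return -1;
}
```

With block 0 holding buffer data (or erased, reading 0xFFFF) and a valid RCHW at 0x4000, a search like this lands on index 1, which matches the boot behavior described above. How the real BAM reacts to a read that faults is not modeled here.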

 

My questions are:

 

1. Has anyone seen behavior similar to this? A single flash block giving bus errors when you try to view it via JTAG?

2. Is there a particular flash control register that I should be looking at to diagnose why only one flash block is acting like this? Looking through the MPC5566 reference manual, I'm not seeing anything that would disable reading for a single block. You can lock a block against erase and write, and you can disable the flash as a whole, but I didn't see anything for blocking reads of, or disabling, an individual block.

3. How can I recreate this issue by disabling/locking that first flash block programmatically? What register writes might help me do this on command? Right now, I'm limited to the few boards showing the issue that are coming in from the field, so I can't experiment as much as I'd like to track down the root cause in the firmware.

 

Thank you for your help! 
