an error captured by the EDAC on the T1040rdb

zilu
Contributor II

When running the USDPAA lpm-ipfwd application on a T1040d4rdb demo board and sending traffic for several hours, I got the following errors reported by EDAC:

[11999.400331] EDAC MPC85xx MC0: Err Detect Register: 0x00000004
[11999.404780] EDAC MPC85xx MC0: Faulty Data bit: 13
[11999.408176] EDAC MPC85xx MC0: Expected Data / ECC: 0x11fc1048_c9efb439 / 0x88
[11999.414005] EDAC MPC85xx MC0: Captured Data / ECC: 0x11fc1048_c9ef9439 / 0x88
[11999.419833] EDAC MPC85xx MC0: Err addr: 0x1e24c0180
[11999.423401] EDAC MPC85xx MC0: PFN: 0x001e24c0
[11999.426456] EDAC MC0: 1 CE mpc85xx_mc_err on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x1e24c0 offset:0x180 grain:8 syndrome:0x88)
[84851.203087] USDPAA process leaking 4 QPOOLs
[390361.804384] EDAC MPC85xx MC0: Err Detect Register: 0x00000004
[390361.808833] EDAC MPC85xx MC0: Faulty Data bit: 13
[390361.812228] EDAC MPC85xx MC0: Expected Data / ECC: 0x4f06a484_b487b15b / 0x29
[390361.818057] EDAC MPC85xx MC0: Captured Data / ECC: 0x4f06a484_b487915b / 0x29
[390361.823886] EDAC MPC85xx MC0: Err addr: 0x1e224fb80
[390361.827455] EDAC MPC85xx MC0: PFN: 0x001e224f
[390361.830510] EDAC MC0: 1 CE mpc85xx_mc_err on mc#0csrow#0channel#0 (csrow:0 channel:0 page:0x1e224f offset:0xb80 grain:8 syndrome:0x29)

Could someone explain what the next steps in my investigation should be?

Thank you.

3 Replies

zilu
Contributor II

Thank you very much, Platon.

How can I download T1040RM? We only have T1040DPAARM.pdf and QORIQ-SDK-2.0-IC-REV0_09082018.pdf.

bpe
NXP Employee

Sorry for not updating sooner. Below is a link to the document:

https://www.nxp.com/webapp/Download?colCode=T1040RM 

Regards,

Platon

bpe
NXP Employee

Your T1040 memory controller is detecting single-bit memory corruption.
Single-bit memory errors are correctable, but they should not happen
in a healthy system. These errors happen either because the memory
controller is improperly initialized or because of a hardware problem
with the memory. Possible actions are:

1. Reflash your board with NXP SDK pre-built images valid for your board.

2. Undo board modifications, if you have made any.

3. Replace memory.

4. Replace power supply.

5. Replace the board.
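
To judge whether one of the actions above made a difference, it can help to watch the corrected-error counter over a long traffic run. The following is a minimal sketch, assuming the kernel exposes the usual EDAC sysfs layout (/sys/devices/system/edac/mc/mcN/ce_count); the function name read_ce_counts is illustrative and not part of any NXP tool.

```python
from pathlib import Path


def read_ce_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller_name: corrected_error_count} for each mcN found.

    Assumption: the standard Linux EDAC sysfs layout is present.
    A missing directory simply yields an empty dict.
    """
    counts = {}
    for mc in sorted(Path(edac_root).glob("mc[0-9]*")):
        ce_file = mc / "ce_count"
        if ce_file.is_file():
            counts[mc.name] = int(ce_file.read_text().strip())
    return counts


if __name__ == "__main__":
    # Run before and after a test period and compare the numbers.
    print(read_ce_counts())
```

If the count stays at zero over a run at least as long as the one that produced the errors, the change is a plausible candidate for the fix; if it keeps rising, move on to the next action.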

Details about T1040 DDR error detection can be found in
T1040RM, Section 14.5.8.
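
As a first investigation step, the values in the EDAC log can be decoded by hand. The sketch below uses only the numbers from the log in the original post: it XORs the expected and captured data words to find the flipped bit, and splits the error address into page frame number and offset. The helper names are illustrative, bit positions are counted from the least significant bit of the 64-bit value, and a 4 KiB page size is assumed.

```python
def flipped_bits(expected, captured):
    """Return the positions (LSB = 0) where the two 64-bit words differ."""
    diff = expected ^ captured
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def split_addr(err_addr, page_shift=12):
    """Split a physical address into (PFN, offset); 4 KiB pages assumed."""
    return err_addr >> page_shift, err_addr & ((1 << page_shift) - 1)


# (expected data, captured data, error address), copied from the log.
events = [
    (0x11fc1048_c9efb439, 0x11fc1048_c9ef9439, 0x1e24c0180),
    (0x4f06a484_b487b15b, 0x4f06a484_b487915b, 0x1e224fb80),
]

for expected, captured, addr in events:
    pfn, offset = split_addr(addr)
    for bit in flipped_bits(expected, captured):
        was = (expected >> bit) & 1
        now = (captured >> bit) & 1
        print(f"bit {bit}: {was}->{now}  pfn={pfn:#x} offset={offset:#x}")
```

For both events this reports bit 13 changing from 1 to 0, at two different pages (0x1e24c0 and 0x1e224f), which agrees with the "Faulty Data bit", PFN and offset fields the driver printed. The same data bit failing at different addresses may be worth mentioning when discussing the hardware with the board vendor, although the log alone does not identify the cause.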


Hope this helps,
Platon

