MACHINE CHECK PARITY ERROR


A_Maruthi
Contributor I

Hi all,

We are using a T1040 in a switch/router design. We booted the board using the CodeWarrior tools. While Linux is loading we hit the following issue; the logs are:

Machine check in kernel mode.
Machine check in kernel mode.
Caused by (from MCSR=400c0000): Instruction Cache Parity Error
Machine Check Effective Address: 0xc000000000314004
Machine check in kernel mode.
Caused by (from MCSR=400d0000): Instruction Cache Parity Error
Machine Check Effective Address: 0xc00000000004b404
Machine check in kernel mode.
Caused by (from MCSR=400d0000): Instruction Cache Parity Error
Machine Check Effective Address: 0xc00000000013f600
Machine check in kernel mode.
Caused by (from MCSR=400d0000): Instruction Cache Parity Error
Machine Check Effective Address: 0xc00000000004b404
Unable to handle kernel paging request for data at address 0x00000061
Faulting instruction address: 0xc00000000013f604
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=4 CoreNet Generic

 

We expect to reach the Linux prompt, but instead we get the Machine Check Parity Error shown above. Please reply if you have any suggestions.

Thank You.

 

3 Replies

June_Lu
NXP TechSupport

For the clock measurement you attached: frequency is not the only requirement. PPJ, phase noise, and the other clock parameters matter as well.

Please confirm that the clock meets all of the specifications.
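
As a rough illustration of the frequency part only, the reported reading can be expressed as an offset in parts per million. This is a minimal sketch: the 100 MHz nominal value is an assumption (the thread reports only the measured 99.9997 MHz), and `ppm_offset` is a hypothetical helper name. The allowed tolerance has to be taken from the device data sheet, and the calculation says nothing about PPJ or phase noise.

```python
def ppm_offset(measured_hz, nominal_hz):
    """Return the frequency offset of measured_hz from nominal_hz in ppm."""
    return (measured_hz - nominal_hz) * 1e6 / nominal_hz


# Assumption: the nominal reference clock is 100 MHz (not stated in the thread).
NOMINAL_HZ = 100_000_000
# Measured value reported in the thread: 99.9997 MHz.
MEASURED_HZ = 99_999_700

offset = ppm_offset(MEASURED_HZ, NOMINAL_HZ)
print(f"Offset from nominal: {offset:+.1f} ppm")
```

Under the 100 MHz assumption the reading is 300 Hz low, which is an offset of -3 ppm.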

June_Lu
NXP TechSupport

The problems start with a cache parity error. Most likely this is caused by overclocking; by a noisy, unstable, or out-of-range supply voltage; or by external conditions such as a high level of radiation or overheating. The only way software can directly create this condition is described in E5500RM, Section 5.4.5. I do not think Linux has any code for cache error injection, so the suggestions below are really sanity checks:

Check whether the design violates any errata.

Make sure no customizations have been made to U-Boot and/or Linux that switch cache error protection on or off without invalidating the cache.

Compare the problematic kernel configuration against the SDK default to make sure the build flags are correct for the CPU and no unsupported code is included in the build.

Use the SDK-provided build tools to rule out a miscompiled kernel.

Check the general operating conditions: make sure there is no overheating, overclocking, ESD, or radiation.

Check the power rails for stability and noise. The fact that the problem gets worse with more cores enabled, together with the observation that the error fires when a core leaves the normal idle routine, points to power as the most likely root cause.
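
For reading the log in the original post, the MCSR values can be broken into their individual flag bits. The following is a minimal sketch: the bit masks are the e500mc-class definitions as I recall them from the Linux kernel header `arch/powerpc/include/asm/reg_booke.h`, and treating them as valid for the e5500 is an assumption that should be checked against E5500RM. `decode_mcsr` is a hypothetical helper name.

```python
# MCSR bit masks for e500mc-class cores, as recalled from the Linux kernel
# header arch/powerpc/include/asm/reg_booke.h. Assumption: the same layout
# applies to the e5500; verify against the core reference manual.
MCSR_BITS = {
    0x80000000: "MCP (machine check input pin)",
    0x40000000: "ICPERR (instruction cache parity error)",
    0x20000000: "DCPERR (data cache parity error)",
    0x08000000: "L2MMU_MHIT (hit on multiple TLB entries)",
    0x00100000: "NMI (non-maskable interrupt)",
    0x00080000: "MAV (MCAR address valid)",
    0x00040000: "MEA (MCAR holds an effective address)",
    0x00010000: "IF (instruction fetch)",
    0x00008000: "LD (load)",
    0x00004000: "ST (store)",
    0x00002000: "LDG (guarded load)",
    0x00000002: "TLBSYNC (multiple tlbsyncs detected)",
    0x00000001: "BSL2_ERR (backside L2 cache error)",
}


def decode_mcsr(value):
    """Return the names of the flags set in an MCSR value, highest bit first."""
    names = [name for mask, name in MCSR_BITS.items() if value & mask]
    known = 0
    for mask in MCSR_BITS:
        known |= mask
    unknown = value & ~known
    if unknown:
        # Report bits that are not in the table instead of dropping them.
        names.append(f"unknown bits 0x{unknown:08x}")
    return names


# The two MCSR values that appear in the posted log.
for raw in (0x400C0000, 0x400D0000):
    print(f"MCSR=0x{raw:08x}: " + ", ".join(decode_mcsr(raw)))
```

With these masks, both logged values carry ICPERR, MAV, and MEA, and 0x400d0000 additionally carries IF. That is consistent with the log lines reporting an instruction cache parity error together with a machine check effective address.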

A_Maruthi
Contributor I

Thank you for replying. We followed your suggestions and continued debugging. As far as we can observe, all of the hardware and software items you mentioned are in order on our cards. We have 30 cards in total, based on the T1040RDB; 28 of them completed the U-Boot and Linux installation and are working. We follow the same procedure on every card, but two cards show the Machine Check Parity Error. We checked the power supply levels (3.3 V and 1.8 V), and they are within the permissible limits. The clock input to the processor measures 99.9997 MHz (screenshot attached). Are there any other specific tests we could run to track down the source of the problem?

Thank You!

 
