MACHINE CHECK PARITY ERROR


3,817 views
A_Maruthi
Contributor I

Hi all,

We are using the T1040 in a switch/router design. We booted the board using the CodeWarrior tools, but we hit a problem while loading Linux. The log is as follows:

Machine check in kernel mode.
Machine check in kernel mode.
Caused by (from MCSR=400c0000): Instruction Cache Parity Error
Machine Check Effective Address: 0xc000000000314004
Machine check in kernel mode.
Caused by (from MCSR=400d0000): Instruction Cache Parity Error
Machine Check Effective Address: 0xc00000000004b404
Machine check in kernel mode.
Caused by (from MCSR=400d0000): Instruction Cache Parity Error
Machine Check Effective Address: 0xc00000000013f600
Machine check in kernel mode.
Caused by (from MCSR=400d0000): Instruction Cache Parity Error
Machine Check Effective Address: 0xc00000000004b404
Unable to handle kernel paging request for data at address 0x00000061
Faulting instruction address: 0xc00000000013f604
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=4 CoreNet Generic
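
The two MCSR values in the log can be broken into individual flags to see what the core is reporting beyond "Instruction Cache Parity Error". The sketch below is only an illustration: the bit masks are written from memory of the e500mc definitions in Linux's `arch/powerpc/include/asm/reg_booke.h`, so treat them as assumptions and verify them against your kernel tree and the core reference manual. The helper name `decode_mcsr` is hypothetical.

```python
# Bit masks as recalled from Linux's reg_booke.h for e500mc-class cores.
# ASSUMPTION: verify against your kernel source and the core reference manual.
MCSR_BITS = {
    0x80000000: "MCP (machine check input pin)",
    0x40000000: "ICPERR (instruction cache parity error)",
    0x20000000: "DCPERR (data cache parity error)",
    0x08000000: "L2MMU_MHIT (hit on multiple TLB entries)",
    0x00100000: "NMI (non-maskable interrupt)",
    0x00080000: "MAV (MCAR address valid)",
    0x00040000: "MEA (MCAR holds an effective address)",
    0x00010000: "IF (error on instruction fetch)",
    0x00008000: "LD (error on load)",
    0x00004000: "ST (error on store)",
    0x00002000: "LDG (error on guarded load)",
}


def decode_mcsr(value):
    """Return descriptions of every known flag set in an MCSR value."""
    return [name for mask, name in MCSR_BITS.items() if value & mask]


if __name__ == "__main__":
    # The two values that appear in the boot log above.
    for mcsr in (0x400C0000, 0x400D0000):
        print(f"MCSR={mcsr:#010x}")
        for name in decode_mcsr(mcsr):
            print("  " + name)
```

Under these assumed definitions, both values report an instruction cache parity error with a valid effective address, and `0x400d0000` additionally flags an instruction fetch. The reported addresses differ between events, so the errors are not tied to a single cache line.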

 

We expect to reach the Linux prompt, but instead we hit these machine check parity errors. Please reply if you have any suggestions.

Thank You.

 

3 replies

3,686 views
June_Lu
NXP TechSupport

For the attached clock measurement, it is not only the frequency that matters but also the PPJ, phase noise, and so on.

Please confirm that the clock meets every item in the specification.


3,791 views
June_Lu
NXP TechSupport

The problems start with a cache parity error. Most likely this is caused by overclocking; a noisy, unstable, or out-of-spec supply voltage; or external conditions such as a high level of radiation or overheating. The only way software can directly create this condition is described in E5500RM, Section 5.4.5. I do not think Linux has any code for cache error injection, so the suggestions below are really sanity checks:

Check whether the design violates any errata.

Make sure no customizations have been made to U-Boot and/or Linux that switch cache error protection on or off without invalidating the cache.

 Compare the problematic kernel configuration against the SDK default to make sure the build flags are correct for the CPU and no unsupported code is included in the build.

Use SDK provided build tools to ensure the kernel is not miscompiled.

Check general system operation conditions, make sure there is no overheating, overclocking, ESD, radiation.

Check the power rails for stability and noise. The fact that the problem gets worse with more cores enabled, together with the observation that the error fires when a core leaves the normal idle routine, points to power as the most likely root cause.
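
The suggestion above to compare the problematic kernel configuration against the SDK default can be done with a small script. This is a minimal sketch, assuming both configurations are available as ordinary kernel `.config` files; the function names and the command-line usage are hypothetical. The kernel tree also ships a `scripts/diffconfig` helper that serves the same purpose.

```python
import sys


def parse_config(text):
    """Map CONFIG_* symbols to values; '# CONFIG_X is not set' becomes 'n'."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(reference, candidate):
    """Return {symbol: (reference_value, candidate_value)} for each difference.

    A value of None means the symbol does not appear in that file at all.
    """
    diffs = {}
    for name in sorted(set(reference) | set(candidate)):
        ref, cand = reference.get(name), candidate.get(name)
        if ref != cand:
            diffs[name] = (ref, cand)
    return diffs


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python compare_config.py sdk_default.config board.config
    with open(sys.argv[1]) as f_ref, open(sys.argv[2]) as f_cand:
        result = diff_configs(parse_config(f_ref.read()), parse_config(f_cand.read()))
    for name, (ref, cand) in result.items():
        print(f"{name}: reference={ref} candidate={cand}")
```

Differences in CPU selection, cache, or machine check related options would be the first ones to review.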

3,746 views
A_Maruthi
Contributor I

Thank you for replying. We considered your suggestions and continued debugging. As far as we can observe, all of the hardware and software items you mentioned are in order on our cards. We have 30 cards in total based on the T1040RDB; 28 of them have completed U-Boot and Linux installation and are working. We follow the same procedure on every card, but on 2 specific cards we get the machine check parity error. We checked the power supply levels (3.3 V and 1.8 V), which are within the permissible limits. The clock input to the processor measures 99.9997 MHz (screenshot attached). Are there any other specific tests we can run to further debug the source of the problem?

Thank You!
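
As a quick sanity check on the 99.9997 MHz reading quoted above, the frequency error can be expressed in parts per million. This is a minimal sketch that assumes a nominal reference of 100 MHz, which the thread does not state; substitute the nominal value and tolerance from your board's clocking specification. The function name `ppm_offset` is hypothetical.

```python
def ppm_offset(measured_hz, nominal_hz):
    """Frequency error of a measured clock relative to nominal, in ppm."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


measured = 99.9997e6   # value reported in the post above
nominal = 100e6        # ASSUMPTION: nominal reference clock, not stated in the thread

print(f"{ppm_offset(measured, nominal):+.1f} ppm")  # prints "-3.0 ppm"
```

A small average frequency offset says nothing about jitter or phase noise, which need a separate measurement.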

 
