Intermittent Boot Failure Due to Synchronous Exception


506 views
zyz
Contributor III

During repetitive power cycling tests, we observed intermittent boot failures on the device. The issue has been preliminarily traced to a synchronous exception triggered during OS initialization, causing the system to hang. Details are as follows:

  1. Failure Context:

    • The CPU is stuck at the synchronous exception vector entry, with the PC frozen.

    • PC, ELR_EL1, and FAR_EL1 all point to the exception entry address 0xFFFFFFFF802E6200 (VBAR_EL1 = 0xFFFFFFFF802E6000).

    • ESR_EL1 = 0x86000004 (EC = 0x21, IFSC = 0x04: Instruction Abort taken without a change in exception level, translation fault at level 0).

    • DDR memory content is unreadable via the Lauterbach debugger.

  2. Hypothesis:
    The CPU appears to be stuck in a recursive exception loop: each attempt to fetch the synchronous exception handler (VBAR_EL1 + 0x200) faults again, most likely because the page table mapping for this address is invalid.

  3. Open Questions:

    • How to identify the original exception trigger point (the initial faulting instruction)?

    • Why is DDR memory inaccessible via the debugger during this state?
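
The register values listed under Failure Context can be cross-checked with a small decoder. This is only a sketch: the helper names `decode_esr` and `classify_vector` are hypothetical, and the field layout (EC in bits [31:26], IL in bit 25, IFSC in the low 6 bits of the ISS) and vector table offsets follow the AArch64 architecture definitions as I understand them.

```python
# Sketch: decode ESR_EL1 and classify a vector table entry (AArch64).
# Helper names are illustrative, not part of any vendor tool.

IFSC_NAMES = {
    0x04: "Translation fault, level 0",
    0x05: "Translation fault, level 1",
    0x06: "Translation fault, level 2",
    0x07: "Translation fault, level 3",
}

VECTOR_GROUPS = {
    0x000: "Current EL with SP_EL0",
    0x200: "Current EL with SP_ELx",
    0x400: "Lower EL using AArch64",
    0x600: "Lower EL using AArch32",
}

VECTOR_TYPES = {
    0x000: "Synchronous",
    0x080: "IRQ",
    0x100: "FIQ",
    0x180: "SError",
}


def decode_esr(esr):
    """Split an ESR_ELx value into EC, IL, ISS and the fault status code."""
    iss = esr & 0x1FFFFFF
    return {
        "EC": (esr >> 26) & 0x3F,   # exception class
        "IL": (esr >> 25) & 0x1,    # instruction length bit
        "ISS": iss,
        "IFSC": iss & 0x3F,         # meaningful for instruction aborts
    }


def classify_vector(vbar, pc):
    """Map a PC inside the 2 KB vector table to (source group, exception type)."""
    offset = pc - vbar
    if not 0 <= offset < 0x800:
        raise ValueError("PC is outside the vector table")
    return VECTOR_GROUPS[offset & 0x600], VECTOR_TYPES[offset & 0x180]


fields = decode_esr(0x86000004)
print(hex(fields["EC"]), IFSC_NAMES[fields["IFSC"]])
print(classify_vector(0xFFFFFFFF802E6000, 0xFFFFFFFF802E6200))
```

With the values from the post, this gives EC = 0x21 and a level 0 translation fault, and places the frozen PC at the "Current EL with SP_ELx, Synchronous" entry, which matches the description above.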

 
How should I proceed with the next step of troubleshooting?
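
The recursive-exception hypothesis can be illustrated with a toy model of how EL1 records an instruction abort. It is a simplified sketch, not a description of the actual failure: `ORIGINAL_FAULT_PC` is a made-up address, the first fault is assumed to be an instruction fetch fault (it could equally be a data abort or another synchronous exception), and only PC, ELR_EL1 and FAR_EL1 are modelled.

```python
# Toy model: what ELR_EL1/FAR_EL1 look like if the vector itself is unmapped.
# ORIGINAL_FAULT_PC is hypothetical; VBAR_EL1 is taken from the post.

VBAR_EL1 = 0xFFFFFFFF802E6000
SYNC_CURRENT_EL_SPX = 0x200          # synchronous, current EL with SP_ELx
VECTOR_PC = VBAR_EL1 + SYNC_CURRENT_EL_SPX
ORIGINAL_FAULT_PC = 0xFFFFFFFF80100000  # assumption, for illustration only


def take_instruction_abort(regs, faulting_va):
    """Return the register state after an instruction abort taken to EL1."""
    new_regs = dict(regs)
    new_regs["ELR_EL1"] = faulting_va   # return address = faulting instruction
    new_regs["FAR_EL1"] = faulting_va   # faulting virtual address
    new_regs["PC"] = VECTOR_PC          # execution resumes at the vector entry
    return new_regs


def simulate(first_fault_pc, is_mapped, max_faults=3):
    """Take faults until the PC is fetchable or max_faults is reached."""
    regs = {"PC": first_fault_pc, "ELR_EL1": 0, "FAR_EL1": 0}
    history = []
    for _ in range(max_faults):
        if is_mapped(regs["PC"]):
            break                       # the handler could run from here
        regs = take_instruction_abort(regs, regs["PC"])
        history.append(regs)
    return history


# Case 1: nothing is fetchable, including the vector table.
for step in simulate(ORIGINAL_FAULT_PC, lambda va: False):
    print({name: hex(value) for name, value in step.items()})
```

In this model the first fault leaves the original address in ELR_EL1, but the second fault, taken while fetching the vector entry, overwrites it, so PC, ELR_EL1 and FAR_EL1 all end up at VBAR_EL1 + 0x200. That would be consistent with the register dump, and it suggests the original trigger address is no longer recoverable from these registers alone.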
 
 
 
Attachments: 2335fbff-5343-4d81-aa7b-19bf019e1328.png, 异常时CPU寄存器.png (CPU registers at the time of the exception), a0efb996-7f2d-4421-87a2-0fdb9e583185.png, b1af47b0-3359-4f36-84b1-47ad396b633d.png, 40ff3413-3e90-4082-a581-81a86c7e802c.png

3 Replies

439 views
chenyin_h
NXP Employee

Hello, @zyz 

Thanks for your reply.

From your description, the issue does not seem to be related to the BSP; it looks more like a debugging task within your own OS. I am sorry, but it is difficult for us to analyze it without the code or a reproducible setup on our end. Since the issue was found while testing with VxWorks, I suggest also consulting Wind River for tips on the analysis.

I apologize for the inconvenience.

 

BR

Chenyin


480 views
chenyin_h
NXP Employee

Hello, @zyz 

Thanks for your post.

Would you mind sharing more details about the background and the steps to trigger this issue?

Is the test done on a custom board or on an RDB/EVB? Is it with S32G2 or S32G3?

The test seems to have been done on the A53 side. Is it based on the BSP? If so, which version?

You mentioned that a synchronous exception is triggered during OS initialization. Is the OS here Linux from the BSP, or something else?

If the test is based on the BSP, have you made any modifications on your side?

 

BR

Chenyin


473 views
zyz
Contributor III

Hi chenyin_h,

Thank you for your reply.

We tested this on our self-developed board with our own OS (VxWorks).

At this preliminary stage, it appears that the issue lies within our initialization code.

Currently, we are narrowing down the problem by adding debug prints. However, since the issue has a low reproduction rate and involves extensive code, we would like to ask how to pinpoint the source of such anomalies more efficiently.
