Hi Experts,
I am having some trouble with the P2020's machine check interrupt. Could someone kindly help answer my questions below? Thanks in advance!
The hardware board is our own design with a P2020 on it. Our software applications run under an embedded OS.
The system raises a machine check exception from time to time. When it happens:
- (MCSR) Machine Check Syndrome Register is 0x8, which indicates “BUS_RBERR (Bus read data bus error)”.
- (MCAR) Machine Check Address Register is 0xB000xxxx. This is a RapidIO address in our system. However, I am sure the system never reads or writes this address after initialization.
- (MCSRR0) Machine Check Save/Restore Register 0 usually points to a “sync” instruction, and occasionally points to a store instruction (the target address of the store is a normal DRAM address used as a function stack).
Currently I cannot determine why the machine check happens.
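For reference, a minimal sketch of how an MCSR value such as the 0x8 above can be decoded into bit names. The masks follow the e500 definitions as I know them from the Linux kernel header reg_booke.h; the helper name mcsr_decode and the table layout are my own, so please check the bit positions against the e500v2 core reference manual before relying on them.

```c
#include <stdint.h>
#include <stdio.h>

/* e500 MCSR bit masks (names as in Linux reg_booke.h; verify against the manual). */
struct mcsr_bit { uint32_t mask; const char *name; };

static const struct mcsr_bit mcsr_bits[] = {
    { 0x80000000u, "MCP"       }, /* machine check input pin */
    { 0x40000000u, "ICPERR"    }, /* I-cache parity error */
    { 0x20000000u, "DCP_PERR"  }, /* D-cache push parity error */
    { 0x10000000u, "DCPERR"    }, /* D-cache parity error */
    { 0x00000080u, "BUS_IAERR" }, /* instruction address error */
    { 0x00000040u, "BUS_RAERR" }, /* read address error */
    { 0x00000020u, "BUS_WAERR" }, /* write address error */
    { 0x00000010u, "BUS_IBERR" }, /* instruction data bus error */
    { 0x00000008u, "BUS_RBERR" }, /* read data bus error */
    { 0x00000004u, "BUS_WBERR" }, /* write data bus error */
    { 0x00000002u, "BUS_IPERR" }, /* instruction parity error */
    { 0x00000001u, "BUS_RPERR" }, /* read parity error */
};

/*
 * Hypothetical helper: writes the space-separated names of all set bits
 * into buf and returns how many names were written. Names that do not
 * fit in buf are dropped.
 */
static int mcsr_decode(uint32_t mcsr, char *buf, size_t len)
{
    size_t used = 0;
    int count = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(mcsr_bits) / sizeof(mcsr_bits[0]); i++) {
        if (!(mcsr & mcsr_bits[i].mask))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         count ? " " : "", mcsr_bits[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            buf[used] = '\0'; /* discard the partially written name */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Calling mcsr_decode(0x8, buf, sizeof buf) on a sufficiently large buffer fills buf with "BUS_RBERR", which matches the syndrome reported above.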
My questions are:
1. What is the meaning of a “Bus read data bus error”? It looks as if my MCAR and MCSRR0 register values have no relationship to this syndrome type.
2. Do MCSR/MCAR/MCSRR0 always hold correct information about a machine check exception?
3. If I configure an address (say 0xc0000000) in the MMU but do not configure it in a LAW (Local Access Window), the system immediately raises a machine check interrupt when 0xc0000000 is read. If 0xc0000000 is written, however, there is no interrupt or exception at all. Is this the expected behavior of the processor?
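To make the situation in question 3 concrete, here is a rough sketch of a check for whether a physical address falls inside any enabled local access window. The struct law_window type, the law_covers helper, and the example window bases and sizes are all assumptions for illustration; they are not read from real LAWBAR/LAWAR registers, and the sizes in particular are guesses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software view of one local access window. */
struct law_window {
    uint64_t base;    /* physical base address of the window */
    uint64_t size;    /* window size in bytes */
    bool     enabled; /* window enable flag */
};

/* Returns true if addr lies inside at least one enabled window. */
static bool law_covers(const struct law_window *laws, size_t n, uint64_t addr)
{
    for (size_t i = 0; i < n; i++) {
        if (!laws[i].enabled)
            continue;
        /* The subtraction is safe because addr >= base is checked first. */
        if (addr >= laws[i].base && addr - laws[i].base < laws[i].size)
            return true;
    }
    return false;
}

/* Example layout only: the bases and sizes below are assumed values. */
static const struct law_window example_laws[] = {
    { 0x00000000ull, 0x40000000ull, true }, /* DRAM, assumed 1 GiB */
    { 0xB0000000ull, 0x10000000ull, true }, /* RapidIO, assumed 256 MiB */
};
```

With this example table, 0xB000xxxx is covered by the RapidIO window, while 0xc0000000 is mapped by no window, which is the case described in question 3.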
Thanks.
Jerry
1) BUS_RBERR is set because core_fault_in is asserted to the CPU, signaling a fault on the internal bus.
The sources that can assert core_fault_in are described in the P2020 QorIQ Integrated Processor Reference Manual, Rev. 2, Table 5-1, "Differences between the e500 core and the QorIQ core implementation", under HID1[RFXE].
2) The registers should contain correct data for unsuccessful read operations (uncorrectable read errors).
3) Yes, see 2).
Note:
RFXE should always be 0 for normal operation for the e500v2; it should be set only if it is necessary that the assertion of core_fault_in generate a machine check or a checkstop because peripherals are not properly configured to report bus faults. This would typically occur only during software or firmware development.
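A small sketch for checking whether RFXE is currently set. The SPR number (1009 for HID1) and the RFXE mask (0x00020000) are taken from the e500 definitions as I know them from the Linux kernel headers; the function names are hypothetical, and the values should be confirmed against the core reference manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* HID1[RFXE] mask as defined for e500 in the Linux headers (please verify). */
#define HID1_RFXE 0x00020000u

/* Pure helper: tests the RFXE bit in a HID1 value that was already read. */
static bool hid1_rfxe_enabled(uint32_t hid1)
{
    return (hid1 & HID1_RFXE) != 0;
}

#if defined(__powerpc__)
/*
 * Reads HID1 (SPR 1009) on the target. mfspr on HID1 is privileged,
 * so this must run in supervisor mode.
 */
static inline uint32_t read_hid1(void)
{
    uint32_t value;
    __asm__ volatile("mfspr %0, 1009" : "=r"(value));
    return value;
}
#endif
```

On the target, hid1_rfxe_enabled(read_hid1()) reports whether an assertion of core_fault_in will currently be turned into a machine check or checkstop.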
When the exception occurs, the Machine Check Syndrome Register is 0x8 (BUS_RBERR, Bus read data bus error), but the MCSRR0 and MCAR registers do not give an instruction or an address that has anything to do with BUS_RBERR. This is what is confusing me. Do you have any other suggestions I can follow to look into the cause of this exception?
Thanks.
Maybe 2 boards. Both have the same result.
I am now manually triggering a machine check exception to see whether the software and hardware report the expected information (the address of the faulting instruction and the address that instruction is reading).
I'll ping you again when I collect other questions.
I much appreciate your reply.