I am trying to test out DDR ECC on a P1016 based board.
The single-bit error scenarios all seem OK - the DDR_ERR_SBE register happily counts the error occurrences.
When inducing multi-bit errors, I expected a machine check exception. However, the core just seems to hang.
Processor setup is basically as below.
IVOR1 installed
HID1[RFXE]=1
DDR_ERR_DISABLE = 0
DDR_DDR_SDRAM_CFG[ECC_EN] = 1
Multi-bit error injection
DDR_ERR_INJECT[EIEN] = 1
DDR_DATA_ERR_INJECT_HI = 0x00000003
{processor hangs}
All of this seems to align with the e500 and P1025 reference manuals.
Am I doing something wrong?
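For readers following along, the register sequence above can be sketched in C roughly as follows. This is only an illustration, not code from the original setup: the register offsets and bit masks are assumptions based on the values commonly used for this family of DDR controllers (the reference manual calls the configuration register DDR_SDRAM_CFG), so check them against the reference manual for your part. The helper names (reg_write, reg_read, ddr_ecc_enable, ddr_inject_multibit) are hypothetical, and the code runs against a plain array standing in for the memory-mapped register block so it can be tried on a host machine.

```c
#include <stdint.h>

/* Offsets from the DDR controller base. ASSUMED values: verify against
 * the reference manual before using them on real hardware. */
#define DDR_SDRAM_CFG          0x110u
#define DDR_DATA_ERR_INJECT_HI 0xE00u
#define DDR_ERR_INJECT         0xE08u
#define DDR_ERR_DISABLE        0xE44u

/* Bit masks, also ASSUMED. */
#define SDRAM_CFG_ECC_EN       0x20000000u  /* DDR_SDRAM_CFG[ECC_EN] */
#define ERR_INJECT_EIEN        0x00000100u  /* DDR_ERR_INJECT[EIEN]  */

/* Size of the simulated register block, in 32-bit words. */
#define DDR_REG_WORDS          (0x1000u / 4u)

/* Hypothetical accessors; real code would use the platform's
 * big-endian I/O accessors with the required synchronisation. */
static void reg_write(volatile uint32_t *base, uint32_t off, uint32_t val)
{
    base[off / 4u] = val;
}

static uint32_t reg_read(volatile uint32_t *base, uint32_t off)
{
    return base[off / 4u];
}

/* Enable all error detection and turn ECC on. */
static void ddr_ecc_enable(volatile uint32_t *ddr)
{
    reg_write(ddr, DDR_ERR_DISABLE, 0u);
    reg_write(ddr, DDR_SDRAM_CFG,
              reg_read(ddr, DDR_SDRAM_CFG) | SDRAM_CFG_ECC_EN);
}

/* Enable error injection, then flip two data bits (mask 0x3) so that
 * the next access sees an uncorrectable multi-bit error. */
static void ddr_inject_multibit(volatile uint32_t *ddr)
{
    reg_write(ddr, DDR_ERR_INJECT,
              reg_read(ddr, DDR_ERR_INJECT) | ERR_INJECT_EIEN);
    reg_write(ddr, DDR_DATA_ERR_INJECT_HI, 0x00000003u);
}
```

On a target, the pointer passed in would be the mapped DDR controller base rather than an array, and IVOR1 and HID1 would be set up separately with mtspr.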
Most probably the core is entering the checkstop state. This happens when a machine check exception (MCE) occurs while the MSR[ME] bit is cleared. The typical scenario is a second MCE occurring while the previous MCE is still being processed in the interrupt handler.
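The behaviour described here can be pictured with a small host-side model. It is a deliberately simplified sketch, not a description of the actual core logic: the type and function names (core_model_t, core_init, machine_check, handler_return) are hypothetical, and the model only tracks MSR[ME] and three coarse states. It assumes that taking a machine check clears MSR[ME] and that returning from the handler restores it.

```c
#include <stdbool.h>

/* Coarse core states for the illustration. */
typedef enum {
    CORE_RUNNING,
    CORE_IN_MC_HANDLER,
    CORE_CHECKSTOP
} core_state_t;

typedef struct {
    bool msr_me;         /* models MSR[ME] */
    core_state_t state;
} core_model_t;

/* Start as if software has already set MSR[ME]=1. */
static void core_init(core_model_t *c)
{
    c->msr_me = true;
    c->state = CORE_RUNNING;
}

/* A machine check condition arrives. */
static void machine_check(core_model_t *c)
{
    if (c->state == CORE_CHECKSTOP) {
        return;                       /* core is already stopped */
    }
    if (!c->msr_me) {
        c->state = CORE_CHECKSTOP;    /* MCE with MSR[ME]=0: checkstop */
        return;
    }
    c->msr_me = false;                /* taking the interrupt clears MSR[ME] */
    c->state = CORE_IN_MC_HANDLER;
}

/* The handler returns and the saved MSR (with ME set) is restored. */
static void handler_return(core_model_t *c)
{
    if (c->state == CORE_IN_MC_HANDLER) {
        c->msr_me = true;
        c->state = CORE_RUNNING;
    }
}
```

In this model, a single machine_check followed by handler_return brings the core back to CORE_RUNNING, while two machine_check calls in a row end in CORE_CHECKSTOP, which from the outside looks like a hang.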
Regards,
Bulat
Is there a better way to verify that a DDR ECC error generates a machine check exception?
I noticed that the description of address parity error injection suggests it generates only one error. Would that prevent a second MCE from occurring?
Regards,
Paul
I see that HID1[RFXE]=1 in your case. Have you tried keeping it at 0 (HID1[RFXE]=0)? It could be the source of the second MCE.
Regards,
Bulat