P1016 DDR ECC testing

PaulWalker · ‎07-20-2015

I am trying to test DDR ECC on a P1016-based board.

The single-bit error scenarios all seem OK: the DDR_ERR_SBE register happily counts the error occurrences.

When inducing multi-bit errors, I expected a machine check exception. However, the core just seems to hang.

Processor set-up is basically as below.

IVOR1 installed

HID1[RFXE]=1

DDR_ERR_DISABLE = 0

DDR_DDR_SDRAM_CFG[ECC_EN] = 1

Multi-bit error injection:

DDR_ERR_INJECT[EIEN] = 1

DDR_DATA_ERR_INJECT_HI = 0x00000003

{processor hangs}
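
As a side note on why the pattern above counts as a multi-bit error, here is a minimal sketch. It assumes that each set bit in the injection masks inverts one bit of the data beat (or of the ECC byte), and that the controller uses a SEC-DED code, which corrects one flipped bit and detects two. The function name classify_injection and the ecc_mask argument are hypothetical and are not taken from the reference manual. Patterns with three or more flipped bits are lumped in with "multi-bit" here, although SEC-DED does not guarantee that they are detected.

```c
#include <stdint.h>

/* Hypothetical classification of an injection pattern (illustration only). */
enum inject_kind { INJECT_NONE, INJECT_SINGLE_BIT, INJECT_MULTI_BIT };

/* Count the set bits in a 32-bit mask. */
static unsigned count_bits(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/*
 * hi, lo   : values written to the high/low data injection masks
 * ecc_mask : assumed mask for the ECC byte (hypothetical parameter)
 */
enum inject_kind classify_injection(uint32_t hi, uint32_t lo, uint8_t ecc_mask)
{
    unsigned flipped = count_bits(hi) + count_bits(lo) + count_bits(ecc_mask);

    if (flipped == 0)
        return INJECT_NONE;
    if (flipped == 1)
        return INJECT_SINGLE_BIT;   /* correctable, counted as a single-bit error */
    return INJECT_MULTI_BIT;        /* two or more bits flipped: not correctable */
}
```

With the value used above, classify_injection(0x00000003, 0, 0) flips two data bits and so falls into the multi-bit case, while a mask such as 0x00000001 would be a single-bit error.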

All of this seems to align with the e500 and P1025 reference manuals.

Am I doing something wrong?

Bulat · ‎07-20-2015

Most probably the core enters the checkstop state. This happens when a machine check exception (MCE) occurs while the MSR[ME] bit is cleared. A typical scenario is a second MCE occurring while the previous MCE is still being processed in the interrupt handler.
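
To illustrate the sequence, here is a toy model in C. It is only a sketch of the behaviour described above, not code for the target: the type and function names (core_model, core_machine_check, and so on) are made up, and the only hardware facts it relies on are that taking the machine check interrupt clears MSR[ME] and that returning from the handler restores the saved MSR.

```c
#include <stdbool.h>

/* Toy model of machine check handling (all names are hypothetical). */
enum core_state { CORE_RUNNING, CORE_IN_MCE_HANDLER, CORE_CHECKSTOP };

struct core_model {
    bool msr_me;            /* models MSR[ME] */
    enum core_state state;
};

/* Start in normal operation with machine checks enabled. */
void core_init(struct core_model *c)
{
    c->msr_me = true;
    c->state = CORE_RUNNING;
}

/* A machine check condition is signalled to the core. */
void core_machine_check(struct core_model *c)
{
    if (c->state == CORE_CHECKSTOP)
        return;                     /* nothing more happens once stopped */

    if (!c->msr_me) {
        c->state = CORE_CHECKSTOP;  /* MCE with MSR[ME] = 0: checkstop */
        return;
    }

    c->msr_me = false;              /* interrupt entry clears MSR[ME] */
    c->state = CORE_IN_MCE_HANDLER;
}

/* The handler returns; the saved MSR (with ME set) is restored. */
void core_return_from_handler(struct core_model *c)
{
    if (c->state == CORE_IN_MCE_HANDLER) {
        c->msr_me = true;
        c->state = CORE_RUNNING;
    }
}
```

Calling core_machine_check() twice without core_return_from_handler() in between ends in CORE_CHECKSTOP, which from the outside looks like a hang.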

Regards,

Bulat

PaulWalker · ‎07-23-2015

Is there a better way to do this, that is, to verify the generation of the machine check exception for a DDR ECC error?

I noticed that the description of address parity error injection suggests it only generates one error. Would this prevent a second MCE from occurring?

Regards,

Paul

Bulat · ‎07-30-2015

I see that HID1[RFXE]=1 in your case. Did you try keeping it at 0 (HID1[RFXE]=0)? This could potentially be the root of the second MCE.

Regards,

Bulat
