P1016 DDR ECC testing

PaulWalker
Contributor III

I am trying to test out DDR ECC on a P1016 based board.

Single-bit error scenarios all seem OK: the DDR_ERR_SBE register happily counts the error occurrences.

When inducing multi-bit errors, I expected a machine check exception. However, the core just seems to hang.

Processor set-up is basically as below.

IVOR1 installed

HID1[RFXE]=1

DDR_ERR_DISABLE = 0

DDR_DDR_SDRAM_CFG[ECC_EN] = 1

Multi-bit error injection

DDR_ERR_INJECT[EIEN] = 1

DDR_DATA_ERR_INJECT_HI = 0x00000003

  {processor hangs}
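As a sanity check on the injection mask, here is a minimal host-side sketch (not target code) that classifies a mask by the number of bits it flips. It assumes each set bit in the injection mask inverts one bit of the 64-bit data beat, and that the ECC is SEC-DED (single-bit errors corrected, double-bit errors detected). The helper names and the `mask_lo` parameter (standing in for the low-word counterpart of DDR_DATA_ERR_INJECT_HI) are hypothetical.

```c
#include <stdint.h>

/* Classification of an injected data-path error under SEC-DED ECC. */
typedef enum {
    INJECT_NONE,        /* no bits flipped */
    INJECT_SINGLE_BIT,  /* one flipped bit: correctable, counted as an SBE */
    INJECT_MULTI_BIT    /* two or more flipped bits: treated as uncorrectable;
                           SEC-DED only guarantees detection of two-bit errors */
} inject_class_t;

/* Count set bits in a 32-bit word (portable, no compiler builtins). */
static unsigned popcount32(uint32_t v)
{
    unsigned n = 0;
    while (v) {
        v &= v - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Hypothetical helper: classify the combined HI/LO injection masks,
 * assuming each set bit inverts one bit of the 64-bit data beat. */
static inject_class_t classify_injection(uint32_t mask_hi, uint32_t mask_lo)
{
    unsigned flipped = popcount32(mask_hi) + popcount32(mask_lo);

    if (flipped == 0)
        return INJECT_NONE;
    if (flipped == 1)
        return INJECT_SINGLE_BIT;
    return INJECT_MULTI_BIT;
}
```

Under these assumptions, the mask 0x00000003 used above flips two bits and is classified as a multi-bit error, while 0x00000001 would be a single-bit error.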

All of this seems to align with the e500 and P1025 reference manuals.

Am I doing something wrong?

Bulat
NXP Employee

Most probably the core enters the checkstop state. This happens when a machine check exception (MCE) occurs while the MSR[ME] bit is cleared. A typical scenario is a second MCE occurring while the previous MCE is still being processed in the interrupt handler.
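The sequence can be illustrated with a toy host-side model (a sketch only, not e500 code). It assumes that taking the machine check interrupt clears MSR[ME] and that returning from the handler restores it. The type and function names are hypothetical.

```c
#include <stdbool.h>

/* Toy model of the machine-check behaviour described above; not e500 code. */
typedef struct {
    bool msr_me;       /* models MSR[ME] */
    bool checkstop;    /* true once the core has checkstopped */
    unsigned taken;    /* machine check interrupts actually taken */
} core_model_t;

static void core_init(core_model_t *c)
{
    c->msr_me = true;   /* software has enabled machine checks */
    c->checkstop = false;
    c->taken = 0;
}

/* A machine check condition arrives. */
static void core_machine_check(core_model_t *c)
{
    if (c->checkstop)
        return;                 /* core is already stopped */
    if (!c->msr_me) {
        c->checkstop = true;    /* MCE with MSR[ME]=0: checkstop */
        return;
    }
    c->msr_me = false;          /* assumed: interrupt entry clears MSR[ME] */
    c->taken++;
}

/* Handler returns; the saved MSR (with ME set) is restored. */
static void core_handler_return(core_model_t *c)
{
    if (!c->checkstop)
        c->msr_me = true;
}
```

In this model, two calls to `core_machine_check()` without an intervening `core_handler_return()` leave the core in checkstop, which from the outside would look like a hang.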

Regards,

Bulat

PaulWalker
Contributor III

Is there a better way to verify the generation of the machine check exception for a DDR ECC error?

I noticed the description of address parity error injection suggests it only generates one error. Would this prevent a second MCE from occurring?

Regards,

Paul

Bulat
NXP Employee

I see that HID1[RFXE]=1 in your case. Did you try keeping it at 0 (HID1[RFXE]=0)? Potentially this is the root cause of the second MCE.
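A minimal sketch of the bit manipulation involved, kept host-runnable by operating on a plain value. The mask 0x00020000 for RFXE is an assumption (it matches the `HID1_RFXE` definition in the Linux kernel's Book E register header) and should be checked against the e500 core reference manual. The helper name is hypothetical. On the target, the value would be read and written back with `mfspr`/`mtspr` on the HID1 SPR, which is not shown here.

```c
#include <stdint.h>

/* Assumed mask for HID1[RFXE]; verify against the e500 core reference manual. */
#define HID1_RFXE 0x00020000u

/* Return the HID1 value with RFXE cleared, leaving all other bits untouched. */
static uint32_t hid1_without_rfxe(uint32_t hid1)
{
    return hid1 & ~HID1_RFXE;
}
```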

Regards,

Bulat
