How to fix ECC of DDR4 training address after warm boot?

Question asked by Ruud Pendavingh on Nov 5, 2019
Latest reply on Nov 7, 2019 by Ruud Pendavingh

Hi,

I have a custom board with an LS1046A, 4GB of DDR4 ECC memory, and U-Boot. I found a way to perform a 'warm' reboot (preserving DRAM contents) by setting the FRC_SR bit of the DDR_SDRAM_CFG_2 register while executing from internal SRAM. My U-Boot code implements the is_warm_boot() function to distinguish between warm and cold boots. The board has no SPD; the DDR controller configuration was created with NXP CodeWarrior.

There are no issues after a cold boot (memory fully initialized by the DDR controller); stressapptest runs for days on Linux without problems.

After a warm boot, all memory appears to be preserved except the (128-byte?) area where DDR training occurs. In the fsl_ddr_gen4.c source file, the DDR training address for a warm boot is set to CONFIG_SYS_SDRAM_BASE (defined as 0x8000_0000):

#ifdef CONFIG_DEEP_SLEEP
    if (is_warm_boot()) {
        ddr_out32(&ddr->sdram_cfg_2,
              regs->ddr_sdram_cfg_2 & ~SDRAM_CFG2_D_INIT);
        ddr_out32(&ddr->init_addr, CONFIG_SYS_SDRAM_BASE);
        ddr_out32(&ddr->init_ext_addr, DDR_INIT_ADDR_EXT_UIA);

        /* DRAM VRef will not be trained */
        ddr_out32(&ddr->ddr_cdr2,
              regs->ddr_cdr2 & ~DDR_CDR2_VREF_TRAIN_EN);
    } else
#endif

Apparently this means that training is performed at CPU physical address 0x8_8000_0000. Reading from any location between 0x8_8000_0000 and 0x8_8000_007F causes an error, but the area above it can be accessed:

=> mw.q 0x880000080 deadc0dec0ffabba
=> md.q 0x880000080 10
880000080: deadc0dec0ffabba 0000000000000000    ................
880000090: 0000000000000000 0000000000000000    ................
8800000a0: 0000000000000000 0000000000000000    ................
8800000b0: 0000000000000000 0000000000000000    ................
8800000c0: 0000000000000000 0000000000000000    ................
8800000d0: 0000000000000000 0000000000000000    ................
8800000e0: 0000000000000000 0000000000000000    ................
8800000f0: 0000000000000000 0000000000000000    ................
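
As a sanity check on that reading, here is a minimal sketch of the address arithmetic. It assumes two things that are not confirmed here: that the init address is a DRAM-relative offset, and that the first 2 GiB of DRAM appear at 0x8000_0000 with the remainder at 0x8_8000_0000. The macro names and the helper dram_offset_to_cpu() are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Assumed DRAM layout: first 2 GiB at 0x8000_0000, the rest at 0x8_8000_0000.
 * These names are illustrative, not taken from the board configuration. */
#define DDR_BLOCK1_BASE 0x0080000000ULL
#define DDR_BLOCK1_SIZE 0x0080000000ULL
#define DDR_BLOCK2_BASE 0x0880000000ULL

/* Hypothetical helper: translate a DRAM-relative offset into the CPU
 * physical address at which that byte would be visible. */
static uint64_t dram_offset_to_cpu(uint64_t offset)
{
    if (offset < DDR_BLOCK1_SIZE)
        return DDR_BLOCK1_BASE + offset;
    return DDR_BLOCK2_BASE + (offset - DDR_BLOCK1_SIZE);
}
```

Under those assumptions, a DRAM offset of 0x8000_0000 lands on CPU address 0x8_8000_0000, which matches where the failing 128 bytes sit. If the register instead holds a CPU physical address, this mapping does not apply.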

I attempted to fix the ECC of the training area with 64-bit writes covering the whole 128 bytes (0x10 quadwords):

=> mw.q 0x880000000 0 10

But a subsequent read still fails:

=> md.q 0x880000000
880000000:"Synchronous Abort" handler, esr 0x96000210
elr: 0000000040145994 lr : 00000000401457fc (reloc)
elr: 00000000fbdbb994 lr : 00000000fbdbb7fc
x0 : 0000000000000010 x1 : 000000000000003a
x2 : 0000000000000020 x3 : 0000000000000001
x4 : 00000000fbc6aed8 x5 : 0000000000000009
x6 : 0000000000000021 x7 : 00000000fffffffd
x8 : 0000000000000038 x9 : 000000000000000c
x10: 00000000fbdc85b8 x11: 000000000000000f
x12: 0000000000000004 x13: 00000000fbc6b3d0
x14: 00000000fbc6b6d8 x15: 00000000fbc6b058
x16: 0000000000000000 x17: 00000000ffffffff
x18: 00000000fbc6dd78 x19: 0000000000000011
x20: 0000000000000002 x21: 00000000fbc6b3c8
x22: 0000000000000002 x23: 0000000000000008
x24: 0000000000000008 x25: 00000000fbdd0740
x26: 0000000880000000 x27: 0000000000000010
x28: 0000000000000000 x29: 00000000fbc6b330
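
For reference, the esr value in the dump can be decoded with the standard ARMv8-A ESR_ELx bit layout. The sketch below is only an illustration; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Selected ESR_ELx fields (ARMv8-A layout). */
struct esr_fields {
    unsigned ec;   /* bits [31:26]: exception class */
    unsigned il;   /* bit 25: instruction length (1 = 32-bit instruction) */
    unsigned wnr;  /* bit 6: write-not-read, valid for data aborts */
    unsigned dfsc; /* bits [5:0]: data fault status code */
};

/* Hypothetical helper: split a raw ESR value into the fields above. */
static struct esr_fields decode_esr(uint32_t esr)
{
    struct esr_fields f;

    f.ec   = (esr >> 26) & 0x3f;
    f.il   = (esr >> 25) & 0x1;
    f.wnr  = (esr >> 6) & 0x1;
    f.dfsc = esr & 0x3f;
    return f;
}
```

For esr 0x96000210 this gives EC = 0x25 (data abort taken without a change in exception level), WnR = 0 (a read), and DFSC = 0x10 (synchronous external abort, not on a translation table walk). That is consistent with the memory system reporting an error on the read itself, for example an uncorrectable ECC error, although the ESR alone does not prove the cause.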

 

How can I fix the ECC of the training area after a warm boot?

Does DDR_INIT_ADDR hold a DRAM-relative address or a CPU physical address?

 

Regards,

Ruud
