
T1042: u-boot crashed after a warm reboot with ECC DDR4

Question asked by kenny zhou on Nov 6, 2019
Latest reply on Nov 7, 2019 by ufedor

Issue: We are verifying ECC DDR4 on our T1042 board. After a cold reboot (power cycle) the board boots normally into Linux, but after a warm reboot issued with the "reboot" command it crashes in U-Boot.

 

Background: Our T1042 card implements a PRAM (persistent RAM) function. To preserve the DDR contents across a CPU hard reset, we keep the DDR in self-refresh mode and clear the D_INIT bit in the DDR_DDR_SDRAM_CFG_2 register during a warm reboot. D_INIT is enabled by default; we disable it only on a warm reboot, as PRAM requires. I suspect that is why the board boots after a power cycle (D_INIT enabled) but crashes in U-Boot after a warm reboot (D_INIT disabled).
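
To make the D_INIT handling described above concrete, here is a minimal sketch in C of how the register value could be chosen for each boot type. It is not our actual board code: the helper name pram_sdram_cfg_2() is hypothetical, and the mask value 0x00000010 is assumed from the SDRAM_CFG2_D_INIT definition in U-Boot's fsl_ddr_sdram.h, so please check it against the T1042 reference manual.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: D_INIT mask as defined by SDRAM_CFG2_D_INIT in U-Boot's
 * include/fsl_ddr_sdram.h. Verify against the T1042 reference manual. */
#define SDRAM_CFG2_D_INIT 0x00000010u

/* Hypothetical helper: returns the DDR_SDRAM_CFG_2 value to program.
 * Cold boot: D_INIT set, so the controller initializes data and ECC.
 * Warm boot: D_INIT cleared, so contents held in self-refresh survive. */
static uint32_t pram_sdram_cfg_2(uint32_t base_value, bool warm_reboot)
{
	if (warm_reboot)
		return base_value & ~SDRAM_CFG2_D_INIT;
	return base_value | SDRAM_CFG2_D_INIT;
}
```

With D_INIT cleared, the controller does not rewrite memory, so any location whose data and ECC bits were not preserved would fail its ECC check when it is next read.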

 

Question: What can we do to make the board boot after a warm reboot? Thanks.

 

Below are the logs of our card booting with ECC DDR4 after a cold reboot and after a warm reboot.

log of cold reboot

U-Boot 2016.09+fsl+g199df35 (Nov 01 2019 - 17:20:39 +0000)
shasta 2018.03-r0.3

U-Boot code: EFF40000 -> F0000000 BSS: -> F004BAE0
CPU0: T1042E, Version: 1.1, (0x85280211)
Core: e5500, Version: 2.1, (0x80241021)
Clock Configuration:
CPU0:1200 MHz, CPU1:1200 MHz, CPU2:1200 MHz, CPU3:1200 MHz,
CCB:500 MHz,
DDR:800 MHz (1600 MT/s data rate) (Asynchronous), IFC:62.500 MHz
FMAN1: 500 MHz
QMAN: 250 MHz
PME: 250 MHz
eSDHC: 1200 MHz
L1: D-cache 32 KiB enabled
I-cache 32 KiB enabled
Reset Configuration Word (RCW):
00000000: 0a10000c 0c000000 00000000 00000000
00000010: 81000000 00400002 fc027000 21000000
00000020: 00000000 00000000 00000000 012300f4
00000030: 00000200 00165a05 00000000 00000000
Board: T1042 Calix
SERDES Reference: Clock1=100MHz Clock2=100MHz
Watchdog enabled
I2C: ready
SPI: ready
DRAM: Initializing....using SPD
Detected UDIMM 18ASF2G72HZ-2G6E1
fsl_ddr_set_memctl_regs() -- Configuring DDR memory device
14 GiB left unmapped
Monitor len: 0010BAE0
Ram size: 00000000
Reserving 0k for Calix u-boot image at: 80000000
Reserving MP boot page to 7ffff000
Reserving 8192k for protected RAM at 7f7ff000
Reserving 1070k for U-Boot at: 7f6f0000
Reserving 10240k for malloc() at: 7ecf0000
Reserving 80 Bytes for Board Info at: 7eceffb0
reserve_board: setting bi_pramsize=8388608 bi_pramstart=7f7ff000
Reserving 192 Bytes for Global Data at: 7ecefef0
4 GiB (DDR4, 64-bit, CL=11, ECC on)
New Stack Pointer is: 7ecefee0
Relocation Offset is: 8f7b0000
Relocating to 7f6f0000, new gd at 7ecefef0, sp at 7ecefee0
setup_reloc: After relocation bi_pramsize=8388608 bi_pramstart=7f7ff000
Adjusted IFC clock to 83.333 MHz (ccr=3)
Flash: 32 MiB
L2: 256 KiB enabled
Corenet Platform Cache: 256 KiB enabled
Using SERDES1 Protocol: 129 (0x81)
WARN: pls set popts->cpo_sample = 0x4c in <board>/ddr.c to optimize cpo
MMC: FSL_SDHC: 0
PCIe1: Root Complex, x1 gen2, regs @ 0xfe240000
01:00.0 - 10ee:8011 - Simple comm. controller
PCIe1: Bus 00 - 01
PCIe2: Root Complex, x2 gen2, regs @ 0xfe250000
03:00.0 - 14e4:8370 - Network controller
03:00.1 - 14e4:8370 - Network controller
PCIe2: Bus 02 - 03
PCIe4: Root Complex, x1 gen2, regs @ 0xfe270000
05:00.0 - 10ee:8011 - Simple comm. controller
PCIe4: Bus 04 - 05
In: serial
Out: serial
Err: serial
Mem_Total=4096M Kernel_mem=2039M pram_addr=7f7ff000 pram_size=8192k

 

log of warm reboot

U-Boot 2016.09+fsl+g199df35 (Nov 01 2019 - 17:20:39 +0000)
shasta 2018.03-r0.3

U-Boot code: EFF40000 -> F0000000 BSS: -> F004BAE0
CPU0: T1042E, Version: 1.1, (0x85280211)
Core: e5500, Version: 2.1, (0x80241021)
Clock Configuration:
CPU0:1200 MHz, CPU1:1200 MHz, CPU2:1200 MHz, CPU3:1200 MHz,
CCB:500 MHz,
DDR:800 MHz (1600 MT/s data rate) (Asynchronous), IFC:62.500 MHz
FMAN1: 500 MHz
QMAN: 250 MHz
PME: 250 MHz
eSDHC: 1200 MHz
L1: D-cache 32 KiB enabled
I-cache 32 KiB enabled
Reset Configuration Word (RCW):
00000000: 0a10000c 0c000000 00000000 00000000
00000010: 81000000 00400002 fc027000 21000000
00000020: 00000000 00000000 00000000 012300f4
00000030: 00000200 00165a05 00000000 00000000
Board: T1042 Calix
SERDES Reference: Clock1=100MHz Clock2=100MHz
Watchdog enabled
I2C: ready
SPI: ready
DRAM: Initializing....using SPD
Detected UDIMM 18ASF2G72HZ-2G6E1
14 GiB left unmapped
Monitor len: 0010BAE0
Ram size: 00000000
Reserving 0k for Calix u-boot image at: 80000000
Reserving MP boot page to 7ffff000
Reserving 8192k for protected RAM at 7f7ff000
Reserving 1070k for U-Boot at: 7f6f0000
Reserving 10240k for malloc() at: 7ecf0000
Reserving 80 Bytes for Board Info at: 7eceffb0
reserve_board: setting bi_pramsize=8388608 bi_pramstart=7f7ff000
Reserving 192 Bytes for Global Data at: 7ecefef0
4 GiB (DDR4, 64-bit, CL=11, ECC on)
New Stack Pointer is: 7ecefee0
Relocation Offset is: 8f7b0000
Relocating to 7f6f0000, new gd at 7ecefef0, sp at 7ecefee0
setup_reloc: After relocation bi_pramsize=8388608 bi_pramstart=7f7ff000
Machine check in kernel mode.
Caused by (from mcsr): mcsr = 0x00008000
NIP: 7F6F1174 XER: 20000000 LR: 7F6F8F48 REGS: 7ecefd40 TRAP: 0200 DAR: 00000000
MSR: 00021200 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR: 00

GPR00: 7F739A48 7ECEFE30 7ECEFEF0 00B83820 00000000 00000040 00000001 00000340
GPR08: 80000080 800000C0 10020000 7ECEFEB0 7F6F7558 100280C8 00000000 8F7B0000
GPR16: 10110000 00000000 10020000 00000000 00000001 00000000 00000000 00000001
GPR24: 00007918 00000000 10020000 00000000 7F766758 00000001 7F760F5C 00000002
MCSR=0x00008000 MCSRR0=0x7f6f1174
MCSRR1=0x00021200 MCAR=0x00000000
Call backtrace:
00000000 7F739A48 7F70A8F8 7F6F1050
Returning back to 0x7f6f1174
Machine check in kernel mode.
Caused by (from mcsr): mcsr = 0x00008000
NIP: 7F6F1174 XER: 20000000 LR: 7F6F8F48 REGS: 7ecefd40 TRAP: 0200 DAR: 00000000
MSR: 00021200 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR: 00

GPR00: 7F739A48 7ECEFE30 7ECEFEF0 00B83820 00000000 00000040 00000001 00000340
GPR08: 80000080 800000C0 10020000 7ECEFEB0 7F6F7558 100280C8 00000000 8F7B0000
GPR16: 10110000 00000000 10020000 00000000 00000001 00000000 00000000 00000001
GPR24: 00007918 00000000 10020000 00000000 7F766758 00000001 7F760F5C 00000002
MCSR=0x00008000 MCSRR0=0x7f6f1174
MCSRR1=0x00021200 MCAR=0x00000000
Call backtrace:
00000000 7F739A48 7F70A8F8 7F6F1050
Skipping current instr, Returning to 0x7f6f1178
Machine check in kernel mode.
Caused by (from mcsr): mcsr = 0x00008000
NIP: 7F6F1174 XER: 20000000 LR: 7F6F8F48 REGS: 7ecefd40 TRAP: 0200 DAR: 00000000
MSR: 00021200 EE: 0 PR: 0 FP: 0 ME: 1 IR/DR: 00
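
For anyone reading the warm reboot log, the sketch below may help interpret the machine check. It decodes the mcsr value and translates the relocated NIP/LR back to link-time addresses using the "Relocation Offset" printed in the log. The bit names and masks are an assumption taken from the e500mc MCSR definitions in the Linux kernel's reg_booke.h, which I believe also apply to the e5500 core; the helper names mcsr_cause() and link_address() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct mcsr_bit {
	uint32_t mask;
	const char *name;
};

/* Assumption: e500mc-family MCSR layout (Linux reg_booke.h). */
static const struct mcsr_bit mcsr_bits[] = {
	{ 0x80000000u, "MCP: machine check input pin" },
	{ 0x40000000u, "ICPERR: I-cache parity error" },
	{ 0x20000000u, "DCPERR: D-cache parity error" },
	{ 0x08000000u, "L2MMU_MHIT: hit on multiple TLB entries" },
	{ 0x00100000u, "NMI: non-maskable interrupt" },
	{ 0x00080000u, "MAV: MCAR address valid" },
	{ 0x00040000u, "MEA: MCAR is effective address" },
	{ 0x00010000u, "IF: error on instruction fetch" },
	{ 0x00008000u, "LD: error on load" },
	{ 0x00004000u, "ST: error on store" },
	{ 0x00002000u, "LDG: error on guarded load" },
	{ 0x00000002u, "TLBSYNC: multiple tlbsyncs detected" },
	{ 0x00000001u, "BSL2_ERR: backside L2 cache error" },
};

/* Returns the name of the highest-order known bit set in mcsr. */
static const char *mcsr_cause(uint32_t mcsr)
{
	size_t i;

	for (i = 0; i < sizeof(mcsr_bits) / sizeof(mcsr_bits[0]); i++)
		if (mcsr & mcsr_bits[i].mask)
			return mcsr_bits[i].name;
	return "unknown";
}

/* Converts a post-relocation address to its link-time address.
 * The subtraction wraps modulo 2^32, matching the offset in the log. */
static uint32_t link_address(uint32_t runtime_addr, uint32_t reloc_offset)
{
	return runtime_addr - reloc_offset;
}
```

Under these assumptions, mcsr_cause(0x00008000) reports an error on a load, and since the MAV bit is clear, MCAR=0x00000000 probably does not hold a valid fault address. link_address(0x7F6F1174, 0x8F7B0000) gives 0xEFF41174 and link_address(0x7F6F8F48, 0x8F7B0000) gives 0xEFF48F48; both fall inside the "U-Boot code: EFF40000 -> F0000000" range, so they can be looked up in the symbol map produced by the U-Boot build to find the function that performs the failing load.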
