Pass DDR Stress test but fail during Mfgtool download (unzipping image)


billlau
Contributor I

Background of hardware and software used.

 

  1. Freescale or other standard reference board used = custom design board
  2. Kernel version and/or BSP release used = based on 3.0.35
  3. Any additional software/application or hardware used = i.MX6 Solo, 512MB LPDDR2 RAM (Micron MT42L128M32D1LF-18 WT:A @396MHz), 1GB NAND flash (Toshiba TC58NYG3S0FBAID)

 

Recently, we found that a number of units failed during firmware download with MfgTool.

The failure rate is around 2% (20 units out of 1000).

The following is the highlighted failure log from the download. (A detailed sample failure log is attached.)

 

ModuleID[2] LevelID[10]: ExecuteCommand--Push[WndIndex:0], Body is $ mount -t ubifs ubi2:system /mnt/system

ModuleID[2] LevelID[10]: ExecuteCommand--Push[WndIndex:0], Body is pipe busybox tar -xv -C /mnt

ModuleID[2] LevelID[1]: PortMgrDlg(0)--MxHidDevice--Command Push excute failed

ModuleID[2] LevelID[10]: CmdOperation[0], current command executed failed, so SetEvent(hDevCanDeleteEvent)
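
When many units are involved, it can help to pull the failing command out of each MfgTool log automatically. The following is a minimal sketch, assuming the log lines follow the format of the excerpt above; the function name `failed_commands` and the regular expression are illustrative, not part of MfgTool.

```python
import re

# Sample lines copied from the failure log excerpt above.
LOG = """\
ModuleID[2] LevelID[10]: ExecuteCommand--Push[WndIndex:0], Body is $ mount -t ubifs ubi2:system /mnt/system
ModuleID[2] LevelID[10]: ExecuteCommand--Push[WndIndex:0], Body is pipe busybox tar -xv -C /mnt
ModuleID[2] LevelID[1]: PortMgrDlg(0)--MxHidDevice--Command Push excute failed
ModuleID[2] LevelID[10]: CmdOperation[0], current command executed failed, so SetEvent(hDevCanDeleteEvent)
"""

# Assumed line shape: "ExecuteCommand--<Type>[WndIndex:<n>], Body is <command>"
BODY_RE = re.compile(r"ExecuteCommand--\w+\[WndIndex:(\d+)\], Body is (.*)$")


def failed_commands(log_text):
    """Return the command body that immediately preceded each failure line."""
    last_body = None
    failures = []
    for line in log_text.splitlines():
        match = BODY_RE.search(line)
        if match:
            last_body = match.group(2).strip()
        # "excute" is spelled this way in the log itself.
        elif "excute failed" in line and last_body is not None:
            failures.append(last_body)
    return failures


if __name__ == "__main__":
    for cmd in failed_commands(LOG):
        print("failed while running:", cmd)
```

For the excerpt above, this reports the `pipe busybox tar -xv -C /mnt` step, which matches the observation that the crash happens while unpacking the image.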

 

We tried to locate where the crash happened.

The 20 failing units have no problem with ROM read, RAM init, firmware download, or enumerating over USB as MSC.

The failing units can also run UCL commands, but they crash easily when unzipping a big image (checked on the serial console).

(The push command appears to be sent and executed successfully, but the failure occurs during processing on the i.MX6.)

 

Strange problem:

  • The failure happens very frequently when unzipping big files (system.tar, ~150MB).
  • With small files (recovery.tar, ~6MB), there is no crash.


Observation and experiment:

  • We tried lowering the DDR clock rate to half (by reducing the PLL2 clock); all failing units could then complete the whole download procedure.

However, even though they could complete the firmware download, they hang easily during normal boot or operation.

  • Then we ran DDR Stress Test v1.0.2 on the failing units, using the same RAM init code.

Most of them pass at the rated clock (396 MHz) and fail only at higher stress clocks.

[Image: unit1 ddr stress test.png]

  • We also fitted a new, known-good NAND flash on the failing boards; the failure still exists. (NAND does not seem to be the problem.)

 

Questions:


1. With the DDR clock reduced to half, the units can complete the download procedure, which suggests a DDR clock issue.

     On the other hand, they pass the DDR stress test.

     Does the DDR Stress Test have limitations that prevent it from exposing some RAM setting or hardware problems that show up in real operation?

 

2.  As normal practice, is it necessary for units to pass the DDR stress test at a higher clock rate in order to have more margin?

 

3. I have attached the RAM init file. Could you point out the critical settings that we should focus on?


4.  We have one unit (1 out of 1000) that fails the DDR stress test. It passes only at a clock rate of 365 MHz and fails at 396 MHz.

     That unit also fails to complete the download process. (The failure happens during the RAM init stage, but sometimes it continues until it crashes during execution of some UCL command.)

     With the DDR clock rate reduced to half, it can also complete the download.

     In the DDR stress test, it completes the calibration at the rated clock of 396 MHz, but it still fails the RAM test at 396 MHz.

     I wonder whether the calibrated settings were not applied, so that the failure shows up in both the stress test and normal operation.


It is still difficult to conclude that the RAM has an issue, as most of the failing units pass the DDR stress test,

but many observations point to a RAM problem.

Original Attachment has been moved to: MfgTool-2.log.zip

Original Attachment has been moved to: solo_lpddr2_400mhz_cs0_32bit_v06.inc.zip

3 Replies

Yuri
NXP Employee

  Agreed, the issue looks like a DRAM one.
Please try different drive strength settings for both the i.MX6 and the DDR parts.
You can also vary the DDR_SEL options. Basically, DDR_SEL (say, in the IOMUXC_SW_PAD_CTL_GRP_DDR_TYPE
register) is intended to adjust drive strength, which is mainly configured via the DSE field.


Have a great day,
Yuri

billlau
Contributor I

Hi Yuri,

I would like to provide an update here.

We have implemented new register settings, and the units that failed the download are now able to complete the download process.

1. We found that the register settings were not correct; most of the errors were related to DRAM timing settings.

The registers are now set according to the LPDDR2 register programming aid v0.7:

setmem /32 0x021b000c = 0x33374135 // MMDC0_MDCFG0
setmem /32 0x021b0004 = 0x000201CF // MMDC0_MDPDC
setmem /32 0x021b0010 = 0x00100A43 // MMDC0_MDCFG1
setmem /32 0x021b0014 = 0x00000093 // MMDC0_MDCFG2
setmem /32 0x021b0018 = 0x001016C8 // MMDC0_MDMISC
setmem /32 0x021b0038 = 0x00190778 // MMDC0_MDCFG3LP
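
To compare two init scripts register by register (for example, the old timing against the corrected one), the setmem lines can be parsed into address/value pairs. This is a rough sketch only: `parse_setmem` is a hypothetical helper, and the MMDC0 base address of 0x021B0000 is an assumption inferred from the addresses listed above.

```python
import re

# Tolerates missing spaces around the width, address, "=" and the comment.
SETMEM_RE = re.compile(
    r"setmem\s*/(8|16|32)\s*(0x[0-9a-fA-F]{8})\s*=\s*(0x[0-9a-fA-F]+)\s*(?://\s*(.*))?$"
)

MMDC0_BASE = 0x021B0000  # assumption, inferred from the addresses in the script

SCRIPT = """\
setmem /32 0x021b000c = 0x33374135 // MMDC0_MDCFG0
setmem /32 0x021b0004 = 0x000201CF // MMDC0_MDPDC
setmem /32 0x021b0010 = 0x00100A43 // MMDC0_MDCFG1
setmem /32 0x021b0014 = 0x00000093 // MMDC0_MDCFG2
setmem /32 0x021b0018 = 0x001016C8 // MMDC0_MDMISC
setmem /32 0x021b0038 = 0x00190778 // MMDC0_MDCFG3LP
"""


def parse_setmem(line):
    """Return (width_bits, address, value, comment) or None if the line does not match."""
    match = SETMEM_RE.search(line)
    if match is None:
        return None
    width, addr, value, name = match.groups()
    return int(width), int(addr, 16), int(value, 16), (name or "").strip()


def parse_script(text):
    """Map each written address to its value, keeping the last write per address."""
    registers = {}
    for line in text.splitlines():
        parsed = parse_setmem(line)
        if parsed is not None:
            _, addr, value, _ = parsed
            registers[addr] = value
    return registers


if __name__ == "__main__":
    for addr, value in sorted(parse_script(SCRIPT).items()):
        print(f"offset 0x{addr - MMDC0_BASE:03X} = 0x{value:08X}")
```

Running `parse_script` on both versions of the init file and comparing the two dictionaries shows which timing registers actually changed.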

2. In our application, we use only a 1-die, 1-channel LPDDR2 SDRAM.

MMDC0_MDMISC:LPDDR2_2CH should be set to "0", as the i.MX6 Solo does not support 2 channels,

and we have commented out the unnecessary CS1 and channel 1 settings in the init script.

Analysis:

=> Even with the wrong timing settings, we were still able to run the DDR Stress Tester at the rated clock, but the units failed during the download process.

For example, with our clock rate of 396 MHz, a unit could pass the stress test at 416 MHz but failed at 450 MHz. Yet it failed to download even when clocked at 396 MHz.

=> With the correct timing, the same unit passes the stress test even at 500 MHz and also completes the download.

So, I have an idea:

Passing the DDR Stress Test at the rated clock is not sufficient to ensure that a unit works 100% in normal operation.

It seems that we should stress the units at a much higher clock to verify the RAM settings.
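
To put numbers on that margin, here is a small sketch using the clock figures quoted above. The function name is illustrative, and the "highest passing clock" is only the highest frequency we reported as passing, so the real limit may lie somewhat higher.

```python
def margin_percent(rated_mhz, highest_pass_mhz):
    """Frequency headroom above the rated clock, as a percentage of the rated clock."""
    return (highest_pass_mhz - rated_mhz) / rated_mhz * 100.0


RATED_MHZ = 396

# Old (wrong) timing: passed at 416 MHz, failed at 450 MHz.
old_margin = margin_percent(RATED_MHZ, 416)

# Corrected timing: passed even at 500 MHz.
new_margin = margin_percent(RATED_MHZ, 500)

if __name__ == "__main__":
    print(f"old timing margin: {old_margin:.1f}%")
    print(f"new timing margin: {new_margin:.1f}%")
```

With the old timing the headroom was only about 5%, while the corrected timing gives at least about 26%.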

On the other hand, I still have some questions about the RAM settings.

1. For MMDCx_MDREF, the field REFR[2:0]:

We are using a 4Gb-density DRAM, so the required refresh interval should be 3.9 us, and REFR[2:0] should be 0x7.

[Image: REFI timing.jpg]

However, if we enter 8192 in the programming aid Excel sheet, the recommended REFR value is 0x3.

[Image: REFR.jpg]

Is something wrong in the Excel sheet?
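
As a sanity check on the two numbers above, the sketch below relates the refresh count to the 3.9 us interval. The 32 ms refresh window is an assumption here and should be confirmed in the DRAM datasheet; the calculation does not say which REFR encoding is correct.

```python
REFRESH_COMMANDS = 8192      # value entered in the programming aid
T_REFI_US = 3.9              # refresh interval quoted for the 4Gb device
ASSUMED_WINDOW_MS = 32.0     # assumption: refresh window, check the datasheet


def refresh_interval_us(window_ms, commands):
    """Average interval between refresh commands, in microseconds."""
    return window_ms * 1000.0 / commands


# Window implied by 8192 commands spaced 3.9 us apart.
implied_window_ms = REFRESH_COMMANDS * T_REFI_US / 1000.0

if __name__ == "__main__":
    print(f"implied refresh window: {implied_window_ms:.2f} ms")
    print(f"interval for a {ASSUMED_WINDOW_MS:.0f} ms window: "
          f"{refresh_interval_us(ASSUMED_WINDOW_MS, REFRESH_COMMANDS):.3f} us")
```

So 8192 refresh commands and a 3.9 us interval describe the same requirement (about 32 ms in total); the open point is only how the REFR field is meant to encode it.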

2. For the register MMDCx_MDASP, the field "CS0_END" is set to "0x4F" for a 4Gb SDRAM; this value is generated automatically by the Freescale register programming aid Excel sheet. But according to the i.MX manual, this value does not match the RAM capacity we use. For 4Gbit (512MB), it should be "0x0F".

What is the correct value?
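
One possible explanation for the two values, offered as a sketch and not a confirmed answer: if CS0_END encodes the absolute end address of the chip select in 32MB units (an assumption to be checked against the reference manual), then both numbers fall out of the same formula with different base addresses. The base addresses below are hypothetical inputs chosen for illustration.

```python
GRANULE_BYTES = 32 * 1024 * 1024  # assumption: CS0_END counts 32MB (256Mbit) units
SIZE_512MB = 512 * 1024 * 1024    # 4Gbit device


def cs0_end(base_addr, size_bytes):
    """Assumed encoding: (absolute end address / 32MB) - 1."""
    return (base_addr + size_bytes) // GRANULE_BYTES - 1


if __name__ == "__main__":
    # Hypothetical base addresses, for illustration only.
    for base in (0x00000000, 0x80000000):
        print(f"base 0x{base:08X}: CS0_END = 0x{cs0_end(base, SIZE_512MB):02X}")
```

Under this assumption, a base of 0 gives 0x0F and a base of 0x80000000 gives 0x4F, so the difference between the manual and the Excel sheet may simply come from the DDR start address each one assumes.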

3. For the register MMDCx_MPMUR0, the field "FRC_MSR" should be used only during manual (SW) calibration and not while the DDR is functional. The DDR Stress Test sets it to "0x1".

Should we set it to "0x0" in our boot code for normal operation?


Yuri
NXP Employee

1.
Please check whether the DDR_SEL field of the IOMUXC_SW_PAD_CTL_PAD_DRAM_RESET register is set to "00".


2.

As for CS0_END, please refer to the threads "[IMX6 DL] How to set CS0_END in flash_header.S"

and "imx6d lpddr2 MT42L256M64D4LM-25 and CS0_END".

3.

Yes, the "FRC_MSR" field of MMDCx_MPMUR0 should be cleared for normal operation.

~Yuri.
