IMX6 DDR Stress Test Failures


tomsaluzzo
Contributor III

Customer is experiencing random DDR Stress Test failures on their custom IMX6D design. A few examples of the failures are:

Serial # 272:

t0: MEMCPY10 SSN X 64 test

Address of Bank 2 Failure: 0x280025F8

Data was: 0x7FFFFFFFFF00FFFF

But Pattern was: 0x7FFFFFFFFFFFFFFF

Source is Wrong, it is: 0x7FFFFFFFFF00FFFF

Address of Source Failure: 0x200025F8

Serial # 249:

t0: MEMCPY10 SSN X 64 test

Address of Bank 2 Failure: 0x2856E678

Data was: 0xFFFFFFFFFF56FF7F

But Pattern was: 0xFFFFFFFFFFFFFF7F

Source is Wrong, it is: 0xFFFFFFFFFF56FF7F

Address of Source Failure: 0x2056E678

Serial # 218:

t0: MEMCPY10 SSN X 64 test

Address of Bank 2 Failure: 0x3B0017F8

Data was: 0xFFFFFFFF7F00FFFF

But Pattern was: 0xFFFFFFFF7FFFFFFF

Source is Wrong, it is: 0xFFFFFFFF7F00FFFF

Address of Source Failure: 0x1B0017F8

The failures are typically on data byte 23:16, but we have seen some other bytes fail. Boards vary in frequency of failure, from never or almost never to nearly every test. We have one unit that fails roughly every other test iteration.
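To make the "data byte 23:16" observation concrete, the sketch below XORs each logged data word against its expected pattern and reports which byte lanes differ. It assumes byte lane n corresponds to bits 8n+7:8n of the 64-bit word as printed by the stress test; the function name `failing_byte_lanes` is only illustrative.

```python
# Failure records copied from the three logs above:
# (serial number, data read back, expected pattern).
FAILURES = [
    (272, 0x7FFFFFFFFF00FFFF, 0x7FFFFFFFFFFFFFFF),
    (249, 0xFFFFFFFFFF56FF7F, 0xFFFFFFFFFFFFFF7F),
    (218, 0xFFFFFFFF7F00FFFF, 0xFFFFFFFF7FFFFFFF),
]

def failing_byte_lanes(data, pattern, width_bytes=8):
    """Return the indices of the byte lanes where data differs from pattern.

    Assumption: lane n is bits 8n+7:8n of the printed 64-bit word.
    """
    diff = data ^ pattern
    return [lane for lane in range(width_bytes) if (diff >> (8 * lane)) & 0xFF]

for serial, data, pattern in FAILURES:
    lanes = failing_byte_lanes(data, pattern)
    print(f"Serial {serial}: XOR = 0x{data ^ pattern:016X}, failing lanes = {lanes}")
```

Under that lane-numbering assumption, all three logged failures differ from the pattern only in bits 23:16.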

The product uses an MCIMX6D5EYM10AD processor and four MT41J128M16HA-15E DDR3 DRAM devices in a T-layout topology, with all four memory devices on the top side of the PCB. For the memory interface, the schematics are essentially copied from the Sabre board. The software is U-Boot and Linux. The PMIC is an MMPF0100F0AEP.

Changing the DRAM clock from 528 MHz to 396 MHz improves the DDR Stress Test performance but does not fix it. Changing the ARM clock frequency from 996 MHz to 792 MHz drastically improves the DDR Stress Test performance: a board that fails roughly every other iteration at 996 MHz can run over 1000 passes at 792 MHz.

Another symptom of a failure is a "system freeze". It has not been confirmed whether this is related to the DDR Stress Test failures, but there is a theory that it is. The Linux OS was upgraded, which included the removal of the DVFS module and resulted in the ARM core clock changing from 996 MHz to 792 MHz. Of 32 boards that would freeze within a 24-hour period before the upgrade, 30 passed the 24-hour testing period after it.

The customer has built over 500 units in total at the contract manufacturer (CM). The CM has performed IMX6 device replacements, and the failure tends to follow the part. One part was sent back and tested by Freescale Failure Analysis by running it through the production test vectors, and it passed.

The customer has reviewed the DRAM settings and layout multiple times, including hiring consultants to assist, and can't find anything definitively wrong. The customer is looking for a way to correlate this information with the problem and for a definitive resolution.

Customer will be providing failing board, schematic, PCB gerbers, and DDR init script separately.

3 Replies

GordyCarlson
NXP Employee

Yuri,

Thanks for the reply and advice. Tom Saluzzo (Arrow DFAE) and I were onsite at the customer yesterday and worked with him on the DDR stress tests. Tom's observations are noted above.

Separately, because the product is in its initial production stage, the customer's senior management has requested escalation of the effort to determine why the boards lock up at the regular operating speed (996 MHz). Given the limitations of their own lab instrumentation and of their ability to improve the results, they have requested additional measurements of their board by our iMX apps team in Austin. TheAdmiral has agreed to take a look at the board and offer any insights or observations that the customer can then use to improve their DDR script values or, if needed, fix layout errors.

We will keep this public thread open for general responses, but the board schematics, gerbers, DDR init script, and DDR Stress Test logfile/dump are being shipped to Mark today on a thumb drive, along with the customer's board and power supply.

TheAdmiral, thanks for your assistance with this significant customer in our market. The board is being shipped to you today.

Gordy


GordyCarlson
NXP Employee

Updating and closing out this thread: the issue has been resolved.

Due to the customer's atypical layout (4 chips "on top of board" in a balanced T configuration), their write leveling values were:

0x003F0047

0x00550047

0x003B0054

0x00360042

Mark noted that with such high values for WL, you have to set WALAT = 1 in the MDMISC register.

Without WALAT = 1, the MMDC forces the pads into a high-Z state before the DQS strobe has had a chance to make a full down stroke on the last byte in a burst write. That is why a whole byte is affected. It also explains why the failure is intermittent and board-dependent: most of the time the DDR is able to see the last falling edge, but on some boards, on some byte lanes, it may miss it.
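As a rough illustration of the register change, the sketch below sets the WALAT field in an MDMISC value. It assumes WALAT is a 2-bit field at bits 17:16 of MDMISC, which should be checked against the i.MX 6 reference manual. The starting value 0x00081740 is a hypothetical example, not the customer's setting, and the helper names are illustrative only.

```python
# Assumed MDMISC layout (verify against the i.MX 6 reference manual):
# WALAT is a 2-bit field at bits 17:16.
WALAT_SHIFT = 16
WALAT_MASK = 0x3 << WALAT_SHIFT

def set_walat(mdmisc, walat):
    """Return the 32-bit mdmisc value with its WALAT field replaced by walat."""
    if not 0 <= walat <= 3:
        raise ValueError("WALAT is a 2-bit field")
    return (mdmisc & ~WALAT_MASK & 0xFFFFFFFF) | (walat << WALAT_SHIFT)

def get_walat(mdmisc):
    """Extract the WALAT field from a 32-bit mdmisc value."""
    return (mdmisc & WALAT_MASK) >> WALAT_SHIFT

# Hypothetical starting value, NOT taken from the customer's init script.
old = 0x00081740
new = set_walat(old, 1)
print(f"MDMISC: 0x{old:08X} -> 0x{new:08X} (WALAT = {get_walat(new)})")
```

In a DDR init script this would correspond to changing a single word in the MDMISC write; all other fields are left untouched.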

Byte lanes in order of expected failures, from most to least:

Byte Lane 3 (should have more failures than any other lane)

Byte Lane 4

Byte Lane 0 (tie)

Byte Lane 2 (tie)

Byte Lane 6

Byte Lane 1

Byte Lane 5

Byte Lane 7 (should see the fewest failures)

This is exactly the failure pattern we were seeing. Mark set the WALAT bit to 1, and the DDR stress tests now pass. The customer has duplicated this on a board in their lab and reports the same results.
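The lane ordering above can be reproduced from the four write leveling values quoted earlier. The sketch below is a minimal illustration under two assumptions that are not stated in the thread: the values are listed as MPWLDECTRL0 and MPWLDECTRL1 of MMDC channel 0 followed by those of channel 1, and each register holds a 7-bit delay for the even lane in bits 6:0 and for the odd lane in bits 22:16.

```python
# Write leveling values quoted in the post. Assumed order: MPWLDECTRL0 and
# MPWLDECTRL1 of MMDC channel 0, then the same two registers of channel 1.
WL_REGS = [0x003F0047, 0x00550047, 0x003B0054, 0x00360042]

def lane_delays(regs):
    """Map byte lane -> 7-bit write leveling delay.

    Assumption: each register carries two lanes, the even lane in
    bits 6:0 and the odd lane in bits 22:16.
    """
    delays = {}
    for i, reg in enumerate(regs):
        delays[2 * i] = reg & 0x7F
        delays[2 * i + 1] = (reg >> 16) & 0x7F
    return delays

delays = lane_delays(WL_REGS)
# Largest delay first; ties are broken by lane number.
ranked = sorted(delays, key=lambda lane: (-delays[lane], lane))
for lane in ranked:
    print(f"Byte Lane {lane}: 0x{delays[lane]:02X}")
```

Sorting by descending delay gives lanes 3, 4, 0, 2, 6, 1, 5, 7, with lanes 0 and 2 tied at 0x47, which agrees with the expected-failure ordering in the post.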

Thank you, Mark, and thanks to Yuri for the added attention and advice.

-Gordy

Yuri
NXP Employee

Hello,

Sometimes DDR problems can be solved by using different drive strength configurations. One can also vary the DDR_SEL options. Basically, DDR_SEL (say, in the IOMUXC_SW_PAD_CTL_GRP_DDR_TYPE register) is intended to adjust drive strength, which is mainly configured via the DSE field.

Please try decreasing the memory frequency. You may also try using WALAT = 1 in the memory initialization script.

If software adjustment does not help, it makes sense to check the PCB design using Chapter 3 (i.MX 6 Series Layout Recommendations) of the “Hardware Development Guide …”:

http://cache.freescale.com/files/32bit/doc/user_guide/IMX6SXHDG.pdf

Also, please use the Excel page named “MX6 DRAM Bus Length Check” in the “HW Design Checking List for i.Mx6”, linked below.

https://community.freescale.com/docs/DOC-93819

It also makes sense to check the board against the latest design checklist, in particular the number, nomenclature, and location of the (bulk) capacitors.


Have a great day,
Yuri
