Customer is experiencing random DDR Stress Test failures on their custom IMX6D design. A few examples of the failures are:
Serial # 272:
t0: MEMCPY10 SSN X 64 test
Address of Bank 2 Failure: 0x280025F8
Data was: 0x7FFFFFFFFF00FFFF
But Pattern was: 0x7FFFFFFFFFFFFFFF
Source is Wrong, it is: 0x7FFFFFFFFF00FFFF
Address of Source Failure: 0x200025F8
Serial # 249:
t0: MEMCPY10 SSN X 64 test
Address of Bank 2 Failure: 0x2856E678
Data was: 0xFFFFFFFFFF56FF7F
But Pattern was: 0xFFFFFFFFFFFFFF7F
Source is Wrong, it is: 0xFFFFFFFFFF56FF7F
Address of Source Failure: 0x2056E678
Serial # 218:
t0: MEMCPY10 SSN X 64 test
Address of Bank 2 Failure: 0x3B0017F8
Data was: 0xFFFFFFFF7F00FFFF
But Pattern was: 0xFFFFFFFF7FFFFFFF
Source is Wrong, it is: 0xFFFFFFFF7F00FFFF
Address of Source Failure: 0x1B0017F8
The failures are typically on data bits 23:16 (byte lane 2), but we have seen some other bytes fail. Boards vary in frequency of failure, from never or almost never to nearly every test. We have one unit that fails roughly every other test iteration.
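To make the byte-lane observation easy to reproduce across more boards, here is a minimal sketch that XORs the read data against the expected pattern from the three log excerpts above and reports which bits differ. The function names are made up for illustration. The lane-to-device mapping in the output assumes two byte lanes per x16 device, in order, with no byte-lane swapping in the layout, which may not match the actual board.

```python
# Failure records transcribed from the log excerpts above:
# (serial number, data read back, expected pattern).
FAILURES = [
    (272, 0x7FFFFFFFFF00FFFF, 0x7FFFFFFFFFFFFFFF),
    (249, 0xFFFFFFFFFF56FF7F, 0xFFFFFFFFFFFFFF7F),
    (218, 0xFFFFFFFF7F00FFFF, 0xFFFFFFFF7FFFFFFF),
]


def failing_lanes(data, pattern, width_bytes=8):
    """Return the byte lanes (0 = bits 7:0) where data differs from pattern."""
    diff = data ^ pattern
    return [lane for lane in range(width_bytes) if (diff >> (8 * lane)) & 0xFF]


def report(failures):
    """Build one text line per failing byte lane in each failure record."""
    lines = []
    for serial, data, pattern in failures:
        diff = data ^ pattern
        for lane in failing_lanes(data, pattern):
            hi, lo = 8 * lane + 7, 8 * lane
            # Assumption: lanes 0-1 on the first x16 device, 2-3 on the
            # second, and so on. Check the schematic for lane swapping.
            device = lane // 2
            lines.append(
                f"S/N {serial}: bits {hi}:{lo} (lane {lane}, "
                f"device index {device}), xor 0x{diff:016X}"
            )
    return lines


if __name__ == "__main__":
    print("\n".join(report(FAILURES)))
```

For the three records listed, every mismatch falls in lane 2 only. Feeding in the failures seen on other bytes would show whether they cluster on the same device or the same lane within a device.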
The product uses the MCIMX6D5EYM10AD processor and four MT41J128M16HA-15E DDR3 DRAM devices in a T-layout topology, with all four memory devices on the top side of the PCB. For the memory interface, the schematics are essentially copied from the SABRE board. Software is U-Boot and Linux. The PMIC is the MMPF0100F0AEP.
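As a quick sanity check of the memory configuration described above, the sketch below works out the bus width and total DRAM size and confirms that the failing addresses from the logs fall inside the populated range. Two things here are assumptions not stated in this post: that the MT41J128M16 is organized as 128 Meg x 16, and that the i.MX6 maps DRAM starting at 0x10000000. Verify both against the datasheet and reference manual.

```python
# Assumptions (not from the post): device organization and DRAM base address.
WORDS_PER_DEVICE = 128 * 1024 * 1024  # 128 Meg locations per device
BITS_PER_WORD = 16                    # x16 device
DEVICES = 4                           # four devices, as described in the post
DRAM_BASE = 0x10000000                # assumed i.MX6 DRAM start address

# Addresses reported in the failure logs (bank 2 failure and source failure).
LOG_ADDRESSES = [
    0x280025F8, 0x200025F8,
    0x2856E678, 0x2056E678,
    0x3B0017F8, 0x1B0017F8,
]


def dram_geometry():
    """Return (bus width in bits, total size in bytes, last valid address)."""
    bus_bits = BITS_PER_WORD * DEVICES
    total_bytes = WORDS_PER_DEVICE * BITS_PER_WORD // 8 * DEVICES
    last_addr = DRAM_BASE + total_bytes - 1
    return bus_bits, total_bytes, last_addr


def in_dram(addr):
    """True if addr falls inside the assumed DRAM window."""
    _, _, last_addr = dram_geometry()
    return DRAM_BASE <= addr <= last_addr


if __name__ == "__main__":
    bus_bits, total_bytes, last_addr = dram_geometry()
    print(f"bus width: {bus_bits} bits")
    print(f"total size: {total_bytes // (1024 * 1024)} MiB")
    print(f"range: 0x{DRAM_BASE:08X}-0x{last_addr:08X}")
    for addr in LOG_ADDRESSES:
        print(f"0x{addr:08X} in range: {in_dram(addr)}")
```

Under these assumptions the four devices form a 64-bit bus with 1 GiB total, and all six logged addresses lie within it.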
Changing the DRAM clock from 528MHz to 396MHz improves the DDR Stress Test results but does not eliminate the failures. Changing the ARM clock frequency from 996MHz to 792MHz drastically improves the results: a board that fails roughly every other iteration at 996MHz can run over 1000 passes at 792MHz.
Another symptom is a "system freeze". It has not been confirmed that the freezes are related to the DDR Stress Test failures, but there is a theory that they are. The Linux OS was upgraded, and the upgrade removed the DVFS module, which changed the ARM core clock from 996MHz to 792MHz. Of 32 boards that would freeze within a 24-hour period before the upgrade, 30 passed the 24-hour test after it.
Customer has built over 500 units total at the CM. The CM has performed i.MX6 device replacements, and the failure tends to follow the part. One part was sent back to Freescale Failure Analysis, which ran it through production test vectors, and it passed.
Customer has reviewed the DRAM settings and layout multiple times, including hiring consultants to assist, and can't find anything definitively wrong. Customer is looking for help correlating the information above with the problem, and for a definitive resolution.
Customer will be providing a failing board, schematic, PCB Gerbers, and DDR init script separately.