DDR3 Intermittent Problem

johnfielden
Contributor IV

We are seeing an intermittent issue with the DDR3 interface on our boards.  Our design is similar to the Phytec implementation in that we use the same family of Micron DDR3 parts and the simplest termination scheme (series resistors on the address, clock and control lines).  We boot from QSPI and run from internal memory, so we aren't currently seeing any boot-up issues.  But since this board and processor are new to us, we've been running an extensive memory test over the DDR3 on each power-up to verify that all is well.

The memory test is home-brewed and performs three separate tests: a data bit test (barber pole of the data bits), an address bit test (barber pole of the address bits), and random data patterns over the entire memory.
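For readers who want to compare against their own test, here is a minimal host-side sketch of the three tests described above. It is not our actual code: it assumes "barber pole" means a walking-ones pattern, assumes 32-bit words, and models memory as a plain Python list. All function names are hypothetical.

```python
import random

WORD_BITS = 32  # assumed word width for this sketch
WORD_MASK = (1 << WORD_BITS) - 1


def data_bit_test(mem, index=0):
    """Walk a single 1 across the data bits at one location.

    Returns 0 on success, else the first pattern that read back wrong.
    """
    for bit in range(WORD_BITS):
        pattern = 1 << bit
        mem[index] = pattern
        if mem[index] != pattern:
            return pattern
    return 0


def address_bit_test(mem):
    """Walk a single 1 across the address (index) bits.

    Writes a distinct value at offset 0 and at each power-of-two offset,
    then checks that no write aliased onto another offset.
    Returns the first failing offset, or None on success.
    """
    offsets = [0]
    off = 1
    while off < len(mem):
        offsets.append(off)
        off <<= 1
    for off in offsets:
        mem[off] = (~off) & WORD_MASK
    for off in offsets:
        if mem[off] != (~off) & WORD_MASK:
            return off
    return None


def random_pattern_test(mem, seed):
    """Fill memory with seeded pseudo-random words, then verify them.

    Returns a list of (index, expected, actual) mismatches.
    """
    rng = random.Random(seed)
    for i in range(len(mem)):
        mem[i] = rng.getrandbits(WORD_BITS)
    rng = random.Random(seed)  # replay the same sequence for the check
    errors = []
    for i in range(len(mem)):
        expected = rng.getrandbits(WORD_BITS)
        actual = mem[i]
        if actual != expected:
            errors.append((i, expected, actual))
    return errors
```

On the target the same structure would operate on a pointer into the DDR3 address range rather than a list; the list only stands in for memory here.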

We see three different results.  First, an immediate failure: the data bit test fails at the first access, at 0x8000000.  Currently, our software stops running the test when it gets a data bit test failure.

Second, we see a case where there is an occasional error during the random data test.  In this case, our software continues to run the test over and over.  Typically we see zero or one failure per pass.  The failures are never on the same data bit or at the same address.  When we see a failure, our software immediately re-reads the same address, and we never see an error on the second read.
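The re-read step can be sketched roughly as follows. This is an illustration, not our actual code: `read_word` is a hypothetical callable that returns the word at an index, and a 32-bit word is assumed. If the second read matches the expected value, the stored data was presumably intact and the first read was corrupted on the way back, which the sketch labels "transient".

```python
def classify_mismatch(read_word, index, expected, first_read, width=32):
    """Re-read a failing location and describe the error.

    read_word:  hypothetical callable, read_word(index) -> int
    expected:   the value that was written
    first_read: the value that came back wrong
    """
    second_read = read_word(index)
    diff = expected ^ first_read  # a set bit marks a data bit that flipped
    failing_bits = [b for b in range(width) if (diff >> b) & 1]
    kind = "persistent" if second_read != expected else "transient"
    return {"index": index, "bits": failing_bits, "kind": kind}
```

Logging the failing bit positions along with the address makes it easier to check later whether the errors really are spread evenly over the data bits.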

Third, no failures ever.  When the board is in this "state" we have let it run the memory test continuously for days.  These runs have occurred at room temperature, during a rapid temperature ramp to +65C, steadily at +65C for several days, during a rapid temperature ramp to -20C, and steadily at -20C for several days.  A fairly punishing routine with no failures.

We don't do any sort of warm start.  Each of these tests is done from a cold start (power applied to the board).  We are using the DDR3 setup parameters from the Tower board MQX setup.  As it turns out, these values are the same as Phytec's parameters for the same Micron memory family that we are using.

The only difference between these three test outcomes is a power cycle.  The first result (the immediate failure) seems to happen often when power is applied for the first time after the board has been unpowered for a minute or more.  The other two occur mostly on the second or third power-up attempt.  Most of the time it works flawlessly.

We have been looking very closely at the power applied to the part to see if we could find an issue.  What we see is similar to what is sketched in the attached image.  I sketched this because of the huge differences in time scale.  From power cycle to power cycle, there is no discernible difference in the way 3.3V and 1.2V to the Vybrid come up.  The resets and DDR_1.5V come up with no notable differences at the ms time scale.

After looking at the post https://community.freescale.com/message/336513#336513, it appears that our problem may be related not to power sequencing but to some other DDR3 settings.

Any suggestions of what to look at next?

26 Replies

naoumgitnik
Senior Contributor V

Hello John,

Regarding the signal integrity simulation: I recently posted "Re: New (Rev.H) schematic and layout of Vybrid Tower Module (TWR-VF65GS10) [untested - use on your o..." on the internal Freescale Community space, with the following:


Disclaimer:

  • Physical board does not exist yet, use at your own risk <== (this is why it is not public!),
  • High-speed parallel interfaces (DDR3, ETM, etc.) simulated successfully for the entire voltage / temperature / process variation range.

Main Vybrid-related revision modifications (with respect to existing Rev.G):

  • DDR3: external termination deleted, Vref circuit simplified.
  • Vybrid ballast transistor powered from 1.5V, not 3.3V...
  • Vybrid Power-On-Reset active timeout made longer to guarantee proper SD card initialization.
  • Optional Ethernet MII interface added...
  • Filtering (series ferrite beads) added into x_AFE and x_ADC power rails for better performance...


from which you may copy the DDR3 section design.

To get this information, you will have to turn to our local FAE and sign an NDA (but, IMO, it is worth it).

Regards, Naoum Gitnik.

johnfielden
Contributor IV

The link to the rev H schematic seems to be dead.  When I click it, it says that I'm not authorized. 

naoumgitnik
Senior Contributor V

Not dead, John, but rather not public - as stated above:

  • Physical board does not exist yet, use at your own risk <== (this is why it is not public!),
  • "To get this information, you will have to turn to our local FAE and sign an NDA (but, IMO, it is worth it)."

/Naoum.

johnfielden
Contributor IV

k

naoumgitnik
Senior Contributor V

Hello John,

First, to answer your "slice" question: please take a look at "34.6.15.1 High Level Block Diagram":

"DRAM MC uses a slice-based approach for the DDR PHY. Each slice manages a byte (8 bits) of data and its corresponding DQS and DM signals.

A high level block diagram of the PHY is provided in Figure 34-197."


And the rest of "34.6.15 DDR PHY" section.
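Since each slice handles one byte of data, one way to use this when debugging is to group logged bit errors by slice and see whether they cluster in one byte lane. The sketch below is only an illustration: it assumes data bit n belongs to slice n // 8 (check the actual DQ mapping for your board), assumes a 16-bit data bus by default, and uses hypothetical function names.

```python
from collections import Counter

BITS_PER_SLICE = 8  # from the quoted text: one slice per byte of data


def slice_of_bit(bit):
    """Map a data bit number to its PHY slice (assumed linear mapping)."""
    return bit // BITS_PER_SLICE


def errors_per_slice(expected_actual_pairs, width=16):
    """Count flipped bits per slice over a list of (expected, actual) words.

    width is the data bus width in bits; 16 is an assumption for this sketch.
    """
    counts = Counter()
    for expected, actual in expected_actual_pairs:
        diff = expected ^ actual
        for bit in range(width):
            if (diff >> bit) & 1:
                counts[slice_of_bit(bit)] += 1
    return counts
```

If nearly all errors land in one slice, that byte lane's DQS timing or routing is a natural first suspect; an even spread points more toward something common to all lanes.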


Sincerely yours, Naoum Gitnik.

naoumgitnik
Senior Contributor V

Hello John,

Before we start digging deeper, it makes sense to clarify several issues:

  • does your test run flawlessly on any other trusted platform, ours or Phytec's?
  • how does your DDR3 interface section differ from ours, e.g. in the decoupling scheme? No need to send the entire schematic; just describe the differences, if any.
  • is the DDR3 interface's layout based on any trusted example?

Sincerely yours, Naoum Gitnik.
