Vybrid: About DDR leveling feature on DDRMC.


norihiromichiga
Senior Contributor I

Hello NXP team,

Our customer is using Vybrid with DDR3 and they have questions about DDR leveling in the Vybrid RM.

Could you answer the following questions?

Case 1)  In case of fly-by topology.

I think that read/write leveling and gate training is meaningful only for fly-by topology.

In case of Vybrid, fly-by topology can be applied to 8-bit width DDR memory x 2 pcs.

(Please correct me if this understanding is wrong)

Question 1)

In the VFxxx Controller Reference Manual, Rev. 8, 11/2015, section 10.1.6.16.3.1 "Software Gate Training in MC Evaluation Mode", we can find the following description.

=======================================================

3. Add a ½ clock cycle increment to the DQS gate by setting

PHY02[SW_HALF_CYCLE_SHIFT], PHY18[SW_HALF_CYCLE_SHIFT],

PHY02[GATE_CFG] and PHY18[GATE_CFG] = 1.

=======================================================

But the register description says that SW_HALF_CYCLE_SHIFT is only used for write level training.

Is this a typo?  I think it should be "EN_HALF_CAS" in this context because it is used for gate training.

Question 2) As for read leveling, Table 10-27 is not referenced from any description in this RM.

How should we interpret this table?  We need to understand the relationship between SWLVL_RESP_x in CR94/CR95 and this table.


Question 3) The edge we should set for the read leveling is rising edge of DQS?  We should do the read leveling on both falling and rising edge?

Question 4) Should our software set the adjusted values from SWLVL_RESP_X into RDLVL_DL_X?

Question 5) Do you have any reference code for DDR read/write leveling and gate training that follows the leveling operation in the RM?

Case 2) In case of non fly-by topology.

In case a single DDR3 memory is connected to Vybrid (16-bit width x 1 pc), read/write leveling is not needed in our understanding.  But we think that DQS training (or DQS calibration) is still useful to get optimum timing of the DDR signals, correct?  I think that the registers to control these timings can be the same registers that are used for leveling, such as RDLVL_DL. But the software processing for non fly-by topology differs from read/write leveling.

For this purpose,  "VYBRID DDR VALIDATION TOOL" is useful to get optimum number for DDRMC registers?

Thanks,

Norihiro Michigami

AVNET

17 Replies

norihiromichiga
Senior Contributor I

Hello Mark-san,

Sorry for the delay in my response, and thank you for your modification of my slide and the information about the wrong register setting, which may be related to their original problem. I could explain your point to our customer. I hope changing CR13 will fix their problem, but I have not got their feedback yet. Once I get their feedback, I will update this page.

Thanks,

Norihiro Michigami

AVNET

norihiromichiga
Senior Contributor I

Hello Mark-san,

I have a question about your following comment.

The test result of our customer seems to be the similar to your case.

I mean that the gate training at our customer passed with RDLVL_GTDL = 0 and RDLVL_GTDL = 4 only. (The unit step for RDLVL_GTDL is 4.)

>I noticed that when it performed the GATE Training calibration routine,

>only the first 2 - 3 settings passed.

>What this tells me is that the DQS pads become enabled for input signals very much too late;

>So late in fact that the Read signal is already reaching the pads by the time the pads are being configured for input.

"DQS pads become enabled for input signals very much too late" means that there is a large latency between the change of the internal gate signal (low to high) and the change of the DQS pad (output mode to input mode for receiving data)?  If so, it seems to be a layout issue of this device.

>Then, if they want to try an experiment, have them subtract '1' from their original value of RDLAT_ADJ (DDRMC_CR132[5:0]).

>This will cause the data pads to configure to inputs one clock cycle earlier. At the same time,

> the customer needs to add '1' to the PHY_RDLAT (DDRMC_CR126[13:8]) field.

> This will allow the PHY to keep the data pads configured for input one extra clock cycle.

> The total change will allow the Read Data pads to be enabled for input one clock cycle earlier,

> but remain enabled to the same point as previously used.

> This should allow the VALIDATION TOOL an extra clock cycle to find the correct un-gate timing parameters.

I want to ask our customer to try the above settings. With this experiment, if they can see many more settings that _pass_ the gate training compared to their original test result, what does it mean?  Does it mean that the result of their gate training in my xls is not the correct un-gate timing, and that they should adjust RDLAT_ADJ and PHY_RDLAT along with RDLVL_GTDL to get better margin before the DQS pad becomes enabled for input data?

Thanks,

Norihiro Michigami

AVNET

TheAdmiral
NXP Employee

Hi Norihiro-san,

> With this experiment, if they can see many settings that can _pass_ the gate training comparing to their original test result, what does it mean?

If they try the experiment and it works, I expect that maybe Gate Calibration settings of 0 - 132 should now work. This means that the Read Pads are being enabled a full clock cycle (128) before the Read data is arriving at the processor. This then allows the customer to un-gate the Read signal at least 1/2 clock cycle before the data arrives, instead of just picoseconds before the data gets there.

>  The result of their gate training in my xls is not correct un-gate timing and they should adjust  RDLAT_ADJ and PHY_RDLAT along with RDLVL_GTDL to get better margin before DQS pad becomes enabled for input data?

Yes, we are looking for a better margin for the pads to be enabled before the input data gets there.

Cheers,

Mark

norihiromichiga
Senior Contributor I

Hello Mark-san,

I really appreciate your reply.

I think I fully understand what happens inside of Vybrid on our customer's board now.

But before I ask our customer to change RDLAT_ADJ and PHY_RDLAT on their system, could you take a look at my slides to illustrate my understanding?   If this diagram is correct, I will ask our customer to change their setting with this explanation to get expected margin of un-gating and re-run DDRv TOOL on all the board they have.

1st page:  Result of current settings

2nd page: Result of modified setting (assumption)

Thanks,

Norihiro Michigami

AVNET

TheAdmiral
NXP Employee

Hi Norihiro-san,

I marked up your presentation slides a bit, hopefully to show more clearly when Vybrid pads are configured to be input. If you have any questions, please let me know. These slides are a simplification of the signals in Vybrid. For example, there are really two signals that go to the pads to configure them to be either output or input. But it is not necessary to show two signals on the slide. The slides should get the point across correctly.

Also, I went through the initial register settings.  I did see one setting that could be causing the problems that the customer is seeing. Here are my comments:

Register CR13 has a definite error in it. DDR3 memory does not allow interrupting a burst in progress. So bits [2:0] need to be set to 0x0.  Please change CR13 = 0x15040400.
This seems like the most likely reason for your current problems.

Another register that should probably be changed:

CR78: Setting for Q-FULLNES [bits 26:24]. A zero is telling the arbiter that the command Queue is nearly full after only one entry. It will basically defeat the purpose of the arbiter if it cannot have more commands to shuffle around for the most efficient order of Reads/Writes. We recommend this field be set to 0x7.

Other registers that are different from our current recommendations, but should not really be causing a problem:

CR22: Auto Pre-charge recovery time. Normally we set this value to tRP (6) + tWR (6) = 0xC. I’m not sure if the datasheet says something different. You may want to double check.

CR26: For 400 MHz, 7800 ns is 3120 clock cycles. We are setting this parameter to 0xC30. Probably not a big issue.
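As a quick check of the CR22 and CR26 numbers above, the arithmetic can be sketched as follows. The helper name `ns_to_cycles` is made up for this illustration, and the 400 MHz clock and the tRP/tWR values of 6 cycles are simply the ones quoted in this post; they should be checked against the memory datasheet.

```python
def ns_to_cycles(time_ns: int, clock_mhz: int) -> int:
    """Convert a duration in ns to whole clock cycles (rounded down)."""
    # ns * MHz gives thousandths of a cycle, so divide by 1000.
    return (time_ns * clock_mhz) // 1000


# CR26: 7800 ns at 400 MHz
cr26_cycles = ns_to_cycles(7800, 400)

# CR22: auto pre-charge recovery = tRP + tWR, both 6 cycles in this post
cr22_cycles = 6 + 6

print(hex(cr26_cycles), hex(cr22_cycles))  # prints "0xc30 0xc"
```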

CR33: Your code is setting an extra clock for CKE_STAT bits to update [bit 24]. Our implementation of this does not require the extra bit, so we normally recommend ‘b0. Also in the same register, we recommend that the controller be allowed to interrupt an initialization if a self refresh is required [bit 16].

CR74: This does not affect your current issue, but we recommend a setting of 0x40 for command aging just to make sure that traffic doesn’t wait too long in the arbiter. This applies to [bits 15:8] and [bits 7:0].

CR76: Bit 8 for W2R_SPLT_EN is only for a 2 chip select system, which Vybrid cannot support. So we recommend this bit be set for disable.

CR126: Setting PHY_RDLAT = 12 is probably too long [bits 11:8]. We would recommend a setting of 8 if RDLAT_ADJ = CAS_LAT_LIN or a setting of 9 if RDLAT_ADJ = CAS_LAT_LIN – 1. Also note: if RDLAT_ADJ = CAS_LAT_LIN – 1, your values for RDLVL_GTDL_0/1 should probably be in the vicinity of 0x40, but this does not apply to your current settings.

PHY03/PHY19/PHY35: If your board is having trouble with DLL locking, you may want to change these registers to 0x0043012A. This is our current “all design” setting, because we found it necessary on some customer boards. But if you are not having a problem booting, this does not apply to you.

PHY32/PHY33/PHY34: Although the Command/Address slice is different than the other two, we recommend matching the settings of these registers to those of PHY00/PHY01/PHY02, just for consistency's sake.

PHY52: We normally have ODT on, except for DDR writes. That is done by setting this register to 0x00111111

I would very much like to know if the change to CR13 helps solve the customer timing issues on their bad boards. Otherwise, it might be a very difficult issue to solve.

Cheers,

Mark

TheAdmiral
NXP Employee

Hi Norihiro-san,

You are correct in that the VALIDATION TOOL writes data to the DDR, reads it back, and then compares the results to determine if the data Write/Read was correct. It requires a host computer to monitor each test, and then reset the DUT (Device Under Test) with new timing parameters to repeat the tests. The host computer keeps track of all the tests performed, and then determines the recommended calibration parameter. This method is completely unsuitable for a stand-alone device to use at boot time.

I know that in the past, some customers have considered running calibration routines at boot time to determine exact timing parameters for each individual device in the field. Some customers have even considered performing the calibration routines after a fixed time period of continuous operation. NXP officially discourages this practice. Calibration parameters do not vary enough from board to board, silicon to silicon, to warrant the extra time necessary to perform this calibration at boot, nor to risk the possibility that the calculation is done incorrectly and the device fails to boot.

I have years of experience performing DDR calibration tests on ARM Core embedded processors. The calibration values simply do not vary that much, even if you repeat the tests across a large sample of boards and include a temperature chamber for hot and cold testing. The reason for this is that the ZQ Calibration routine fully compensates for variations in operating conditions.

Having said all that, I understand that the customer has a number of boards that don't seem to calibrate correctly. I don't believe this to necessarily be a calibration parameter problem. I think there is another timing parameter that is not set correctly, that is on the edge of working on most boards, but is causing problems on a few boards. I would really like to assist the customer in getting all of their prototype boards working correctly before the customer tries to further experiment with calibration code.

I would like to give an example of what I am talking about: When the VALIDATION TOOL first became available, I noticed that when it performed the GATE Training calibration routine, only the first 2 - 3 settings passed. What this tells me is that the DQS pads become enabled for input signals very much too late; so late in fact that the Read signal is already reaching the pads by the time the pads are being configured for input. This left only a very few settings to work for un-gating the Read DQS signal. There should be at least a full clock cycle (128 delay elements) of passing test routines available for the VALIDATION TOOL to then choose a correct calibration value (because the required Read Pre-Amble of a burst sequence lasts a full clock cycle).

I would like to ask the customer what the VALIDATION TOOL returns as values for the Gate Training Tool for both a passing board, and for one of their failing boards. When running the test on the failing board, use the same register settings (including calibration values) as used on the passing boards.

Then, if they want to try an experiment, have them subtract '1' from their original value of RDLAT_ADJ (DDRMC_CR132[5:0]). This will cause the data pads to configure to inputs one clock cycle earlier. At the same time, the customer needs to add '1' to the PHY_RDLAT (DDRMC_CR126[13:8]) field. This will allow the PHY to keep the data pads configured for input one extra clock cycle. The total change will allow the Read Data pads to be enabled for input one clock cycle earlier, but remain enabled to the same point as previously used. This should allow the VALIDATION TOOL an extra clock cycle to find the correct un-gate timing parameters.
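The experiment described above is a pair of read-modify-write operations on two bit fields. A minimal sketch follows, using the field positions given in this post (RDLAT_ADJ in DDRMC_CR132[5:0], PHY_RDLAT in DDRMC_CR126[13:8]). The helper names and the starting register values are hypothetical and only illustrate the bit manipulation; real code would read and write the actual registers.

```python
def get_field(reg: int, msb: int, lsb: int) -> int:
    """Extract bits [msb:lsb] from a 32-bit register value."""
    mask = (1 << (msb - lsb + 1)) - 1
    return (reg >> lsb) & mask


def set_field(reg: int, msb: int, lsb: int, value: int) -> int:
    """Return reg with bits [msb:lsb] replaced by value."""
    mask = (1 << (msb - lsb + 1)) - 1
    if not 0 <= value <= mask:
        raise ValueError(f"value {value} does not fit in bits [{msb}:{lsb}]")
    return (reg & ~(mask << lsb) & 0xFFFFFFFF) | (value << lsb)


def enable_read_pads_one_cycle_earlier(cr132: int, cr126: int) -> tuple[int, int]:
    """Apply the experiment: RDLAT_ADJ - 1 and PHY_RDLAT + 1."""
    rdlat_adj = get_field(cr132, 5, 0)    # RDLAT_ADJ = DDRMC_CR132[5:0]
    phy_rdlat = get_field(cr126, 13, 8)   # PHY_RDLAT = DDRMC_CR126[13:8]
    return (
        set_field(cr132, 5, 0, rdlat_adj - 1),
        set_field(cr126, 13, 8, phy_rdlat + 1),
    )


# Hypothetical starting values, for illustration only.
new_cr132, new_cr126 = enable_read_pads_one_cycle_earlier(0x00000006, 0x00000800)
print(hex(new_cr132), hex(new_cr126))
```

The range check in `set_field` is there so that a field already at 0 or at its maximum raises an error instead of silently wrapping into neighboring bits.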

If you would like, you can send me the initialization file that the customer is using to set the register values. I will review them and provide any other recommendations if I see anything that can be improved.

As for the code used by the VALIDATION TOOL, it is not available to be passed out to customers. It is owned by the Business Office, and not by the Microprocessors team. It is maintained in a separate software database and we do not have access to it.

Cheers,

Mark

norihiromichiga
Senior Contributor I

Hello Mark-san,

Thank you for your detailed comment again.

I really appreciate your help because you completely understand what I wanted to know.

Here is my understanding about your comment.

1. As for the idea of implementing calibration tool in the firmware, NXP officially doesn't recommend it.

Because it is not worth spending the time to calibrate the various timings, and boot may fail if the result of calibration is not correct.

2. As for the result of DDR VALIDATION TOOL, I understand that the reason why some boards showed "all fail" may be related to DDRC settings other than RDLVL_DL/RDLVL_GTDL/WRLVL.

> I would like to ask the customer what the VALIDATION TOOL returns as values

> for the Gate Training Tool for both a passing board, and for one of their failing boards.

> When running the test on the failing board, use the same register settings (including calibration values)

> as used on the passing boards.

Please find attached xls.  This result was captured from the board where DDRv tool worked well. I also put initial value of DDRC into this xls sheet which was used as initial values for DDRv tool.

In case of the non-working board, DDRv tool showed "all fail" for read/write/gate training. So, I'm not attaching an XLS sheet for the non-working case.  But according to the customer, even though DDRv tool showed "all fail" on that board, when they applied register values taken from the board where DDRv worked correctly, the board could bring up correctly. So the customer wants to know the reason why DDRv tool failed on that board. I think they should try the following idea to see if the reason for the failure is related to gate timing or not.

> Then, if they want to try an experiment, have them subtract '1' from their original value of

> RDLAT_ADJ (DDRMC_CR132[5:0]). This will cause the data pads to configure to inputs one clock cycle earlier. .....

As you can see in attached xls, write/read/read gate are correctly computed in this case. (NXP Japan team told us that we can use fixed value for RD_DL_SET=4 and GATE_CFG=0 regardless of the result of DDRv).

NXP Japan team told us below how to select optimum number from this result.

- DLL_WRITE_DL => Center value should be taken as optimum number.

- RDLVL_DL=> Center value should be taken as optimum number.

- RDLVL_GTDL => If DDRv tool could pass in between 0 ~ 64, zero should be taken as optimum number.

                                If DDRv tool could pass in 64 or higher number, (Max - 64) should be taken.
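The selection rules above can be sketched as follows. This is only an illustration: the function names are made up, "center value" is assumed to mean the midpoint between the lowest and highest passing settings of one contiguous passing window, and the example windows other than the 0/4 gate result are hypothetical.

```python
def center_value(passing: list[int]) -> int:
    """Midpoint between the lowest and highest passing settings."""
    return (min(passing) + max(passing)) // 2


def gate_delay(passing: list[int], half_cycle: int = 64) -> int:
    """RDLVL_GTDL rule: 0 if the window ends below 64, else (Max - 64)."""
    # Both branches of the rule give 0 when Max is exactly 64,
    # so the rule reduces to max(Max - 64, 0).
    return max(max(passing) - half_cycle, 0)


# Gate training window reported for the customer board: only 0 and 4 passed.
print(gate_delay([0, 4]))
# A hypothetical wider window of 0..132 in steps of 4.
print(gate_delay(list(range(0, 133, 4))))
# A hypothetical DLL_WRITE_DL window of 20..100.
print(center_value(list(range(20, 101))))
```

With these inputs the gate rule yields 0 for the narrow window and 68 for the wide one, and the hypothetical write window centers at 60.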

Finally, I understand that it is difficult to get the actual source code of DDRv tool running on Processor Expert.

But if possible, could you advise me the following point to understand how DDRv works.

- What data patterns and how many words are written in write/read/gate calibration?

  I think DDRv writes marching patterns like "0xAAAA", "0x5555" etc...

- Calibration result is used for the next calibration process?

  In my understanding, DDRv tool runs the calibration process in the following order.

  1st phase: Write timing calibration.

  2nd phase: Read timing calibration.

  3rd phase: Gate timing calibration.

I think that  DDRv tool can know the optimum value for "DLL_WRITE_DL" after completion of write timing calibration.

This number is automatically selected when read timing/gate timing calibration is performed?

Or, read / gate calibration still uses our initial value for "DLL_WRITE_DL" during read/gate calibration process?

Thanks,

Norihiro Michigami

AVNET

TheAdmiral
NXP Employee

Hi Norihiro-san,

Sorry for being a little late on this. I was on PTO.

I reviewed the attached DDRv results. As I suspected, the range of allowed RDLVL_GTDL values is very small. I really think, for now, that the failing boards are due to timing problems with the Read Enable signal, which is preventing the Read Pads from turning on soon enough. That is why I recommended the experiment to change the RDLAT_ADJ value. It may also be necessary to increase the value of PHY_RDLAT (DDRMC_CR126 bits [13:8]) by one to compensate for the fact that the PHY pads are being enabled one cycle earlier. This is my best guess to solve the customer issue at this point.

> As you can see in attached xls, write/read/read gate are correctly computed in this case. (NXP Japan team told us that we can use fixed value for RD_DL_SET=4 and GATE_CFG=0 regardless of the result of DDRv).
>
> NXP Japan team told us below how to select optimum number from this result.
>
> - DLL_WRITE_DL => Center value should be taken as optimum number.
> - RDLVL_DL => Center value should be taken as optimum number.
> - RDLVL_GTDL => If DDRv tool could pass in between 0 ~ 64, zero should be taken as optimum number. If DDRv tool could pass in 64 or higher number, (Max - 64) should be taken.

I agree completely with these recommendations. To answer
your other questions:

  • What data patterns and how many words are written in write/read/gate
    calibration?

The Test patterns are defined in the Validation tab of
Processor Expert. I am trying to get an update to my license to run Processor
Expert, so I can’t specifically describe it exactly, but it is in one of the
tabs in the lower right of the Validation tab. Those values are read into the
test script:

                set $start_addr = <<{START_ADDRESS}>>

                set $pattern = {<<{PATTERN}>>}

                set $size = <<{SIZE}>>

This script can be found at: C:\Eclipse-4.2\eclipse\Optimization\resources\Vybrid\DDR\templates\scripts

The file name is Read Write Compare.txt

  • Calibration result is used for the next calibration process?

That is correct. Processor Expert will determine what it
thinks is the best value and set the calibration register to that value and
move on to the next test. If Processor Expert does not finish the test, the
value in the register will be left to the value of the last trial (which is
often too high to make the next tests work).

The order of the tests is as they appear in the check list.
I would recommend running them one at a time, starting with Gate training, then
Read Timing and finally Write Timing (reverse order).

  • This number is automatically selected when read timing/gate timing
    calibration is performed?

Yes, that is correct.

Cheers,

Mark

TheAdmiral
NXP Employee

Hi Norihiro-san,

Jiri will be leaving NXP very soon, so I will take over responsibility for answering your questions.

First, I want to make sure that we correctly define and understand the different timing parameters, because the IP we are using in Vybrid uses confusing terms:

  • Write Leveling (WRLVL_DL_0/1) refers to adjusting the timing between the Write DQS strobe signal and the SDCLK signals, so that the edges align.
  • DLL Write delay (DLL_WRITE_DL) refers to adjusting the DQS strobe in relation to the DQ signals so that the strobe edge is centered in the window of valid write data.
  • Read Leveling (RDLVL_DL_0/1) refers to adjusting the DQS strobe in relation to the DQ signals so that the strobe edge is
    centered in the window of valid read data.
  • Read Gate Delay (RDLVL_GTDL_0/1) refers to the delay the PHY uses to un-gate the Read DQS strobe pad from the time that the PHY enables the pad to input the strobe signal.

To help understand this last point, I am attaching an explanation I recently wrote that discusses Write and Read Latency timings and the settings
used both for the DDR devices and for the PHY.

Second, the Vybrid DDR controller only supports training using a software mode controlled by the Memory Controller (Hardware Training is
not supported by the PHY). NXP does not have any SW to perform Write Leveling because we do not expect customers to use memory in a Fly-By Topology. The
total memory address space supported by Vybrid is easily accommodated by one or two DDR devices. Fly-By Topology is typically used for x64 and x128 bus widths.
Nevertheless, if the customer really wants to configure DDR3 in a Fly-By Topology, they can manually enter delay values into the WRLVL_DL_0/1 fields based on trace length differences from their layout. Mismatches up to 25% of tCK (clock period) are allowed, so the value in the field doesn’t have to be very accurate.
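To put rough numbers on the 25% of tCK allowance, here is a small sketch. It assumes a 400 MHz clock, and it assumes that a WRLVL_DL step is 1/128 of a clock period, which is the delay-element figure quoted elsewhere in this thread for gate training and is not confirmed here for WRLVL_DL. The function name and the skew inputs are hypothetical; the skew in picoseconds would have to come from the board layout.

```python
def wrlvl_skew_check(skew_ps: float, clock_mhz: float = 400.0,
                     steps_per_clock: int = 128) -> tuple[bool, int]:
    """Return (skew within 25% of tCK?, skew expressed in delay steps)."""
    tck_ps = 1_000_000 / clock_mhz            # clock period in ps
    within = abs(skew_ps) <= 0.25 * tck_ps    # 25% of tCK allowance
    # ASSUMPTION: one delay step = tCK / steps_per_clock
    steps = round(skew_ps * steps_per_clock / tck_ps)
    return within, steps


print(wrlvl_skew_check(200.0))   # hypothetical 200 ps DQS-to-clock skew
print(wrlvl_skew_check(700.0))   # hypothetical 700 ps skew
```

At 400 MHz the period is 2500 ps, so the allowance is 625 ps; the first hypothetical skew is inside it even without compensation, while the second is not.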

Next, for question #1: Yes, you are correct. The description in Software Gate Training in MC Evaluation Mode does contain a typo. It should
read EN_HALF_CAS and not SW_HALF_CYCLE_SHIFT.

Question #2: You are correct that table 10-27 is not referenced in the procedure. I am going to submit a manual change to reword
Step 14 of the procedure:

  14. Read the responses from the SWLVL_RESP_X bits. The meaning of the response is based on the previous response and how the delay was
    changed for this iteration. Table 10-27 below can be used to help the user determine the next actions to be used in Step 15.

Also, I see a similar problem in Write Leveling and Gate Training, and will update those procedures as well.

Question #3: The edge we should set for the read leveling is rising edge of DQS? – The edge you set in field RDLVL_EDGE will determine which
data pattern you should expect back from a read.

To use the SW Read Level method, you need to set the DDR devices to output a pre-defined data pattern. That is done in step 5 of the
procedure in the Reference Manual: “An MPR write will enable read leveling for the memory…” What this step actually means
is that the software routine needs to do a mode register write (MRS) to MR3 of the DDR device. Setting MR3 to 0x0004 forces the DDR3 device to issue a predefined bit
pattern in response to any subsequent read command until MR3 is programmed back to 0x0000. In the predefined pattern, all Read data bits will be set to ‘0 for
the rising edge of the DQS strobe and will be set to ‘1 for the falling edge of the DQS strobe.

So, by setting RDLVL_EDGE = 0, the SW should expect to see 0x00 in the SWLVL_RESP_X field. Any bit that shows up as ‘1 means that the
strobe was either too early or too late to correctly read this DQ trace. The opposite is true for RDLVL_EDGE = 1.
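The check described above can be sketched as a comparison of the 8-bit response against the pattern expected for the selected edge. The function name is hypothetical; the expected values (0x00 for RDLVL_EDGE = 0, all ones for RDLVL_EDGE = 1) follow the description in this post.

```python
def failing_dq_bits(resp: int, rdlvl_edge: int) -> list[int]:
    """Return the DQ bit positions of an 8-bit SWLVL_RESP_X value that mismatch."""
    # Rising edge (RDLVL_EDGE = 0) expects all zeros, falling edge all ones.
    expected = 0x00 if rdlvl_edge == 0 else 0xFF
    diff = (resp ^ expected) & 0xFF
    return [bit for bit in range(8) if diff & (1 << bit)]


print(failing_dq_bits(0x00, 0))   # every bit as expected
print(failing_dq_bits(0x81, 0))   # DQ0 and DQ7 sampled incorrectly
print(failing_dq_bits(0xFE, 1))   # DQ0 sampled incorrectly on the falling edge
```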

Question #4: Should the software adjust values in SWLVL_RESP_X to RDLVL_DL_X?

As explained above, the values reported in SWLVL_RESP_X represent the data bits strobed into the READ FIFO as a result of the delayed
DQS strobe edge, and should match the predetermined value as selected by RDLVL_EDGE. To find the correct value for RDLVL_DL_X, first find the lowest
value of RDLVL_DL_X which results in SWLVL_RESP_X returning all 8 bits correctly. Then find the highest value of RDLVL_DL_X which results in
SWLVL_RESP_X returning all 8 bits correctly. Average the two values together (Step 16). This is the final value that should be programmed into RDLVL_DL_X. [Note:
Step 16 has a typo when stating the field.]
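A minimal sketch of that search, assuming the sweep results have already been collected as a mapping from programmed RDLVL_DL_X value to the SWLVL_RESP_X byte read back, and assuming the passing settings form one contiguous window. The function name is hypothetical, and the example data only mirrors the shape of the incrementing sweep posted in this thread.

```python
def read_level_delay(sweep: dict[int, int], expected: int = 0x00) -> int:
    """Average of the lowest and highest RDLVL_DL_X values that pass."""
    passing = [delay for delay, resp in sweep.items() if resp == expected]
    if not passing:
        raise ValueError("no delay setting returned the expected pattern")
    return (min(passing) + max(passing)) // 2


# Shape of the incrementing sweep posted in this thread (RDLVL_EDGE = 0):
# delay 0 -> 0xFF, delays 1..64 -> 0x00, delays 65..89 -> 0xFF.
sweep = {d: (0x00 if 1 <= d <= 64 else 0xFF) for d in range(90)}
print(read_level_delay(sweep))
```

For that data the lowest passing value is 1 and the highest is 64, so the sketch returns 32 (integer average, rounded down).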

Question #5: Do we have any reference code for read/write leveling and gate training?

No, sorry we don’t. The best I can do is tell you to use the Processor Expert Tool for Vybrid (VYBRID DDR VALIDATION TOOL), which determines
calibration values using different methods, and not the ones written in the Reference Manual. (I am a HW engineer. I really can’t tell you how to write the
code.)

Case #2: Point-to-Point Topology

DLL_WRITE_DL, RDLVL_DL and RDLVL_GTDL are still necessary for Point-to-Point topology. Write Leveling calibration is not necessary and is
a waste of effort. The SW routine to find these three calibration values should be the same as the Fly-By Topology case. Nevertheless, I still recommend that
you use “VYBRID DDR VALIDATION TOOL”.

Hopefully my write up to this point fully answers all of your original questions. If not, or if you have other questions, please let me know.

Now to respond to your print out results:

Your first case looks exactly as expected.

Your second case (decreasing RDLVL_DL_X by 1) does not make sense. It would help if the RDLVL_DL0/1 values printed out above the responses
were the actual programmed value of RDLVL_DL_X. (I would expect that the value started at 89 and decreased down to 00, one at a time.) There is no reason to
expect that the second case should not look exactly like the first case, except in reverse order, as long as all the other parameters were not changed.

In particular, I would expect the starting point of RDLVL_DL_X to be 89, and not 65535 (0xFFFF) or 255 (0xFF). Also, please make
sure that you are using RDLVL_EDGE=0. I am not convinced that, if using the falling edge, the DQ traces would return to ‘0 value after the last word is received.

If the results for Case #2 are not resolved, could you provide the portion of code that performs the RDLVL_DL_X value changes?

Cheers,

Mark

norihiromichiga
Senior Contributor I

Hello Mark-san,

I'm sorry to hear that Jiri-san will be leaving NXP. I really appreciate his support for Vybrid device.

And thank you for your detailed answer and providing us your document.  I will review your document carefully.

Your answer is just what we wanted to know when we opened this thread. Thanks.

>If the results for Case #2 are not resolved, could you provide the portion of code that performs the RDLVL_DL_X value changes?

As for the test result of our SW read leveling code, we understand that we should see the same result when we decrease the delay during the read leveling process. We will investigate it again to make sure that we used the same parameters as Case #1 (the working case).

If we still have a problem, I will upload our software which does read leveling.

And I understand that SW leveling should be used for fly-by topology, but theoretically, I think SW read leveling gives us the same timing parameter (delay value) as VALIDATION TOOL.

In parallel, based on your suggestion, we will retry “VYBRID DDR VALIDATION TOOL”. In fact, our customer already tried it on their boards, but VALIDATION TOOL couldn't find the point of change of data on some of their boards. (VALIDATION TOOL showed a reasonable delay value for most of their boards, so VALIDATION TOOL itself seems to be working.)

> Processor Expert Tool for Vybrid (VYBRID DDR VALIDATION TOOL),

> which determines calibration values using different methods,

Yes, I know VALIDATION TOOL doesn't perform SW leveling. Instead, it seems to use a standard memory test approach.  (Write a specific pattern, then read it back and compare the read value with the original value.)

Our customer is asking us to provide the actual code (or detail of procedure) which is running inside of VALIDATION TOOL to calibrate the read/write/gate timing, so that they can run the same process from their firmware. Their idea is so-called "DDR training sequence" at boot-up time. Could you try to discuss with tool development team to get such information?

Thanks,

Norihiro Michigami

AVNET

karina_valencia
NXP Apps Support

TheAdmiral can you help here?

norihiromichiga
Senior Contributor I

Hello Jiri-san,

Could you take a look at my previous comment? I put two write leveling test result as you can see.

First one was leveling with incremented delay

    => I want to know your comment if our test result matches NXP's expectation.

Second one was leveling with decremented delay

    => We couldn't find the point of change of read data. Do you know if there is any known issue with decremented delay?

If you say that the test result of incremented delay matches NXP's expectation and increased delay approach is enough to find the point of change in the read data, we will recommend our customer to run our program on their system to set correct DQS.

Thanks,

Norihiro Michigami

AVNET

karina_valencia
NXP Apps Support

jiri-b36968​ can you continue with the follow up?

norihiromichiga
Senior Contributor I

Hello Jiri-san,

I'm changing my latest post to reflect our latest status again. Sorry for my frequent update.

Our engineer applied NXP's latest recommended settings to the DDRC, and they ran read leveling on the TWR board again.

Here is result of slice 0 and slice 1 with both increased and decreased delay value.

As you can see, in case of increment, result is good.

We can see the changing point of DQ after 64 (decimal) taps. This corresponds to half of the data period.

But in case of decrement, the result is always 0xFF.

It seems that DQ in read leveling mode is not correctly sampled by Vybrid. Please review this result.

===========

Result: (Delay value is increased by 1.)

RDLVL_DL0/1: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

RESP_0:     FF 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

RESP_1:     FF 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

RDLVL_DL0/1: 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

RESP_0:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

RESP_1:     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

RDLVL_DL0/1: 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89

RESP_0:     00 00 00 00 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

RESP_1:     00 00 00 00 00 FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

slice0:Rising_edge = 00 ,Falling_edge = 64 slice1:Rising_edge = 00 ,Falling_edge = 64

===========

Result:(Delay value is decreased by 1.)

RDLVL_DL0/1: 00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29

RESP_0:     FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

RESP_1:     FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

RDLVL_DL0/1: 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59

RESP_0:     FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

RESP_1:     FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

RDLVL_DL0/1: 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89

RESP_0:     FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

RESP_1:     FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF FF

slice0:Rising_edge = 00 ,Falling_edge = 00 slice1:Rising_edge = 00 ,Falling_edge = 00

=========
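
One way to locate the change points in a sweep like the ones above is to scan the responses for adjacent entries that differ. The following is a minimal sketch in Python, not NXP code: the function name `find_transitions` and the list layout are assumptions made for illustration, and the sample data is copied from the two result tables above.

```python
def find_transitions(responses):
    """Return (delay, before, after) for each point where the response changes.

    responses[i] is the response read back with the delay set to i.
    The reported delay is the last delay value before the change.
    """
    transitions = []
    for delay in range(len(responses) - 1):
        before, after = responses[delay], responses[delay + 1]
        if before != after:
            transitions.append((delay, before, after))
    return transitions


# Incrementing sweep from the post: FF at delay 0, 00 for 1..64, FF for 65..89.
incrementing = [0xFF] + [0x00] * 64 + [0xFF] * 25
# Decrementing sweep from the post: FF for all 90 delay values.
decrementing = [0xFF] * 90

print(find_transitions(incrementing))
print(find_transitions(decrementing))
```

On the posted data this reports a change after delay 0 and another after delay 64 for the incrementing sweep, and no change at all for the decrementing sweep.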

Thanks,

Norihiro Michigami
AVNET

jiri-b36968
NXP Employee

Hello,

Leveling is needed for fly-by topology only. Its purpose is to match DQS+DQ to ADD+CTRL+CLK on the different data groups.

Whether it is needed depends on which topology you use.

On the Vybrid reference boards we have only one memory and both data slices match, so leveling is not needed. We use the same value for both data slices.

What is needed is to tune the DQS strobes to a delay of 1/4 of the period relative to DQ:

1. Read path CR105, CR110

2. Write PHY04, PHY20

Then the gates and OE can also be tuned.

We do not have any code to tune them. The only tool that was available was DDRv (licensed, Eclipse-based). It works, but it is not in perfect shape.

/Jiri

norihiromichiga
Senior Contributor I

Hello Jiri-san,

Thank you for your reply.

I understand that so-called read/write leveling between different DQS groups is not needed, because our customer's board has a single DDR device (16-bit width) connected to Vybrid with controlled trace lengths. (Slice 0 and slice 1 can use the same delay setting.)

In fact, the problem (corrupted read data) was seen on 4 specific boards out of 40; the other 36 boards did not show it. When they manually modified the read delay on a non-working board, the read data corruption improved, but that setting did not work on the working boards. So I think the problem may be caused by variation in the PCB or in the characteristics of Vybrid. Adjusting the write timing (DLL) does not help.

Currently, our customer is running the DDR verification tools from NXP on their 40 boards to visualize the read timing.

In any case, I think their firmware must adjust at least the read timing and the gate timing on each board before starting operation, so that the read DQS is centered in the DQ signal.
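
To make the idea of a per-board adjustment concrete, here is a rough sketch of the sweep such firmware would run, written in Python for readability. Everything in it is an assumption for illustration: `sweep_read_delay`, `set_delay`, `read_response`, and `FakeBoard` are hypothetical names, the fake board only imitates a response window, and none of this is an NXP API or the procedure from the RM.

```python
def sweep_read_delay(set_delay, read_response, max_delay):
    """Step the delay from 0 to max_delay and collect the response at each step.

    set_delay and read_response are hypothetical callables standing in for
    whatever register accesses the firmware uses on the real board.
    """
    responses = []
    for delay in range(max_delay + 1):
        set_delay(delay)
        responses.append(read_response())
    return responses


class FakeBoard:
    """Stand-in for real hardware: responds 0x00 inside a window, 0xFF outside."""

    def __init__(self, window_start, window_end):
        self.window_start = window_start
        self.window_end = window_end
        self.delay = 0

    def set_delay(self, delay):
        self.delay = delay

    def read_response(self):
        inside = self.window_start <= self.delay <= self.window_end
        return 0x00 if inside else 0xFF


# Simulated board whose window (delays 1..64) is chosen arbitrarily.
board = FakeBoard(window_start=1, window_end=64)
responses = sweep_read_delay(board.set_delay, board.read_response, max_delay=89)
print(len(responses))
```

Passing the accessors in as callables keeps the sweep logic separate from the hardware access, so the same loop can be exercised against a simulated board before it is run on each of the 40 boards.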

Please let me ask you some questions again.

1. Hardware read leveling and hardware write leveling are not supported by the MC on Vybrid, correct? We ask because we found the following descriptions in the RM.

  >Hardware read leveling is not implemented in this version of the PHY.

  >Enables the hardware write leveling features in the controller.
  >For DDR3:
  >• 0 - Hardware Write Leveling Disabled
  >• 1 - Reserved
  >The PHY does not support this feature.

If the above understanding is true, we must implement read/write leveling and gate training in our firmware per the description in the RM.

2. Gate timing adjustment

"10.1.6.16.3.1 Software Gate Training in MC Evaluation Mode" says that PHY02[SW_HALF_CYCLE_SHIFT], PHY18[SW_HALF_CYCLE_SHIFT] are used for gate training. But "SW_HALF_CYCLE_SHIFT" is not correct and it should be read as "EN_HALF_CAS", correct?

2. Read timing adjustment

We understand that we should follow the procedure described in "10.1.6.16.4.1 Software Read Leveling in MC Evaluation Mode".

>16. The ideal delay value for this system will be the midpoint between the
>rising and falling edge transition points. The midpoint values should be
>calculated and stored in the RDLVL_DLL_X bits for each data slice X.

RDLVL_DLL_X means RDLVL_DL_X, correct?
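
For reference, the midpoint calculation in step 16 can be sketched as follows. This is only an illustration in Python, assuming the two transition points are available as tap counts; the function name `midpoint_delay` is hypothetical, and the example values 0 and 64 are taken from the TWR board sweep posted in this thread.

```python
def midpoint_delay(rising_edge, falling_edge):
    """Return the delay halfway between two transition points, in taps."""
    low, high = sorted((rising_edge, falling_edge))
    return low + (high - low) // 2


# Transition points reported for slice 0 and slice 1 in the incrementing sweep.
print(midpoint_delay(0, 64))
```

For transition points at 0 and 64 taps the midpoint is 32 taps. If 64 taps correspond to half of the data period, 32 taps is a quarter of the period.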

Thanks,

Norihiro Michigami

AVNET

jiri-b36968
NXP Employee

Hello Norihiro-san,

R1. HW leveling is implemented in the IP but was never tested. The SW method was used for tuning.

R2. Right. SW_HALF_CYCLE_SHIFT is used for write delay training and EN_HALF_CAS is used for gate training.

R22. Right. It is the delay (DL) in the delay line (DLL). It is just a mistake in the label.

/Jiri
