Vybrid LPDDR2 200MHz

tfe
Contributor V

We have a custom board with a Vybrid controller and LPDDR2 memory (IS43LD16640A).

We recently managed to get it working with the DRAM clock running at 400 MHz (WL=3, RL=6), but due to power concerns, we need to clock it down to 200 MHz (WL=2, RL=4). Both the MCU and DRAM run from the same clock (PLL1pfd3), so the resulting MCU frequency will be 198 MHz. Both setups use burst length (BL) 4.

Assuming this would be an easy task, we simply updated the DRAM controller timings and hoped for the best. Not surprisingly, this did not work.

Using a JLINK-debugger, we loaded the DRAM using a functional setup (running at 400 MHz), and then immediately reset the MCU and initialized the DRAM controller with the experimental setup (200 MHz). From this we were able to read back the pattern written, indicating that nothing is wrong with the read-operation.

Next we wrote an anti-pattern to the same region we just read. When reading it back, we observed that the first two bytes of every other burst were wrong.

We have previously observed that a 32bit write produces the following result:

  write       -> readback
  0x8000_0000 -> 0x0000_aaaa
  0x4000_0000 -> 0x0000_aaaa
  0x2000_0000 -> 0x0000_aaaa
  0x1000_0000 -> 0x0000_aaaa
  [...]
  0x0008_0000 -> 0x0000_aaaa
  0x0004_0000 -> 0x0000_aaaa
  0x0002_0000 -> 0x0000_aaaa
  0x0001_0000 -> 0x0000_aaaa
  0x0000_8000 -> 0x8000_aaaa
  0x0000_4000 -> 0x4000_aaaa
  0x0000_2000 -> 0x2000_aaaa
  0x0000_1000 -> 0x1000_aaaa
  [...]
  0x0000_0008 -> 0x0008_aaaa
  0x0000_0004 -> 0x0004_aaaa
  0x0000_0002 -> 0x0002_aaaa
  0x0000_0001 -> 0x0001_aaaa

It seems that the missing two bytes "flow over" to the next 32-bit word:

(gdb) x/16b addr
0x80000000: 0x00 0x00 0x55 0x55 0x55 0x55 0x00 0x80
0x80000008: 0xaa 0xaa 0xaa 0xaa 0xaa 0xaa 0x00 0x55
(gdb) p/x *addr
$59 = 0x55550000
(gdb) set *addr = 0xaabbccdd
(gdb) p/x *addr
$60 = 0xccdd0000
(gdb) x/16b addr
0x80000000: 0x00 0x00 0xdd 0xcc 0xbb 0xaa 0x00 0x80
0x80000008: 0xaa 0xaa 0xaa 0xaa 0xaa 0xaa 0x00 0x55
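The gdb session above is consistent with a simple model in which a 32-bit write lands two bytes late and leaves the first two bytes of the target word untouched. The Python sketch below only illustrates that reading of the dump; it does not describe what the controller does internally. The helper names and the fixed 2-byte shift are assumptions drawn from the data shown.

```python
import struct

SHIFT = 2  # assumed displacement in bytes, inferred from the dump above

def write32_shifted(mem: bytearray, offset: int, value: int) -> None:
    """Model a little-endian 32-bit write that lands SHIFT bytes late."""
    start = offset + SHIFT
    mem[start:start + 4] = struct.pack("<I", value)

def read32(mem: bytearray, offset: int) -> int:
    """Plain little-endian 32-bit read (reads were observed to work)."""
    return struct.unpack_from("<I", mem, offset)[0]

# Memory contents before the write, copied from the first x/16b dump.
mem = bytearray.fromhex("0000555555550080aaaaaaaaaaaa0055")
before = read32(mem, 0)
write32_shifted(mem, 0, 0xAABBCCDD)
after = read32(mem, 0)
print(hex(before), hex(after), mem[:8].hex(" "))
```

Under this model the word at the written address reads back as 0xccdd0000 and the upper half of the value spills into the next word, which matches the second dump.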

We are therefore wondering: What parameters do we specifically need to change when changing the DRAM clock from 400MHz to 200 MHz?

Following is a diff between the functional 400MHz setup, and the experimental 200 MHz setup:

[Diff] U-Boot DRAM diff - Pastebin.com

tfe
Contributor V

Hi Mark,

Once again, thank you for your detailed response. I will have one of the hardware engineers look at the clock jitter when he comes back from vacation next week.

Following is the state of the board when "all columns are filled" (referenced above)

[GDB] Vybrid LPDDR2@200MHz All columns filled - Pastebin.com

[ARM] Vybrid LPDDR2@200MHz DS5-debug init-script, all columns fill - Pastebin.com

We tried setting GATE_CLOSE_CFG=0x3 and GATE_CFG=0x0 as you asked, without any improvements:

[GDB] Vybrid LPDDR2@200MHz GATE_CFG=0, GATE_CLOSE_CFG=3 - Pastebin.com

As you can see from the ARM-paste above, the PHY49[7:0]-field is no longer being set. This has also been confirmed with our debug probe (JLink Plus)

Addressing the RL/WL timings mentioned: following is a cut-out from our DRAM chip datasheet, to clear up any confusion:

IS43_RL_WL.png

/Tom

TheAdmiral
NXP Employee

Hi Tom,

You're operating your AXI bus at 100 MHz???

CR117 = 0x00020101  <<= Bits [17:16] are setting the frequency ratio at MMDC control to x2 that of the AXI0 bus.

CR119 = 0x00000002  <<= Bits [1:0] are setting the frequency ratio at MMDC control to x2 that of the AXI1 bus.

I just noticed this. Previously I glossed over the AXI bus settings as somewhat unimportant to stress testing, but then I saw that you actually set the bus timings to something other than Asynchronous.  This could be the very thing causing you problems. Please set:

CR117 = 0x00000101

CR119 = 0x00000000

Also please set GATE_CFG = 0 in PHY02, PHY18 and PHY34. Setting it to any other value will cause the Read FIFO to miss the first strobe signals of the returning read data.

Please set PHY02/PHY18/PHY34 = 0x002900180

BTW, I just increased RD_DL_SET by 1, which seems to improve things that are marginal.

BTW, you are setting the IOMUX settings for the DQS strobes to have a 100k Ohm pull down, correct?

0x400482C8 = 0x000101CC (or 0x0001018C if you prefer)

0x400482C4 = 0x000101CC

Where [5:4] = 0x0 sets pull down 100k

[3] = 1 enables Pull/Keeper

[2] = 1 sets Pull enabled

I did mention this, but can't check it from your register dump.
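As a quick check of the two pad values quoted above, the pull-related bits can be extracted as follows. This is a minimal sketch: the field names are made up for illustration, and it assumes the pull-select field occupies bits [5:4] (both quoted values have bits 3 and 2 set, so a zero field cannot span [5:0]).

```python
def decode_pad(value: int) -> dict:
    """Extract the pull-related fields named in the post from a pad register value."""
    return {
        "pull_select": (value >> 4) & 0x3,         # assumed bits [5:4]; 0 = 100k pull-down per the post
        "pull_keeper_enable": (value >> 3) & 0x1,  # bit [3]
        "pull_enable": (value >> 2) & 0x1,         # bit [2]
    }

for value in (0x000101CC, 0x0001018C):
    print(hex(value), decode_pad(value))
```

Both values decode to pull select 0 with the pull/keeper and pull bits set.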

And I think your calculation of TDAL is off. Based on your settings for CR14 and CR158, I think you should make the following settings:

CR22 = 0x00090000

Other minor things I recommend setting:

CR138 = 0x00000100

CR87 = 0x00000000

CR09 = 0x01000000

I'm thinking that the AXI bus timing change and the GATE_CFG change will solve your problems.

They should at least give you different results which will then point to other things that could be changed.

Cheers,

Mark

tfe
Contributor V

Hi Mark,

We made the changes you listed to the AXI-bus, and unfortunately it did not fix our problem:

[GDB] Vybrid LPDDR2@200MHz AXI<N>_FITYPREG=0 - Pastebin.com

[ARM] Vybrid LPDDR2@200MHz DRAM config AXI<N>_FITYPREG=0 - Pastebin.com

Notice that GATE_CFG=0 as well.

We also tried increasing RD_DL_SET by 1 and FITYPREG=0:

[GDB] Vybrid LPDDR2@200MHz AXI<N>_FITYPREG=0, RD_DL_SET=5 - Pastebin.com

[ARM] Vybrid LPDDR2@200MHz DRAM config FITYPREG=0 RD_DL_SET=5 - Pastebin.com

We also used your value for TDAL:

[GDB] Vybrid LPDDR2@200MHz TDAL=9 - Pastebin.com

[ARM] Vybrid LPDDR2@200MHz DRAM config TDAL=9 - Pastebin.com

However, we should note that we do not believe TDAL=7 to be wrong. From our datasheet:

TRPpb = 18 ns (TYP)

TWR = 15 ns

TCK = 5 ns

Thus RU{ (18 + 15)/5 } = RU{ 33/5 } = RU{ 6.6 } = 7
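The round-up above can be reproduced with integer arithmetic, which avoids floating-point surprises near exact multiples of the clock period. This is only a sketch of the calculation as written in this post; the function name is hypothetical and the timing values are the ones quoted from the datasheet above.

```python
def cycles_round_up(duration_ps: int, clock_period_ps: int) -> int:
    """Convert a duration to clock cycles, rounding up (RU{...})."""
    return -(-duration_ps // clock_period_ps)

T_RPPB_PS = 18_000  # tRPpb = 18 ns
T_WR_PS = 15_000    # tWR = 15 ns
T_CK_PS = 5_000     # tCK = 5 ns at 200 MHz

tdal = cycles_round_up(T_RPPB_PS + T_WR_PS, T_CK_PS)
print(tdal)
```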

The state with all your suggested changes:

[GDB] Vybrid LPDDR2@200MHz NO_MRR - Pastebin.com

[ARM] Vybrid LPDDR2@200MHz DRAM config NO_MRR - Pastebin.com

From the [GDB]-links above, we can see that the first two columns are still not being filled. We also see that the 0x34-byte disappeared when setting NO_MRR in CR09.

/Tom

TheAdmiral
NXP Employee

Hi Tom,

I have to admit that your board is stumping me a bit. I'm pretty close to exhausting everything I can think of.

Am I correct in believing that your board worked at a frequency of 400 MHz, and the trouble you are having is with 200 MHz only?

I think I picked that up from the first post. Please let me know if I am wrong.

Also, from the MR3 settings, you are using a 40 Ohm drive strength on the DDR. Did we ever discuss what the drive strength of the processor pins is set to? Is it also 40 Ohm? I know that my examples above have all been using a drive strength of 34 Ohm. Have you tried setting all drive strengths to 34 Ohm to see if that helps anything?

BTW, I assumed that you set the Memory Controller clock to 200 MHz before you try to initialize the MC.

I may have two other ideas:

LPDDR2 does not use a gate signal to un-gate the read data (un-gating in the middle of the preamble and gating back on after 3 clock cycles). Even so, it may be that your particular board needs to have the incoming read pads enabled a cycle earlier. I am thinking this because it is the first two bytes of each line that seem corrupted. This actually suggests an experiment you might try:

After setting 128 bytes of data starting from 0x80000000 and reading it back, try just reading it back starting from 0x80000004 and see if you get different values. This might show us if the problem is with the Writes or with the Reads.

If it is with the Read, then setting CR132 to 0x00000203 might be the answer, which would set up the pads one clock cycle early. This does not affect the end timing of the reads, since, for LPDDR2, the PHY will clock in exactly 4 strobes on a 4 byte burst length, and then will ignore everything after that (which is why setting GATE_CLOSE_CFG to maximum works).

Idea #2,

Bit 4 of the PHY50 register is a Read FIFO clear bit. Have you tried writing to bit 4, either after the initialization process, or while you are doing your testing? Does this have any effect?

Cheers,

Mark

tfe
Contributor V

Hi Mark,

You are correct. The board in question does not have any issues when operating at 400 MHz. We have indeed changed the appropriate clock settings before initializing the DRAM, and verified this by physical measurements.

We set the current drive-strength as a result of a simulation done by one of our hardware-engineers. He suggested setting the DRAM chip drive strength to 40 Ohm, and setting the MCU-side drive-strength to 48 Ohm.

We tried reading back with the offset you suggested (Note: TDAL has been set back to 7):

Reading from a 4-byte offset: [GDB] Vybrid LPDDR2@200MHz start+1 - Pastebin.com

Reading from a 2-byte offset: [GDB] Vybrid LPDDR2@200MHz (u16*) start+1 - Pastebin.com

As you can see, the unwritable columns do follow the offset.

We also tried making the change to CR132 as you suggested:

[GDB] Vybrid LPDDR2@200MHz RDLAT_ADJ=RDLAT-1 - Pastebin.com

We did however notice something interesting when resetting the FIFO using PHY50[4]:

[GDB] Vybrid LPDDR2@200MHz Reset FIFO - Pastebin.com

As you can see, all bytes are read back as 0x00. We tried adding a FIFO reset just after DRAM init, with the same results.

/Tom

TheAdmiral
NXP Employee

Hi Tom,

Interesting, and totally unexpected results for the FIFO reset signal.

You reset the pointers to the READ FIFO and suddenly there is no data on the LPDDR2 memory when there was data there before you reset the pointers. That seems very odd.

So, I do have something else to try:

Turns out that the CTLUPD_AREF field [bit 24] of CR79 is actually supported by the PHY. The IP starts out by saying that the PHY doesn't support it, but then eventually qualifies the statement by saying that the PHY does not support sending back an acknowledgement, but will actually initiate the requested update.

Therefore, I would like you to please try adding a line to the initialization script to set CR79 = 0x01000000.

Please let me know how that works. It hasn't been needed up to now, but I guess there is always a first time.

If that completely breaks the LPDDR2 operation, it may be necessary to program CR127 bits [11:8] with some number. This is the minimum number of clock cycles that the update signal is asserted. This is new ground, so I'm not sure what that number would be. It probably isn't necessary, but it would be the next thing to try.

Cheers,

Mark

tfe
Contributor V

Hi Mark,

Setting CTLUPD_AREF=1 did not seem to make a difference:

[GDB] Vybrid LPDDR2@200MHz CTLUPD_AREF=1 - Pastebin.com

Keeping CTLUPD_AREF=1 and clearing the fifo after DRAM init produced the same results as with CTLUPD_AREF=0

[GDB] Vybrid LPDDR2@200MHz CTLUPD_AREF=1 clear fifo - Pastebin.com

CR127 is labelled as "reserved" in the RM. We'd like some more details on what this field does in order to properly set its value.

/Tom

TheAdmiral
NXP Employee

Hi Tom,

We previously reserved the CTLUPD_AREF field based on the PHY IP, which stated that the PHY did not respond to the signal from the controller. When we did that, we also reserved two fields associated with the CTLUPD signal: one that would return an error if the PHY did not acknowledge after the specified number of clock cycles, and one that would specify the minimum length of the CTLUPD signal.

Obviously the field that returns an error if it does not receive a response from the PHY is no good, since the PHY IP already states that it will not send back any acknowledgement of the signal.

The other field is named TDFI_CTRLUPD_MIN. It carries the value of the minimum allowed pulse width of the CTRLUPD signal sent to the PHY for the update. This is CR127 bits [11:8]. Unfortunately, this field is Read Only. So it will not do any good.

With that, I am out of DDR controller ideas on what could be wrong.

What we know:

- It is a Write issue.

- It does not depend on the position within the four Byte burst (x2).

- It appears to be fixed to memory addresses

Because of this, it doesn't look like any timing issue, or FIFO issue, etc.

Can you please provide me a readout of the following registers when configured for 400 MHz and when configured for 200 MHz?

PHY11

PHY12

PHY27

PHY28

Field [15:8] of PHY11 and PHY27 is the DLL lock value.

Field [23:16] of PHY12 and PHY28 is the calculated number of delay elements for the Write DQS delay

Field [7:0] of PHY12 and PHY28 is the calculated number of delay elements for the Read DQS delay

The numbers for the 200 MHz values should be roughly double the numbers for the 400 MHz values.

Not that this would explain the above "what we know", but this would just verify that the DLLs are not having a problem with 200 MHz.

I am wondering if the issue lies outside the DDR controller registers. The problem could lie:

- Within the ARM cache structure.

- At some interface timing in the NIC and associated interconnect bus.

- With the gdb server. (only because I am unfamiliar with it)

You are loading u-boot on your device to initialize the DDR controller.

Question: Is your cache enabled or disabled?

Do you get different results if you switch the condition of cache (enabled <> disabled)?

As far as I know, you are not running our Processor Expert ddrv tool. Would you be receptive to sending me one of your boards and letting me try some experiments using my tools? I have both the J-Link debugger using a gdb server and the ddrv tool, and an ARM DSTREAM which has some bare metal code that works on Vybrid.

Maybe I can see something different at my desk.

Cheers,

Mark

tfe
Contributor V

Hi Mark,

Once again thank you for your extremely detailed feedback.

We previously stated that we have several boards, some of which work fine at 200 MHz. Thus, we hooked up one of these working boards to our debugger, and updated the DRAM config to reflect the changes you have suggested in this thread. Almost all of the changes that we implemented on the previous board (the non-working one) work on this board, except two:

PHY50: Disabling EN_SW_HALF_CYCLE caused our boot-time memory tests to fail, and the bootloader (U-Boot) failed to relocate to DRAM. From the debugger, we could see that the first four columns of the memory appeared inaccessible.

CR132: Setting RDLAT_ADJ = RDLAT also caused our boot-time memory tests to fail. The debugger suggests symptoms of addressing issues.

The DRAM config currently looks like this:

[GDB] Vybrid LPDDR2@200MHz goodboard ok - Pastebin.com

Following is also our clock setup:

[C] Vybrid LPDDR2@200MHz CCM - Pastebin.com

The cache is disabled. As far as we know, U-Boot has not implemented the memory cache on Vybrid, and we have not taken the time to implement it ourselves.

We will from this point refer to the working board as the "good board" and the non-working one as the "bad board". Following is a quick summary of the two:

     Bad board: The board used in this thread. Boot-time memory tests fail at 200 MHz, but work without issues at 400 MHz.

     Good board: LPDDR2 works at both 200 and 400 MHz. The boot-time memory tests do occasionally fail ( < 10% of the time).

We have compiled some tables for the PHY11/12/27/28 readouts:

PHY11        200 MHz       400 MHz
Bad board    0x00ff_0001   0x00ff_3f01
Good board   0x00ff_0001   0x00ff_3001

PHY12        200 MHz       400 MHz
Bad board    0x0000_0000   0x0012_000f
Good board   0x0000_0000   0x0012_000f

PHY27        200 MHz       400 MHz
Bad board    0x0000_0000   0x00ff_4001
Good board   0x00ff_0001   0x00ff_3001

PHY28        200 MHz       400 MHz
Bad board    0x0000_0000   0x0013_000f
Good board   0x0000_0000   0x0012_000f
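The readouts are easier to compare once the fields are split out. The sketch below uses the layout Mark gave earlier in the thread (DLL lock value in bits [15:8] of PHY11/PHY27, write DQS delay in bits [23:16] and read DQS delay in bits [7:0] of PHY12/PHY28). The function names are hypothetical, and only the PHY11/PHY12 values from the tables above are used.

```python
def dll_lock_value(phy11: int) -> int:
    """Bits [15:8] of PHY11/PHY27: the DLL lock value."""
    return (phy11 >> 8) & 0xFF

def dqs_delays(phy12: int) -> tuple:
    """Bits [23:16] (write DQS delay) and [7:0] (read DQS delay) of PHY12/PHY28."""
    return (phy12 >> 16) & 0xFF, phy12 & 0xFF

# (board, MHz) -> (PHY11, PHY12), copied from the tables above.
readouts = {
    ("bad", 400): (0x00FF3F01, 0x0012000F),
    ("bad", 200): (0x00FF0001, 0x00000000),
    ("good", 400): (0x00FF3001, 0x0012000F),
    ("good", 200): (0x00FF0001, 0x00000000),
}

for (board, mhz), (phy11, phy12) in readouts.items():
    print(board, mhz, dll_lock_value(phy11), dqs_delays(phy12))
```

At 400 MHz both boards report a non-zero lock value and non-zero DQS delays; at 200 MHz every one of these fields decodes to zero.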

We recognize that the values from PHY12/28 at 200 MHz seem odd. What might be the cause of this? Previously, we have had reasonable readouts from these registers, but we are unable to pinpoint what change or changes have led to this state.

Considering the VDDSS_LDOIN and VDD1P1_OUT (DECAP_V11_LDO_OUT) supplies referenced earlier in the thread: Our HW engineer is saying that there are no noticeable differences between the two boards.

About sending the board to you: This is something we have done in the past, when we sent the first revision of our board to Jiri. Hence we are open to it, but we need to confirm this with our project lead. Could you send me your address as a private message, and I will respond as soon as we have talked to him.

/Tom

TheAdmiral
NXP Employee

Hi Tom,

First, I probably owe you an apology for not thinking about this earlier. I was caught up in trying to figure out how the data was being shifted in the memory.

When I saw your register results above, the problem became obvious. The fact that your DLL_LOCK_VALUEs at 200 MHz are all 0x00 indicates that the DLLs think it takes zero delay elements to make up a full clock cycle. Then, when the DLL tries to calculate the number of delay elements for the Write and Read DQS delays, whatever number you put in still comes out as 0 delay. The DQS strobes are not properly delayed and data doesn't get latched in properly. I'm surprised that you actually had a board that worked in this case.

So I talked to the SOC design engineer. There are only 128 delay elements in the DLL. They are roughly 30 picoseconds each, so in full clock mode, the DLL can only handle a period of 3.84 nanoseconds, which equates to a frequency of ~ 260 MHz. Then the SOC engineer pointed out a register field that we had reserved titled PARAM_HALF_CLOCK_MODE, which is used for low frequencies. Apparently I did not realize what low frequency was when I was working on the last revision and reserved this field.

Setting PARAM_HALF_CLOCK_MODE to 1 lets the DLL sync on only a half clock cycle with its 128 delay elements, so clock periods of 7.68 nanoseconds can then be supported (130 MHz). Even lower frequencies can be supported operating the DLL in bypass, which is a different bit that you don't need to concern yourself with.
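The arithmetic behind those limits can be checked with a few lines of Python. This is a rough sketch that takes the "roughly 30 picoseconds" figure at face value, so the resulting frequencies are approximate; the function names are hypothetical.

```python
DELAY_ELEMENTS = 128
ELEMENT_DELAY_PS = 30  # "roughly 30 picoseconds" per the post, so results are approximate

def max_lock_period_ps(half_clock_mode: bool) -> int:
    """Longest clock period the master DLL can cover with all its delay elements."""
    span = DELAY_ELEMENTS * ELEMENT_DELAY_PS
    # In half clock mode the delay line only has to span half a period.
    return span * 2 if half_clock_mode else span

def min_frequency_mhz(half_clock_mode: bool) -> float:
    """Lowest clock frequency at which the DLL can still lock."""
    return 1e6 / max_lock_period_ps(half_clock_mode)

def can_lock(freq_mhz: float, half_clock_mode: bool) -> bool:
    return freq_mhz >= min_frequency_mhz(half_clock_mode)

for mode in (False, True):
    print(mode, max_lock_period_ps(mode), round(min_frequency_mhz(mode), 1))
print(can_lock(400, False), can_lock(200, False), can_lock(200, True))
```

With these numbers, 400 MHz fits in full clock mode, while 200 MHz is below the full clock limit and only fits once half clock mode is enabled.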

So the full solution to your problem is simply to set PHY03/PHY19/PHY35 bit 24 to 'b1.

I think that would make the full register setting = 0x01430115 (note that I cut the DLL starting point in half).
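For anyone scripting the change, setting bit 24 is a plain read-modify-write. The sketch below is illustrative only: the helper name is hypothetical, and the input 0x00430115 is simply the suggested value with bit 24 cleared, not the board's previous register contents (which are not given here).

```python
HALF_CLOCK_MODE_BIT = 1 << 24  # bit 24 of PHY03/PHY19/PHY35, per the post

def with_half_clock_mode(reg_value: int) -> int:
    """Return a 32-bit register value with PARAM_HALF_CLOCK_MODE set."""
    return (reg_value | HALF_CLOCK_MODE_BIT) & 0xFFFFFFFF

suggested = 0x01430115
print(bool(suggested & HALF_CLOCK_MODE_BIT))
print(hex(with_half_clock_mode(0x00430115)))  # illustrative input only
```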

Please give that a try and let me know how it works for you.

Cheers,

Mark

PS

The description of this field will be something like (ie draft)

Determines if the master delay line locks on a full clock cycle or a half clock cycle.

Within the Master DLL there are only 128 delay elements that can be used to determine a lock. For frequencies of operation below 300 MHz, it is necessary to limit the lock period to only a half clock cycle so that the master delay line does not become saturated.

  • For frequencies above 300 MHz, set this bit to 'b0.
  • For frequencies below 300 MHz, set this bit to 'b1.

For both LPDDR2 and DDR3:

  • 'b0 - Master DLL locks on full clock cycle delay
  • 'b1 - Master DLL locks on half clock cycle delay
tfe
Contributor V

Hi Mark

That did it!

The DRAM is now running flawlessly, and our deep tests are unable to find any issues on the board we have available on location.

Thank you ever so much for your help and persistence.

I am pasting the DRAM config for future Googlers:

[ARM] Vybrid LPDDR2@200MHz - Pastebin.com

Also adding a cherry on top: The last line of our memory test:

Memtest execution complete: Success

/Tom

tfe
Contributor V

jiri-b36968, do you have any follow-up?

/Tom
