MX6Q+LPDDR2(32bit) boot issue

raymondwang · ‎06-05-2014

As I mentioned in other threads, we are working on a combined MX6Q + LPDDR2 design, but we can't work out why the mfg boot image fails to boot.

We used Mx6DQSDL LPDDR2 Script Aid V0.04.xlsx to generate the DCD part of the flash header (attached file 6q_lpddr2_32.inc). The Samsung eMCP datasheet is attached, too.

Our DDR config:

Single Channel 32bit

2CS, each cs is 128Mx32 (totally 1GB)
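As a quick sanity check of the density figures above, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and it assumes "128Mx32" means 128 Mi words of 32 bits per chip select.

```c
#include <stdint.h>

/* Bytes behind one chip select: `mwords` Mi-words, each `width_bits` wide. */
uint64_t cs_bytes(uint32_t mwords, uint32_t width_bits)
{
    return (uint64_t)mwords * 1024u * 1024u * (width_bits / 8u);
}

/* Total bytes across `num_cs` identical chip selects. */
uint64_t total_bytes(uint32_t num_cs, uint32_t mwords, uint32_t width_bits)
{
    return (uint64_t)num_cs * cs_bytes(mwords, width_bits);
}
```

With `mwords = 128` and `width_bits = 32`, each chip select comes out at 512 MB, and two of them give the 1 GB total quoted above.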

With the default DSE of 40 ohm (SI configuration in Mx6DQSDL LPDDR2 Script Aid V0.04.xlsx), the 400MHz DDR stress test fails, but it passes when we change the DSE to 34 ohm. However, with this stress-test-passing DCD, my uboot_mfg can't boot properly (no output on the default console).

Please help us!

Original Attachment has been moved to: 6q_lpddr2_32.inc.zip

Original Attachment has been moved to: mx6q_tdh_lpddr2_400_v004.inc.zip

raymondwang · ‎07-23-2014

With help of FSL FAE, now it can boot up properly. Root cause :

We can't gate/ungate 528 pfd2 (FSL's explanation).

int arch_cpu_init(void)
{
	/*
	 * Due to hardware limitation, on MX6Q we need to gate/ungate all PFDs
	 * to make sure PFD is working right, otherwise, PFDs may
	 * not output clock after reset, MX6DL and MX6SL have added 396M pfd
	 * workaround in ROM code, as bus clock need it
	 */
	writel(BM_ANADIG_PFD_480_PFD3_CLKGATE |
		BM_ANADIG_PFD_480_PFD2_CLKGATE |
		BM_ANADIG_PFD_480_PFD1_CLKGATE |
		BM_ANADIG_PFD_480_PFD0_CLKGATE,
		ANATOP_BASE_ADDR + HW_ANADIG_PFD_480_SET);
	writel(BM_ANADIG_PFD_528_PFD3_CLKGATE |
#if defined(CONFIG_MX6Q) && !defined(CONFIG_LPDDR2)
		BM_ANADIG_PFD_528_PFD2_CLKGATE |
#endif
		BM_ANADIG_PFD_528_PFD1_CLKGATE |
		BM_ANADIG_PFD_528_PFD0_CLKGATE,
		ANATOP_BASE_ADDR + HW_ANADIG_PFD_528_SET);

	writel(BM_ANADIG_PFD_480_PFD3_CLKGATE |
		BM_ANADIG_PFD_480_PFD2_CLKGATE |
		BM_ANADIG_PFD_480_PFD1_CLKGATE |
		BM_ANADIG_PFD_480_PFD0_CLKGATE,
		ANATOP_BASE_ADDR + HW_ANADIG_PFD_480_CLR);
	writel(BM_ANADIG_PFD_528_PFD3_CLKGATE |
#if defined(CONFIG_MX6Q) && !defined(CONFIG_LPDDR2)
		BM_ANADIG_PFD_528_PFD2_CLKGATE |
#endif
		BM_ANADIG_PFD_528_PFD1_CLKGATE |
		BM_ANADIG_PFD_528_PFD0_CLKGATE,
		ANATOP_BASE_ADDR + HW_ANADIG_PFD_528_CLR);

	...
}


TheAdmiral · ‎11-24-2014

Hi Ofer,

Darn. I really thought I had a breakthrough for you.

I am running my board rock solid at 556 MHz. It will then work with intermittent errors up to 571 MHz before it won't pass data at all at 576 MHz.

The board I am using is an internal Freescale validation board we used for validating the LPDDR2 interface. It is not available to the general public.

I am attaching the *inc file I am using to get these results. The Micron chip listed in the file is the one I am using.

If you send back the script you are using, I can take another look to see if there is anything else I can think of.

The problem at this point is trying to find that one parameter that is hanging things up. Unfortunately I don't have any secret recipe for making that easier. It just has to be brute force at this point, adjusting things one at a time.

If you have JTAG capability, you can take a look at the DRAM memory directly. This might help you determine if the problem is with a Write or a Read, or might help you zero in on a particular byte lane. If you need the .elf file for use with a JTAG debugger, I can send you that file.

One other general insight: The tricky thing with LPDDR2 is that the CA traces are strobed on both the rising and falling edges of the clock, so a certain amount of alignment between the clock signals and the CA traces is required. The trace lengths need to be matched pretty well, and the clock signal must be aligned to transition in the middle of the valid CA signal (which is the purpose of the CA ABS Delay field setting). Unfortunately, there is no calibration routine for aligning this clock edge. It was the registers that adjust this alignment that I was trying to change with my recommendations.
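To put rough numbers on the CA alignment point, here is a small sketch of the idealized timing. It assumes the CA bus changes on every clock edge, so each CA value is valid for half a clock period, and it ignores setup/hold times, jitter and skew. The function names are hypothetical and are not taken from any Freescale tool.

```c
#include <stdint.h>

/* Clock period in picoseconds (integer division; freq_mhz must be non-zero). */
uint32_t clk_period_ps(uint32_t freq_mhz)
{
    return 1000000u / freq_mhz;
}

/* CA is sampled on both clock edges, so one CA value spans half a period. */
uint32_t ca_window_ps(uint32_t freq_mhz)
{
    return clk_period_ps(freq_mhz) / 2u;
}

/* Ideal clock-edge position: the middle of the CA valid window. */
uint32_t ca_center_offset_ps(uint32_t freq_mhz)
{
    return ca_window_ps(freq_mhz) / 2u;
}
```

Under these assumptions the ideal CA window is 1250 ps at 400 MHz and roughly 946 ps at 528 MHz, so the same absolute clock-to-CA misalignment eats a noticeably larger share of the window at the higher frequency.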

Cheers,

Mark

oferfederovsky · ‎11-26-2014

Hi Mark,

Following your note regarding the CA traces, we examined the CA traces in our layout, and found that all CA nets are pretty much matched in length, but the CKA net is longer by ~50 mil.

We tried to play with the resistance of the CA lanes, but it didn't help.

We think that the best value for us is 34.4 ohms.

Our latest configuration (*inc file) is attached.

I guess we did manage to progress a bit.

We went back to playing with register 8c0 (MPDCCR), and tried different duty cycles in the CK_FTx_DCC fields.

At this point the Freescale calibration/test gave much better results: up to 576MHz (including) with no errors at all, and it ran in loop overnight.

But,

When we use this configuration on Linux, we still see problems.

On Linux 3.0.35, the stress test we are running still fails.

This is how it looks at the memtester (version 4.3.0 (32-bit)):

Loop 175/1000:

Stuck Address : ok

Random Value : ok

Compare XOR : ok

Compare SUB : ok

Compare MUL : ok

Compare DIV : ok

Compare OR : ok

Compare AND : ok

Sequential Increment: ok

Solid Bits : ok

Block Sequential : ok

Checkerboard : ok

Bit Spread : ok

Bit Flip : testing 8FAILURE: 0xfffffffd != 0xffffffed at offset 0x000cf800.

Walking Ones : ok

Walking Zeroes : ok

8-bit Writes : ok

16-bit Writes : ok

Almost each loop had 1 failure. The complete test result is attached.
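One way to read a memtester failure line like the one above is to XOR the two reported words and look at which bits differ. The sketch below uses hypothetical helper names, and it assumes that bit N of the 32-bit word corresponds to byte lane N/8, which only holds if the board does not swap data bits within or across lanes.

```c
#include <stdint.h>

/* Bits that differ between the two words reported by memtester. */
uint32_t diff_mask(uint32_t a, uint32_t b)
{
    return a ^ b;
}

/* Number of set bits in v (clears the lowest set bit each iteration). */
int count_set_bits(uint32_t v)
{
    int n = 0;
    while (v) {
        v &= v - 1u;
        n++;
    }
    return n;
}

/* Index of the lowest set bit, or -1 if v is zero. */
int lowest_set_bit(uint32_t v)
{
    int i = 0;
    if (v == 0)
        return -1;
    while (!(v & 1u)) {
        v >>= 1;
        i++;
    }
    return i;
}

/* Byte lane for a given bit index, assuming no bit swapping on the board. */
int byte_lane_of_bit(int bit)
{
    return bit / 8;
}
```

For the failure shown above, `diff_mask(0xfffffffd, 0xffffffed)` is `0x00000010`: a single differing bit, bit 4, which would sit in byte lane 0 under the no-swap assumption.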

Can we completely trust this calibration/stress tool?

If we can, what could still be wrong in our system?

I wonder if there is something wrong with our kernel configuration related to changing the MMDC frequency.

I tried to use kernel 3.10.17, but for some reason it didn't complete boot-up, unless I configured the MMDC in u-boot back to 400MHz.

Do you have any idea why?

Is there some configuration I must change in the kernel in order to change MMDC clock frequency?

In 3.0.35 I disabled the writes to the relevant clock selectors. Now I wonder if this is ok.

diff --git a/arch/arm/mach-mx6/clock.c b/arch/arm/mach-mx6/clock.c
index 48d3999..9067bbe 100644
--- a/arch/arm/mach-mx6/clock.c
+++ b/arch/arm/mach-mx6/clock.c
@@ -1390,12 +1390,12 @@ static int _clk_periph_set_parent(struct clk *clk, struct clk *parent)
                 reg = __raw_readl(MXC_CCM_CBCMR);
                 reg &= ~MXC_CCM_CBCMR_PRE_PERIPH_CLK_SEL_MASK;
                 reg |= mux << MXC_CCM_CBCMR_PRE_PERIPH_CLK_SEL_OFFSET;
-                __raw_writel(reg, MXC_CCM_CBCMR);
+/*              __raw_writel(reg, MXC_CCM_CBCMR); */
                 /* Set the periph_clk_sel multiplexer. */
                 reg = __raw_readl(MXC_CCM_CBCDR);
                 reg &= ~MXC_CCM_CBCDR_PERIPH_CLK_SEL;
-                __raw_writel(reg, MXC_CCM_CBCDR);
+/*              __raw_writel(reg, MXC_CCM_CBCDR); */
         } else {
                 reg = __raw_readl(MXC_CCM_CBCDR);
                 /* Set the periph_clk2_podf divider to divide by 1. */

BTW,

When I'm running the memtester, I'm also running a tool in the background that adds stress to the memory.

Thanks again for your effort!

Waiting to hear from you.

Regards,

Ofer

asim_zaidi · ‎12-05-2014

Doron

Thanks for the update. We were really hoping you would resolve your instability issues with the additional capacitors.

Please find attached a presentation on the Capacitor placement guidelines.

We have only done an internal validation board with LPDDR2, not a proper reference design like we do for DDR3.

We can share the LPDDR2 validation board information if you need it, but keep in mind it was designed to validate the LPDDR2 and to check signal integrity issues rather than to optimize for a product-type layout. Typically, all our internal validation boards are 14-layer boards. This is done mostly to be able to split out different power planes so that current/power readings can be taken on individual rails. We then follow up the validation boards with reference boards for the different market segments (i.e., Smart Devices, Automotive) that are 8-layer boards. For i.MX6DQ, no such LPDDR2 reference board was designed.

We can supply:

1. Schematics in both source (OrCAD 16.3) and PDF.

2. Generic HW design guides for i.MX6 processors. The one for i.MX6SL has more information on LPDDR2.

3. For layout instructions, same as #2.

4. Layout file source (in Allegro 16.3)

· On another note, can you also confirm that your memory is rated for 533/1066 MHz, since the part number supports lower speed grades as well?

· We also have a more intensive DDR stress test, which includes more memory algorithms, that you can try.

Regards

Asim

DoronWeizman · ‎12-05-2014

Hi,

We soldered the bulk capacitors onto the decoupling caps you asked about, very close and without any wires. As you can see in the layout, the decoupling caps are 0402 and not 0201, so it was much easier to solder each 22uF 0603 bulk cap diagonally across a close pair of 0402 decoupling caps (on the same voltage). So I confirmed there was very good connectivity, without any antenna, on all the voltage lines you recommended, as well as for the 10uF on NVCC_PLL_OUT.

Regarding the trace lengths: we based our design on the FSL layout reference design for the i.MX6Q with LPDDR2, where the maximum length was 1400 mils. BTW, we consulted with FSL before we produced this board, after we ran the SI simulations, and I have to tell you we didn't receive any comments regarding the trace lengths in our design. In addition, the SI expert who ran the post-layout simulation commented that the DDR drive strength of the i.MX6 (based on the IBIS model) is weak, and that even if we shorten the traces the results will not be better.

However, this is history, and I agree with you that it could be a problem, but I want your confirmation that if we change the layout to what you recommend, the problem will disappear. It would be much better if you could share with us the documentation of your reference board with the i.MX6Q and LPDDR2, and its SI simulation when running at 528MHz.

Ofer will answer regarding the drive strength settings during the tests.

Ofer will try to reduce VDD_SOC down to the minimum and retest.

Thank you very much for your help !

Doron Weizman

asim_zaidi · ‎12-04-2014

Thanks for your response.

We are still concerned about the bulk caps on the edge of the processor.

On a board that is experiencing failures, can you add 10/22uF caps to VDD_ARM, VDD_SOC and DRAM_1.2V directly under the SOC and confirm? (No wires; the caps must be stacked on top of an existing 0201 cap.)

Can you also confirm whether a 10uF cap was added to NVCC_PLL_OUT?

The LPDDR2 traces are approx. 1.6” away. This is long enough that it can’t really be considered a lumped circuit and you might be seeing reflections.

Remember that LPDDR2 does not use termination.
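The "lumped circuit" remark can be made a little more concrete with a back-of-the-envelope sketch. Everything numeric here is an assumption for illustration: the propagation delay per inch depends on the stack-up (a figure around 170 ps/inch is often quoted for inner layers on FR-4), the rise time has to come from the IBIS model or a measurement, and "round-trip delay longer than the rise time" is only one common rule of thumb for when reflections start to matter. The function names are hypothetical.

```c
#include <stdint.h>

/* One-way propagation delay in ps for a trace of `length_mil` mils. */
uint32_t one_way_delay_ps(uint32_t length_mil, uint32_t ps_per_inch)
{
    return (length_mil * ps_per_inch) / 1000u;
}

/*
 * Rule-of-thumb check (an assumption, not a Freescale guideline):
 * treat the trace as a transmission line when the round-trip delay
 * exceeds the signal rise time.
 */
int is_electrically_long(uint32_t length_mil, uint32_t ps_per_inch,
                         uint32_t rise_time_ps)
{
    return 2u * one_way_delay_ps(length_mil, ps_per_inch) > rise_time_ps;
}
```

With the assumed 170 ps/inch, a 1.6 inch (1600 mil) trace has a one-way delay of about 272 ps, so with any rise time below roughly 540 ps it would count as electrically long under this rule, and an unterminated line of that length can be expected to show reflections.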

Can you confirm that you reduced the drive strength settings on the i.MX6, as Mark recommended, and the drive settings on the LPDDR2?

One other option to try is to reduce the VDD_SOC voltage to see if this reduces the noise/jitter in the system.

DoronWeizman · ‎12-04-2014

Hi Mark,

We followed your recommendations regarding the bulk capacitors, and unfortunately we still see DDR failures during the stress test that Ofer runs.

Thanks,

Doron

TheAdmiral · ‎12-03-2014

Hi Ofer,

I took a quick look at your schematic and layout files. Two things jumped out at me:

1) Bulk capacitance on NVCC_PLL_OUT. The Hardware User Guide recommends a minimum value of 10 uF, and I actually have a 22 uF capacitor on the SABRE SDB board. NVCC_PLL_OUT is the pin connected to the output side of the LDO_1p1 internal regulator, which supplies a number of critical items in the SNVS domain. Chief among them is the clock generating block (CCM). Without the bulk capacitor, you run the risk of increased jitter on all of your clocks. Jitter is a very likely suspect in this case. It would tend to explain why one frequency would work better than another, and the effects of jitter become more pronounced with higher frequency operations. It may be that with the Linux kernel running, the increased loading on NVCC_PLL_OUT is causing issues that are not seen with the DDR Stress test, due to the presence of higher jitter.

Recommendation: Try soldering a 10 uF capacitor on top of C242 underneath the processor and see if this makes your errors go away.

Jitter is cumulative and can also be decreased by lowering the drive strengths on your DDR pads. I don't think the high drive strengths you are using are causing the problems by themselves, but if you lower DSE to a setting of 0x28, you might see some additional improvement if jitter is your only problem.
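For readers mapping a setting such as 0x28 to an impedance, here is a minimal decoding sketch. It assumes the DSE field sits in bits 5:3 of the DDR pad control value and uses the 240/120/80/60/48/40/34 ohm encoding as I recall it from the i.MX 6 reference manual; please verify against the manual for your part. The macro and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed position of the DSE field in a DDR pad control register value. */
#define PAD_DSE_SHIFT 3u
#define PAD_DSE_MASK  (0x7u << PAD_DSE_SHIFT)

/* Assumed DSE encoding in ohms; index 0 means the driver is disabled (Hi-Z). */
static const uint16_t dse_ohms_table[8] = { 0, 240, 120, 80, 60, 48, 40, 34 };

/* Extract the 3-bit DSE field from a pad control value. */
uint32_t dse_field(uint32_t pad_ctl)
{
    return (pad_ctl & PAD_DSE_MASK) >> PAD_DSE_SHIFT;
}

/* Nominal driver impedance in ohms for a pad control value (0 = Hi-Z). */
uint32_t dse_ohms(uint32_t pad_ctl)
{
    return dse_ohms_table[dse_field(pad_ctl)];
}
```

Under these assumptions 0x28 decodes to 48 ohm, 0x30 to 40 ohm and 0x38 to 34 ohm, which lines up with the 40 ohm and 34 ohm settings discussed earlier in the thread.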

2) One bulk capacitor for each of VDDARM_CAP, VDDSOC_CAP, VDDPU, and LPDDR_1V2 should be placed directly underneath the processor. (The Hardware User Guide recommends a maximum distance of 50 mils to the processor pin. Testing has shown that outside the boundaries of the processor is really too far away.) That testing was specifically on VDDSOC_CAP, which was causing problems, but the bulk cap for LPDDR_1V2 being far away would definitely be a major cause of jitter on the DDR pads, coupled with the high drive strengths. VDDSOC_CAP actually supplies the logic levels for the MMDC signals, so increased demand on VDDSOC_CAP could be causing problems in the MMDC.

Recommendation: Try soldering a 22 uF capacitor on top of the following decoupling caps:

- C171

- C213

- C222

- C131

I think if you try adding these capacitors, you should be able to tell very quickly if they are helping or not.

Cheers,

Mark

oferfederovsky · ‎12-03-2014

Hi Asim,

I answered some of your questions, please see inline.

asimzaidi wrote:

Hi Ofer

Thanks for your responses. We still need to understand why your boards pass the FSL DDR stress test yet fail in your application. We had an internal meeting to discuss the issues you are encountering and came up with some further questions/experiments.

Memory Testing

·        The failure you posted below is strange stating that the complete background pattern word was incorrect. Is this indicating that DDR was reading all 0’s instead of all F’s and vice versa, for multiple consecutive addresses. Is this consistently reproducible and what memory test (Bit Flip or other) is reporting this?

o If the Entire word is wrong or random this may indicate some issue with address and/or command signals.

YES, this type of failure is repeating (but not in every run). It happens when running "Solid Bits" and "Bit Flips".

I also noticed that when it happens, it happens in a burst of 4 or 8 failures, for example:

Loop 17/1000:

Stuck Address       : ok

Random Value        : ok

Compare XOR         : ok

Compare SUB         : ok

Compare MUL         : ok

Compare DIV         : ok

Compare OR          : ok

Compare AND         : ok

Sequential Increment: ok

Solid Bits          : ok

Block Sequential    : ok

Checkerboard        : ok

Bit Spread          : ok

Bit Flip            : testing 147FAILURE: 0xfffbffff != 0x00040000 at offset 0x00f8d440.

FAILURE: 0x00040000 != 0xfffbffff at offset 0x00f8d444.

FAILURE: 0xfffbffff != 0x00040000 at offset 0x00f8d448.

FAILURE: 0x00040000 != 0xfffbffff at offset 0x00f8d44c.

Walking Ones        : ok

Walking Zeroes      : ok

8-bit Writes        : ok

16-bit Writes       : ok

Loop 18/1000:

Stuck Address       : ok

Random Value        : ok

Compare XOR         : ok

Compare SUB         : ok

Compare MUL         : ok

Compare DIV         : ok

Compare OR          : ok

Compare AND         : ok

Sequential Increment: ok

Solid Bits          : testing 32FAILURE: 0x00000000 != 0xffffffff at offset 0x00b60f9c.

FAILURE: 0xffffffff != 0x00000000 at offset 0x00b60fa0.

FAILURE: 0x00000000 != 0xffffffff at offset 0x00b60fa4.

FAILURE: 0xffffffff != 0x00000000 at offset 0x00b60fa8.

Block Sequential    : ok

Checkerboard        : ok

Bit Spread          : ok

Bit Flip            : ok

Walking Ones        : ok

Walking Zeroes      : ok

8-bit Writes        : ok

16-bit Writes       : ok
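The two logs above show a different failure shape from a single stuck bit: each pair of words is an exact bitwise complement. A small classifier makes that distinction explicit. The names are hypothetical. As far as I know, memtester prints the two buffer words it compared rather than an "expected" and an "actual" value, so either side may be the corrupted one.

```c
#include <stdint.h>

enum mismatch_kind {
    MISMATCH_NONE,          /* words are equal */
    MISMATCH_SINGLE_BIT,    /* exactly one bit differs */
    MISMATCH_WORD_INVERTED, /* every bit differs: a == ~b */
    MISMATCH_MULTI_BIT      /* anything else */
};

/* Classify how two 32-bit words reported by a memory test differ. */
enum mismatch_kind classify_mismatch(uint32_t a, uint32_t b)
{
    uint32_t diff = a ^ b;

    if (diff == 0)
        return MISMATCH_NONE;
    if (diff == 0xFFFFFFFFu)
        return MISMATCH_WORD_INVERTED;
    if ((diff & (diff - 1u)) == 0)  /* power of two: one bit set */
        return MISMATCH_SINGLE_BIT;
    return MISMATCH_MULTI_BIT;
}
```

Both `0xfffbffff` vs `0x00040000` and `0x00000000` vs `0xffffffff` classify as whole-word inversions, which fits the earlier remark that an entirely wrong word points more towards address/command problems than towards a single data line.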

·          You have previously confirmed the DDR settings in the stress test initialization and UBOOT are the same. Can you please read out the MMDC registers after your DDR initialization in UBOOT.

                    Do you mean: dump the relevant registers from the u-boot prompt?

                    What I did previously, was to dump the relevant registers after the system was up, from kernel using memtool. Isn't it even better?

                    (Because that rules out the possibility that the kernel does something wrong.)

o We would like to confirm that the registers you programmed are correct and match the DDR stress test initialization.

o Another similar experiment would be to run the FSL DDR stress test after UBOOT has initialized by attaching with JTAG.

·        Byte wise failures are usually indicative of a problem with the DQS signals. Either the DQS signals have too slow a rise/fall time, or there is a glitch or over/under shoots (signal integrity issues).

o It will be helpful to test over temperature per above to assist in narrowing down the issue with the DQS signals

o Ideally we recommend using calibration values which are a mean of multiple boards and temperatures

·        Do you see similar behavior/failures using both fixed and interleaved modes ?

                    Since we understood that we can't use 64-bit for LPDDR2, we switched to Interleaving mode, and haven't tried Fixed mode.

Clocking

·        Is your system changing the DDR frequency or does the system boot up and stay at 528 MHz?

                    No, u-boot sets the DDR frequency, and the kernel doesn't modify MXC_CCM_CBCMR and MXC_CCM_CBCDR.

                    I used this patch in order to achieve that in 3.0.35 4.1.0:

diff --git a/arch/arm/mach-mx6/clock.c b/arch/arm/mach-mx6/clock.c
index 48d3999..9067bbe 100644
--- a/arch/arm/mach-mx6/clock.c
+++ b/arch/arm/mach-mx6/clock.c
@@ -1390,12 +1390,12 @@ static int _clk_periph_set_parent(struct clk *clk, struct clk *parent)
                 reg = __raw_readl(MXC_CCM_CBCMR);
                 reg &= ~MXC_CCM_CBCMR_PRE_PERIPH_CLK_SEL_MASK;
                 reg |= mux << MXC_CCM_CBCMR_PRE_PERIPH_CLK_SEL_OFFSET;
-                __raw_writel(reg, MXC_CCM_CBCMR);
+/*              __raw_writel(reg, MXC_CCM_CBCMR); */
                 /* Set the periph_clk_sel multiplexer. */
                 reg = __raw_readl(MXC_CCM_CBCDR);
                 reg &= ~MXC_CCM_CBCDR_PERIPH_CLK_SEL;
-                __raw_writel(reg, MXC_CCM_CBCDR);
+/*              __raw_writel(reg, MXC_CCM_CBCDR); */
         } else {
                 reg = __raw_readl(MXC_CCM_CBCDR);
                 /* Set the periph_clk2_podf divider to divide by 1. */

·        You had previously stated some issues in setting the DDR clock. Please refer to the following thread:

https://community.freescale.com/thread/306143

·        If modifying the DDR clock are you changing any other system clocks ?

                   No. I'm using this for 480MHz:

MXC_DCD_ITEM(1, CCM_BASE_ADDR + 0x14, 0x2018D00) // 480MHz

MXC_DCD_ITEM(2, CCM_BASE_ADDR + 0x18, 0x20324)   // 480MHz

   And keep the default for 528MHz.

HW Checks

·        To rule out any possible HW issues we would like to ensure that power supply and decoupling network on your board is correct . We are in the process of reviewing your provided design files as well.

·        Can you confirm if your board design meets the FSL decoupling requirements for the VDD_SOC and other domains as outlined in the i.MX6 HW users guide

o We have seen a poor power delivery network cause issues when stressing the part with higher instantaneous current requirements

                    YES, we reviewed our design, and it looks good. (Re: Re: ORCAM IPU/LPDDR2 Issues).

·        Could you try increasing the VDD_SOC domain as well as the 1V8 and LPDDR_1V2_DDR to see if this has any impact.

                    We increased the VDD_SOC to 1.375v and we noticed no influence. We didn't change 1V8 and LPDDR_1V2_DDR.

·        Do you have any boards using a different memory vendor just to rule out any DDR memory issues ?

                    Yes, we have some, but in the past we got poor results with them. We will try them again.

Errata Check

Can you please confirm that the BSP/kernel you are using has the patch for the following issue:

ERR003740 ARM/PL310: 752271—Double linefill feature can cause data corruption: only workaround to this erratum is to disable the double linefill feature.

/*
 * The L2 cache controller(PL310) version on the i.MX6D/Q is r3p1-50rel0
 * The L2 cache controller(PL310) version on the i.MX6DL/SOLO/SL is r3p2
 * But according to ARM PL310 errata: 752271
 * ID: 752271: Double linefill feature can cause data corruption
 * Fault Status: Present in: r3p0, r3p1, r3p1-50rel0. Fixed in r3p2
 * Workaround: The only workaround to this erratum is to disable the
 * double linefill feature. This is the default behavior.
 */
if (!cpu_is_mx6q())
        val |= 0x40800000;
writel(val, IO_ADDRESS(L2_BASE_ADDR + L2X0_PREFETCH_CTRL));

val = readl(IO_ADDRESS(L2_BASE_ADDR + L2X0_POWER_CTRL));
val |= L2X0_DYNAMIC_CLK_GATING_EN;
val |= L2X0_STNDBY_MODE_EN;
writel(val, IO_ADDRESS(L2_BASE_ADDR + L2X0_POWER_CTRL));

l2x0_init(IO_ADDRESS(L2_BASE_ADDR), 0x0, ~0x00000000);



Confirmed - I found the marked code in arch/arm/mach-mx6/mm.c

We realize that this is a slow and painful debug exercise but we are hopeful that we will discover the root cause of your instabilities.

Regards

Asim

Thanks!

Ofer F.

ORCAM

asim_zaidi · ‎12-03-2014

Hi Ofer

Thanks for your responses. We still need to understand why your boards pass the FSL DDR stress test yet fail in your application. We had an internal meeting to discuss the issues you are encountering and came up with some further questions/experiments.

Memory Testing

· The failure you posted below is strange stating that the complete background pattern word was incorrect. Is this indicating that DDR was reading all 0’s instead of all F’s and vice versa, for multiple consecutive addresses. Is this consistently reproducible and what memory test (Bit Flip or other) is reporting this?

o If the Entire word is wrong or random this may indicate some issue with address and/or command signals.

· You have previously confirmed the DDR settings in the stress test initialization and UBOOT are the same. Can you please read out the MMDC registers after your DDR initialization in UBOOT.

o We would like to confirm that the registers you programmed are correct and match the DDR stress test initialization.

o Another similar experiment would be to run the FSL DDR stress test after UBOOT has initialized by attaching with JTAG.

· Byte wise failures are usually indicative of a problem with the DQS signals. Either the DQS signals have too slow a rise/fall time, or there is a glitch or over/under shoots (signal integrity issues).

o It will be helpful to test over temperature per above to assist in narrowing down the issue with the DQS signals

o Ideally we recommend using calibration values which are a mean of multiple boards and temperatures

· Do you see similar behavior/failures using both fixed and interleaved modes ?

Clocking

· Is your system changing the DDR frequency or does the system boot up and stay at 528 MHz?
· You had previously stated some issues in setting the DDR clock. Please refer to the following thread:
- https://community.freescale.com/thread/306143

· If modifying the DDR clock are you changing any other system clocks ?

HW Checks

· To rule out any possible HW issues we would like to ensure that power supply and decoupling network on your board is correct . We are in the process of reviewing your provided design files as well.

· Can you confirm if your board design meets the FSL decoupling requirements for the VDD_SOC and other domains as outlined in the i.MX6 HW users guide

o We have seen a poor power delivery network cause issues when stressing the part with higher instantaneous current requirements

· Could you try increasing the VDD_SOC domain as well as the 1V8 and LPDDR_1V2_DDR to see if this has any impact.

· Do you have any boards using a different memory vendor just to rule out any DDR memory issues ?

Errata Check

Can you please confirm that the BSP/kernel you are using has the patch for the following issue:

ERR003740 ARM/PL310: 752271—Double linefill feature can cause data corruption: only workaround to this erratum is to disable the double linefill feature.

/*
 * The L2 cache controller(PL310) version on the i.MX6D/Q is r3p1-50rel0
 * The L2 cache controller(PL310) version on the i.MX6DL/SOLO/SL is r3p2
 * But according to ARM PL310 errata: 752271
 * ID: 752271: Double linefill feature can cause data corruption
 * Fault Status: Present in: r3p0, r3p1, r3p1-50rel0. Fixed in r3p2
 * Workaround: The only workaround to this erratum is to disable the
 * double linefill feature. This is the default behavior.
 */
if (!cpu_is_mx6q())
        val |= 0x40800000;
writel(val, IO_ADDRESS(L2_BASE_ADDR + L2X0_PREFETCH_CTRL));

val = readl(IO_ADDRESS(L2_BASE_ADDR + L2X0_POWER_CTRL));
val |= L2X0_DYNAMIC_CLK_GATING_EN;
val |= L2X0_STNDBY_MODE_EN;
writel(val, IO_ADDRESS(L2_BASE_ADDR + L2X0_POWER_CTRL));

l2x0_init(IO_ADDRESS(L2_BASE_ADDR), 0x0, ~0x00000000);
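To check a running system against this workaround, one could read back the PL310 prefetch control register and test the double linefill bits. The sketch below only models the bit logic. It assumes, from my reading of the PL310 documentation, that 0x40800000 is bit 30 (double linefill enable) plus bit 23 (incr double linefill enable); the macro and function names are hypothetical and are not kernel symbols.

```c
#include <stdint.h>

/* Assumed PL310 prefetch control bits that make up 0x40800000. */
#define PREFETCH_DOUBLE_LINEFILL_EN      (1u << 30)
#define PREFETCH_INCR_DOUBLE_LINEFILL_EN (1u << 23)
#define DOUBLE_LINEFILL_BITS \
    (PREFETCH_DOUBLE_LINEFILL_EN | PREFETCH_INCR_DOUBLE_LINEFILL_EN)

/* Non-zero if either double linefill bit is set in a prefetch control value. */
int double_linefill_enabled(uint32_t prefetch_ctrl)
{
    return (prefetch_ctrl & DOUBLE_LINEFILL_BITS) != 0;
}

/* Mirrors the guard in the quoted code: only non-MX6Q parts get the bits set. */
uint32_t prefetch_ctrl_value(uint32_t val, int is_mx6q)
{
    if (!is_mx6q)
        val |= DOUBLE_LINEFILL_BITS;
    return val;
}
```

On an i.MX6Q with the workaround in place, a value read back from the prefetch control register should make `double_linefill_enabled()` return 0.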

We realize that this is a slow and painful debug exercise but we are hopeful that we will discover the root cause of your instabilities.

Regards

Asim

oferfederovsky · ‎12-02-2014

Hi Asim,

Please see inline:

asimzaidi wrote:

Hi Ofer

   Can you confirm the following please:

·         Issue is seen on multiple boards and all exhibit the same bit failures ?     YES

·         The bit showing the failure is same on all HW ?     NO, here is another example of failures:



FAILURE: 0xffffffff != 0x00000000 at offset 0x00d272a0.

FAILURE: 0x00000000 != 0xffffffff at offset 0x00d272a4.

FAILURE: 0xffffffff != 0x00000000 at offset 0x00d272a8.

FAILURE: 0x00000000 != 0xffffffff at offset 0x00d272ac.

FAILURE: 0xffffffff != 0x00000000 at offset 0x00d272b0.

FAILURE: 0x00000000 != 0xffffffff at offset 0x00d272b4.

FAILURE: 0xffffffff != 0x00000000 at offset 0x00d272b8.

·         Issue is seen on multiple boards and all exhibit the same bit failures ? YES, failures are on multiple boards, but there are many kinds of failures.

·         The bit showing the failure is same on all HW ?     NO.

·         The actual clock frequency of the DDR clock – is it truly at 528 MHz ?     YES (or 480MHz, we are trying both frequencies).

·         The Linux Memtester application is running correctly at 400 MHz ?     YES.

o   Do you still see the issue when not running a tool in the background that adds stress to the memory. Could this be corrupting the Memtester application ?     YES.

·         Are you seeing any issues when running the Linux OS (not memtester) at 528 MHz?     YES, hangs, exceptions etc.

·         Any temperature or voltage dependencies on the failure     I'm not sure yet, we are trying to identify dependencies.

·         Have you tried increasing the drive strength for the failing bits (byte segment)     NO. I don't think it is relevant, since we are seeing multiple failures in total (we are using more than one board now).

Regards

Asim

Thanks,

Ofer

asim_zaidi · ‎11-27-2014

Hi Ofer

Can you confirm the following please:

· Issue is seen on multiple boards and all exhibit the same bit failures ?

· The bit showing the failure is same on all HW ?

· The actual clock frequency of the DDR clock – is it truly at 528 MHz ?

· The Linux Memtester application is running correctly at 400 MHz ?

o Do you still see the issue when not running a tool in the background that adds stress to the memory. Could this be corrupting the Memtester application ?

· Are you seeing any issues when running the Linux OS (not memtester) at 528 MHz?

· Any temperature or voltage dependencies on the failure

· Have you tried increasing the drive strength for the failing bits (byte segment)

Some pointers below for Bit wise failures

− Normally indicates one or more data lines experiencing glitch due to signal integrity issues (too slow rise/fall time, or too fast rise and fall time attributing to ringing)

− Varying temperature is one method to narrow down the root cause

− If cooling down the part causes more failures, then it is likely the drive strength is too high causing more overshoots and undershoots

− If heating up the part causes more failures, then the drive strength is too low and the signals may not rise/fall fast enough

− Playing around with drive strengths often help (start with i.MX side and then try DRAM side)

Regards

Asim

oferfederovsky · ‎11-27-2014

Hi Mark,

I appreciate your effort to help, but I am not as glad as you are.

Bottom line, we can't really work with the LPDDR2 at 528MHz, and I can't see how it can be a software problem:

I tried a different u-boot version (2015).
I tried a different kernel version (3.10.17).
I verified that all of the registers configured in the DCD at u-boot, are still containing the same values under Linux.
The same test setup shows no failures when running at 400MHz.
The test results imply there is a problem with one specific bit.

I don't really have any ideas about how to progress with this issue.

Thanks,

Ofer

TheAdmiral · ‎11-26-2014

Hi Ofer,

I am very glad to hear that you are now successfully passing data up to 576MHz. I am hoping that you are stressing all of the available physical memory space.

I am sorry that this has taken so long, but I think I warned you up front that this was a methodical, one step at a time process.

I see that you are using the maximum values in the MPWRCADL registers. I think this may be one of the keys. I am a little concerned that you are using a value of 0x24911249 in the MPDCCR registers. My concern is with the chip version number you are using. If you are using an i.MX 6Q processor that ends in "C" or later, there should be no problem. If you are using a processor that ends in "A" or "B", then you may have an issue when switching to the "C" version. The "issue" may require you to readjust the MPDCCR register again. I think it is important just to keep this in mind.

So, to clearly state where you are now: You have register settings that work with the DDR Stress Test. Those same register settings now reliably boot u-boot and the Linux kernel, and you are seeing stable operations in the Linux kernel. The issue is that now, the memtester, which should work and show no memory issues, is actually showing problems. [I think that is significant progress.]

I think I also warned you that I am a hardware engineer, and now that we are talking about testing under the Linux environment, you have reached the limit of my expertise.

One of the greatest advantages of the DDR Stress test is that it does not come with the baggage of an operating system and it runs out of OCRAM. So there is never any danger of overwriting memory spaces, either spaces that contain running code, or having a process come in and overwrite an area that you were just testing. I understand that is the purpose of the memlock, but I am always skeptical. I believe that the stress test is very good at what it does. There are some additional tests, not normally compiled, that do flesh out some really odd ball errors. But in four years, I have not seen a need to run one of these additional tests. The problem is that someone would have to look at the source and make sure that these odd ball tests would work in the "interleave" mode.

One idea that I had: Have you tried specifying a physical area to test by setting the -p flag? For example, do you get the same errors if you specify memtester 50 1000 -p 0x50000000? Do you know exactly the physical address region where the Linux kernel is stored?

The other idea that I have: Does this memtester utility work on a design that uses the same Linux kernel, but is a DDR3 design? (with the same additional utility running in the background?)

I am going to be out of the office for the rest of the week, but if you are still having trouble on Monday, maybe we can find you someone from the software side to help you.

Cheers,

Mark

davidroach · ‎11-12-2014

Hi Mark,

I stumbled across this thread while revisiting our LPDDR2 timing and thought I could add some clarity on what I experimentally determined about interleave mode. This is the explanation I wrote up for our software engineers:

Hi Guys,

OK, I think I’ve got the whole interleaving thing figured out. The i.MX6 has two fully independent memory controllers for LPDDR2, each driving a 32-bit bus and each having 2 chip selects so that each channel could have two memory chips populated. We are using only CS0 on each channel.

The memory addresses assigned to each channel are fixed: 0x10000000-0x7FFFFFFF on Channel 1 and 0x80000000-0xFFFFFFFF on Channel 0. The address range assigned to CS0 on each channel is determined by the value in MMDCx_MDASP. The 7 MSBs of the internal 32-bit address are compared to the value in this register and if the comparison is less than or equal, CS0 gets activated. Otherwise it’s CS1. For example, on our design there are two 512MB memory chips, each on CS0 on each channel. Channel 1 CS0 address range would be 0x10000000-0x2FFFFFFF and Channel 0 CS0 would be 0x80000000-0x9FFFFFFF. The corresponding register settings would be MMDC1_MDASP = 0x00000017 and MMDC0_MDASP=0x0000004F.

The above case is for non-interleaved LPDDR2. Interleaving is done by sending memory accesses to a channel based on 4KB pages, determined by bit 12 of the internal address. Even-numbered pages go to MMDC0 while odd-numbered pages go to MMDC1. The total addressable memory of this arrangement is the same as non-interleaved: 0x10000000-0xFFFFFFFF. But the base address is fixed at 0x10000000.

The chip select comparison logic in each channel gets rearranged somehow to compensate for the redirected memory accesses due to interleaving, but it’s not a perfect remapping. With 512MB chips on CS0 on each channel, MMDC1_CS0 crosses a binary boundary at the first 256MB region, going from 0x1FFFFFFF to 0x20000000. This does not occur in MMDC0 at 0x8FFFFFFF to 0x90000000. With the two channels interleaved, this crossing would occur at 0x3FFFFFFF-0x40000000.

In order to get the chip selects to properly respond to the internal addresses in an interleaved LPDDR2 configuration, the chip select comparison logic must be lied to about how much memory is attached to CS0. Specifically, our case requires setting MMDC0_MDASP to 256MB larger than the actual memory chip, so that the chip select comparison logic will alias the internal addresses from 0x40000000-0x4FFFFFFF to CS0 on each channel.

Bottom line: To enable interleaving, use the settings obtained from the MX6Q_MMDC_LPDDR2_programming_aid_v0.6 spreadsheet and modify as follows:

1) Remove the line that sets address 0x00b00000.

2) Add:

mw 0x021bc000 0x3e770005 1

mw 0x021bc450 0x00200000 1

3) Change the setting for register 0x021b0040 to 0x53.

TaDa! A full 1GB array is present from 0x10000000-0x4FFFFFFF.

If you'd like more detail about our specific design, including the DCD settings, let me know.

Regards,

Dave

oferfederovsky · ‎11-13-2014

Hi Dave,

Sounds interesting.

Where can I get MX6Q_MMDC_LPDDR2_programming_aid_v0.6?

We are using this Mx6DQSDL LPDDR2 Script Aid V0.04.xlsx (from i.Mx6DQSDL LPDDR2 Script Aid).

This Excel script already writes 0x53 to 0x021B0040.

Are you working at 528MHz frequency?

I'd be happy to review the DCD settings you are using; they might give us hints about what we are doing wrong.

Thanks,

Ofer

raymondwang · ‎06-10-2014

I tried the iram application, and my console prints the following messages:

**************************************************************************

Platform SDK (1.1) for MX6DQ TO1.2 Smart Device (SD) rev. C

Build: Jun 10 2014, 16:04:15

**************************************************************************

Oops, prefetch abort occurred!

Registers at point of exception:

cpsr = nZCvqeAift Supervisor (0x60000113)

r0 = 0x00000000 r8 = 0x00902490

r1 = 0x00937248 r9 = 0x00000094

r2 = 0x00000001 r10 = 0x00000000

r3 = 0x00000000 r11 = 0x0092d230

r4 = 0xdeadfeed r12 = 0x0092d234

r5 = 0x009123b0 sp = 0x0092d21c

r6 = 0x00000000 lr = 0x00912414

r7 = 0x009024b0 pc = 0x0091241c

ifsr = 0x00000002

ifar = 0x4259388a

Fault status: 0x2

I added the following line to the iram main loop:

while(1);

The oops disappeared.

I added the following lines to dump the clock details:

show_freq();

printf("HW_CCM_CBCMR_RD()=0x%x\n",HW_CCM_CBCMR_RD());

The output is below:

========== Clock frequencies ===========

CPU: 792000 kHz

DDR: 528000 kHz

IPG: 66000 kHz

Debug UART: 80000000 Hz

========================================

HW_CCM_CBCMR_RD()=0x20324

So the CCM_CBCMR register always holds its default value; setting it to 0x60324 in the DCD does not work.

What can affect this ?

raymondwang · ‎06-09-2014

I am not familiar with Linux or the MX6 SDK, but I reviewed the SDK source tree and found no distinct difference between the SDK and U-Boot in the board DCD configuration.

raymondwang · ‎06-06-2014

Dear igorpadykov

1. I use a different profile to test my mfg U-Boot.

<CMD state="BootStrap" type="boot" body="BootStrap" file ="u-boot-mx6q-tdh-lpddr2_nopad.bin" >Loading U-boot</CMD>

<CMD state="BootStrap" type="jump" > Jumping to OS image. </CMD>

</LIST>

2. Following your instructions, I cut the first 1 KB from my mfg U-Boot:

dd if=u-boot-mx6q-tdh-lpddr2.bin of=u-boot-mx6q-tdh-lpddr2_nopad.bin bs=1024 skip=1

142+1 records in

142+1 records out

146332 bytes (146 kB) copied, 0.00187831 s, 77.9 MB/s

There is still no output on the UART console.

igorpadykov · ‎06-06-2014

Hi Raymond

Could you try rebuilding just the Linux U-Boot (not Android) and test with that?

L3.0.35_4.1.0_ER_SOURCE_BSP

Does the board actually output anything?

Did you try the SDK?

i.MX 6Series Platform SDK :

Best regards

chip

raymondwang · ‎06-06-2014

Actually, nothing is output on uart1 (the first UART). Are you sure that rebuilding the U-Boot source will help?

igorpadykov · ‎06-07-2014

Hi Raymond

I suggest starting with the SDK and an oscilloscope.

i.MX 6Series Platform SDK :

Best regards

chip

