Executing from LPDDR2


chrispage
Contributor II

I am currently using a Micron LPDDR2 (MT42L32M16D1) part.  I have it configured and have been able to run a memory test, executing from internal SRAM, that writes to and reads from the LPDDR2 memory without any errors.  When I execute out of the LPDDR2, I get errors (the chip goes off into the weeds) on any instruction that uses the stack (push, bl, etc.).  If I adjust TBST_INT_INTERVAL from 2 to 4, I can use a debugger to step through these instructions without error, but if I run without stepping, I see the crash again.  Adjusting the interval further in either direction does not help (if anything, it makes things worse).  Any idea which DRAM configurations/settings could be causing this problem?  Thanks.

26 Replies

jiri-b36968
NXP Employee

chrispage
Contributor II

I have looked at the example and have been able to read/write the memory when executing from internal SRAM.  I am running at 400MHz instead of 200MHz, so many of my parameters have been changed based on my LPDDR2's data sheet.  I am looking more specifically for areas to investigate (which register configurations I should be focusing on) to explain why reading/writing the memory from internal memory works, yet executing from the external memory gives me errors when running at full speed.  I would also like more information on what TBST_INT_INTERVAL corresponds to in memory data sheets, since that is the only parameter adjustment that seems to cause a slight change in what I am seeing.

Thanks,

Chris

naoumgitnik
Senior Contributor V

Hello Chris, Do you mean your problems started when you increased the LPDDR2 speed from 200 to 400 MHz? Regards, Naoum Gitnik.

chrispage
Contributor II

Unfortunately, running at 200MHz using the example code did not work at all.  The parts must be different enough that the example code is not compatible.

naoumgitnik
Senior Contributor V

Hello Chris,

Please, also take a look at the Processor Expert tool released for Vybrid - on the product pages, under Software and Tools -> Updates and Patches.

I don't have any personal experience with it, but it claims to include a DDR configuration wizard.

Regards, Naoum Gitnik.

chrispage
Contributor II

Thanks for all the information.  I am currently trying various configurations to try to narrow down configurations in which this problem is being seen and I will post those once I have completed all my testing.

chrispage
Contributor II

I have tried matching up my configuration with all of the available examples and I still see the original problem.  The only configuration parameter that seems to affect the way the error is seen is TBST_INT_INTERVAL.  Do you have any specifics about what this configuration parameter does?  The reference sheet is not very specific.  Also, is there any way I could get the rest of the code that was used when testing these DRAM configurations, to see if there is anything we are missing?  Do you know if executing out of LPDDR2 has been tested?

billpringlemeir
Contributor V

The SDRAM runs lots of different cycles (types of access). TBST_INT_INTERVAL is the bits in DDRMC_CR13?  Wikipedia has a nice chart of the different commands.  You can alter things by turning the cache on/off.  Your stack access, for instance, may be the first write access.  A back-to-back read/write can present problems, etc.  I had issues with another Freescale CPU, and they only manifested when people did SSH transfers over the FEC DMA; i.e., only certain data patterns and access cycles would trigger the problem.  The ARM stm and ldm instructions can be used in memory test code to simulate the SDRAM bursts that would be possible with a cache turned on.  If you have gcc for the test code, then:

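    /* Fragment: `start` (base address), `addr` (word index) and `incr`
       (data step) are unsigned long values from the surrounding test loop. */
    register unsigned long t1 asm ("r0") = 0;
    register unsigned long t2 asm ("r4") = t1 + incr;
    register unsigned long t3 asm ("r6") = t2 + incr;
    register unsigned long t4 asm ("r8") = t3 + incr;
    /* Run an entire burst line: one stmia writes four 32-bit words.
       The registers are pinned in ascending order, as stm/ldm lists require. */
    __asm__ volatile ("stmia %[ptr], {%0, %1, %2, %3}" : :
                      "r" (t1), "r" (t2), "r" (t3), "r" (t4),
                      [ptr] "r" (start + (addr << 2)) : "memory");
    /* Read the four 32-bit values back with a single ldmia. */
    __asm__ volatile ("ldmia %[ptr], {%0, %1, %2, %3}" :
                      "=&r" (t1), "=&r" (t2), "=&r" (t3), "=&r" (t4) :
                      [ptr] "r" (start + (addr << 2)) : "memory");
    /* t1..t4 can now be compared against 0, incr, 2*incr and 3*incr. */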

This may be helpful in your IRAM test code to simulate some different DDR transfers.  Most ARM compilers will generate stm or ldm instructions when accessing the stack.  The code pre-fetch hardware may only get a few words at a time, and it will certainly only be in read mode.

The TBST_INT_INTERVAL is just saying how many bursts to run in a row.  Note that this usually has to match some configuration of the DDR; there are special configuration cycles that you run to configure the DDR with the address bits (load mode registers at the Wikipedia link).

Check the IOMUX settings for the DDR pins.  The voltages are different for LPDDR and I don't see in the manual where this will be set.  However, the memory will often function even if these values are not set properly.  As well, the asynchronous/synchronous clocking has to be set properly and depends on the CORE clocks.

The Micron DDR3 ZQ calibration tech note may be of interest.  Both LPDDR2 and DDR3 can use this scheme, and it seems to be supported by Vybrid DDRMC_CR66+ as well as 34.6.14.  DDRMC_CR132+ seem like registers that need to be tweaked per board, depending on PCB layout.  All of the DDRMC_PHYSxx seem like places to search; try the memory test with different values and look at registers like DDRMC_PHY29 to see what the controller is setting delays to.  I would work on getting an SDRAM test that fails; random addresses and data are useful to try to exercise all transitions/cross-talk.  Just because one type of access works doesn't mean they all will.
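A minimal sketch of the random address and data idea, in portable C. Everything here is an assumption for illustration: the function names, the xorshift generator, and the stride constant are not from the thread or from any Freescale example. The test writes pseudo-random data in a scattered order that visits every word exactly once, then replays the same sequence and counts mismatches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* xorshift32: a small deterministic generator, used only to produce data patterns. */
uint32_t xorshift32(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Write pseudo-random data to every word of [base, base + words) in a
 * scattered order, then replay the same sequence and count mismatches.
 * `words` must be a power of two: multiplying the index by an odd stride
 * modulo a power of two is a permutation, so no word is written twice.
 */
size_t scattered_memtest(volatile uint32_t *base, size_t words, uint32_t seed)
{
    const size_t stride = 0x9E3779B1u;        /* any odd constant works */
    uint32_t state = seed ? seed : 1u;        /* xorshift state must be nonzero */
    size_t errors = 0;

    assert(words != 0 && (words & (words - 1)) == 0);

    /* Pass 1: scattered writes. */
    for (size_t i = 0; i < words; i++)
        base[(i * stride) & (words - 1)] = xorshift32(&state);

    /* Pass 2: same address order, same data sequence, compare. */
    state = seed ? seed : 1u;
    for (size_t i = 0; i < words; i++)
        if (base[(i * stride) & (words - 1)] != xorshift32(&state))
            errors++;

    return errors;
}
```

On the target, `base` would point at the external memory window (the address is board-specific and not given in this thread) and the test would run from internal RAM. Running it with several seeds changes both the data patterns and the pairing of data with addresses.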

tomsaluzzo
Contributor III

Chris Page is having problems posting to the community, so I am posting this on his behalf.

I have finally been able to get the LPDDR2 working with a DMA memory test. The stack problems look to be solved when caching is disabled.  It ended up looking like it was a burst problem.  I am now at the point that once caching is enabled, I am still seeing problems.  Once I have enabled caching (instruction and data), a "bx lr" instruction to return from the function that enabled caching seems to just increment the program counter and not actually branch (this works on the DDR3 but not on the LPDDR2). 

Are there any memory settings that would affect caching, or are there any cache controller settings that need to be configured differently for LPDDR2 (running at 300MHz) vs DDR3 (running at 400MHz)?

billpringlemeir
Contributor V

Make sure you flush the cache before enabling it.  A "bx lr" by itself doesn't involve DDR memory at all?  The ARM cache will try to fill 4*32bytes.  If your DDR is 16bit, this is a burst of 8.  If you only have a burst size of four, this will be a back-to-back burst.  The DMA test may only do a single beat (one 16/32 bit value) or at most a cache line; it will depend on the peripheral.  Often DMA wants to relinquish the bus so that other devices are not slowed down.  Using stmia/ldmia, you can emulate the ARM cache by saving/restoring multiple registers.  I don't know how you are diagnosing the stack issue.  It can appear that the return address is wrong if a code fetch failed.  The code may be inside the ARM cache; debuggers may not fetch the value from the cache, so the instruction looks like "bx lr" when in fact it may be something completely different.  In ARM mode, only one value of the leading nibble (the condition field) means "execute always".  If it changes to some condition that is not met, then the 'bx lr' may be a NOP.  Can you rule out a code issue?  I.e., when the cache is enabled, a burst read has failed (perhaps a back-to-back burst without an activation cycle).
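The burst arithmetic in this reply can be checked with a few lines of C. This is only a sketch: the helper names are made up for illustration, and the 16-byte figure assumes that "4*32" means four 32-bit words. If the fill is a full 32-byte line instead, the same arithmetic gives 16 beats on a 16-bit bus.

```c
#include <assert.h>

/* Number of data-bus beats needed to move `bytes` over a bus `bus_bits` wide. */
unsigned beats_for(unsigned bytes, unsigned bus_bits)
{
    assert(bus_bits != 0 && bus_bits % 8 == 0);
    unsigned bus_bytes = bus_bits / 8;
    return (bytes + bus_bytes - 1) / bus_bytes;   /* round up */
}

/* Number of back-to-back DRAM bursts needed for `beats` at burst length `bl`. */
unsigned bursts_for(unsigned beats, unsigned bl)
{
    assert(bl != 0);
    return (beats + bl - 1) / bl;                 /* round up */
}
```

For example, `beats_for(16, 16)` is 8, and `bursts_for(8, 4)` is 2, which is the back-to-back case described above. With a burst length of 8, the same fill fits in a single burst.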

chrispage
Contributor II

Tom, Thank you for sending the information for me. 

The same exact binary code that enables the L2 cache works on the tower kit with the DDR3 but does not work on our board with LPDDR2.  Once the L2 enable bit is set, it looks as if the memory controller stops working, but only for the areas designated to be cached.  I am not able to access these external memory locations (reading with the debugger returns all 0x00000000 and writing fails), but I am able to read/write the areas of memory that are not designated to be cached.  I implemented a small memory test with the stm and ldm instructions as Bill mentioned, and the memory test works using my LPDDR2 configuration, but the problem with L2 still occurs with that same configuration.  Is there any good way to emulate what occurs with the DRAM when the L2 enable bit is set?  Also, are there any specific DRAM configurations I should be focusing on that would be hit the hardest when the L2 cache is enabled?

Thanks,

Chris

naoumgitnik
Senior Contributor V

Hello Everybody,

Based on the information "from the side", unlike on the Tower kit, quite possibly the Vybrid part being used is of the type having the L2 cache disabled (refer to the Vybrid documentation).

Regards, Naoum Gitnik.

GordyCarlson
NXP Employee

Expanding on Naoum's response with more detail for the benefit of other Vybrid users who may find this thread: the Vybrid Tower board uses a version that has an L2 cache.  However, we just discovered that a Vybrid version without L2 cache was inadvertently provided to the customer, and that was the device they fabricated onto their target board.  Initial debugging showed their test code working on a DDR3-based Tower board and not working on an LPDDR2-based target board.  We all focused on the LPDDR settings as the most likely culprit until the customer noticed that the memory failures were all in the LPDDR2 block that was supposed to be cached.  Their Vybrid code was initialized to enable cache but, unknown to us until now, there was no L2 in the chip on their board.

Here is table 2.4 from the Vybrid data sheet.  If you need L2 cache, use the MVF51 or MVF61 derivatives; the MVF50 and MVF60 derivatives do not have L2 cache.

Part Number      Package                 Description
MVF30NN151CKU26  LQFP-EP 176 24*24*1.6   A5-266, No Security, 176LQFP
MVF30NS151CKU26  LQFP-EP 176 24*24*1.6   A5-266, Security, 176LQFP
MVF50NN151CMK40  MAP 364 17*17*1.5 P0.8  A5-400, No Security, 364BGA
MVF50NS151CMK40  MAP 364 17*17*1.5 P0.8  A5-400, Security, 364BGA
MVF50NN151CMK50  MAP 364 17*17*1.5 P0.8  A5-500, No Security, 364BGA
MVF50NS151CMK50  MAP 364 17*17*1.5 P0.8  A5-500, Security, 364BGA
MVF51NN151CMK50  MAP 364 17*17*1.5 P0.8  A5-500, L2 Cache, No Security, 364BGA
MVF51NS151CMK50  MAP 364 17*17*1.5 P0.8  A5-500, L2 Cache, Security, 364BGA
MVF60NN151CMK40  MAP 364 17*17*1.5 P0.8  A5-400, M4, No Security, 364BGA
MVF60NS151CMK40  MAP 364 17*17*1.5 P0.8  A5-400, M4, Security, 364BGA
MVF60NN151CMK50  MAP 364 17*17*1.5 P0.8  A5-500, M4, No Security, 364BGA
MVF60NS151CMK50  MAP 364 17*17*1.5 P0.8  A5-500, M4, Security, 364BGA
MVF61NN151CMK50  MAP 364 17*17*1.5 P0.8  A5-500, M4, L2 Cache, No Security, 364BGA
MVF61NS151CMK50  MAP 364 17*17*1.5 P0.8  A5-500, M4, L2 Cache, Security, 364BGA
MVF62NN151CMK40  MAP 364 17*17*1.5 P0.8  A5-400, M4 Primary, No Security, 364BGA

We are now providing Vybrid samples that HAVE cache,  and expect that will resolve the issue.  The customer will mark this thread "Answered" when that is tested and confirmed with the new parts.
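As a sanity check before debugging cache problems, the rule stated with the table (MVF51 and MVF61 have L2 cache, the others listed do not) could be encoded in a small helper. This is a sketch only: `part_has_l2_cache` is a hypothetical name, and the prefix rule is an assumption drawn solely from the part numbers quoted in this post, so it may not hold for parts outside that table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns true if the orderable part number belongs to a family that the
 * quoted table lists with "L2 Cache" (MVF51xxx or MVF61xxx).
 * Assumption: the first five characters identify the family.
 */
bool part_has_l2_cache(const char *part_number)
{
    assert(part_number != NULL);
    return strncmp(part_number, "MVF51", 5) == 0 ||
           strncmp(part_number, "MVF61", 5) == 0;
}
```

With the table above, `part_has_l2_cache("MVF61NN151CMK50")` is true and `part_has_l2_cache("MVF50NS151CMK40")` is false. The check reads the marking on the package or the bill of materials; it does not probe the silicon.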

chrispage
Contributor II

Tom,

Thank you for posting this for me.  It looks like our IT department was blocking this forum, but it should hopefully be fixed now.

-Chris

naoumgitnik
Senior Contributor V

Hello Chris,

I am a bit confused by your last email ("I have also implemented the test code that Bill mentioned using stm and ldm, and they both seem to work.  I feel as though this may be a timing issue with reads from memory into the cache...") - is the issue resolved or not yet? Could you clarify, please?

Regards, Naoum Gitnik.

tomsaluzzo
Contributor III

I think the Engineer may be away for the Holiday so I am replying on his behalf with this information he provided.

This only happens after enabling the L2 cache. When only the L1 cache is enabled, I don't see this problem. I have a boot loader that comes up, configures the memory, and then boots into a different app. This separate app runs an OS that enables the cache (this code is flushing and looks to be setting up the cache properly). I am able to run this app on the tower development kit and it works. I step through the L2 cache initialization. I am able to connect with the debugger and view the external memory that holds the code being executed. When I put this same app on our board with LPDDR2, I am able to get to the L2 cache initialization, but once the L2 enable bit is set, all the memory looks to “change” to 0x00000000. It looks as though the code fetch from the external memory into the cache is failing and causing problems. Then, since all the instructions look to be 0x00000000, the CPU is basically just executing NOPs (which is why the program counter is just being incremented, but nothing is happening). I was going to try to run my memory test after the L2 cache initialization code, but since I am running from external memory and the processor sees all my code as 0x00000000, my code won’t execute.

I have also implemented the test code that Bill mentioned using stm and ldm and they both seem to work. I feel as though this may be a timing issue with reads from memory into the cache based on the findings I mentioned in the previous email. I am wondering what DRAM controller settings I should focus on based on the previous findings to be able to try to debug this problem.
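The all-zero observation fits how ARM-mode (A32) encodings work: bits [31:28] of every instruction are a condition field, and the word 0x00000000 decodes as `andeq r0, r0, r0`, which leaves r0 unchanged whether or not the condition passes. A short sketch of that decoding follows. The helper names are illustrative only, and the `bx lr` encoding 0xE12FFF1E comes from the ARM architecture, not from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Bits [31:28] of an ARM-mode (A32) instruction hold the condition field. */
unsigned arm_cond(uint32_t insn)
{
    return insn >> 28;
}

/* Mnemonic suffix for the condition field; 0xF is the unconditional space. */
const char *arm_cond_name(uint32_t insn)
{
    static const char *const names[16] = {
        "eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
        "hi", "ls", "ge", "lt", "gt", "le", "al", "uncond"
    };
    return names[arm_cond(insn)];
}

/* Self-check with the two encodings discussed in the thread. */
void check_examples(void)
{
    assert(arm_cond(0xE12FFF1Eu) == 0xE);  /* bx lr: condition AL (always) */
    assert(arm_cond(0x00000000u) == 0x0);  /* andeq r0, r0, r0: condition EQ */
    assert(arm_cond(0x012FFF1Eu) == 0x0);  /* bxeq lr: skipped when Z is clear */
}
```

So a run of zero words makes the program counter advance with no visible effect, and a `bx lr` whose top nibble has been corrupted can silently fall through instead of branching.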

Tom Saluzzo

Field Application Engineer

Arrow Electronics

2165 Brighton-Henrietta Townline Rd

Rochester, NY 14623

585.820.2781

tsaluzzo@arrow.com

www.arrow.com

naoumgitnik
Senior Contributor V

Hello Chris,

While working on this issue, I got the following advice from one of my colleagues who is more of a DDR/Vybrid expert than I am:

"I'd recommend trying to do some memory tests using the DMA to read the memory with the SSIZE set to 32-byte.  That way you make sure that you are using the full burst of data from the DDR and getting back the expected results. Hope it helps."

Regards, Naoum Gitnik.

chrispage
Contributor II

I will try this.  I noticed there is a DMA test in the example code that has been posted, but it uses a file, dma.h, to help set up the DMA.  Do you have a dma.h that you could post so I don't have to recreate it?

Thanks,

Chris

juangutierrez
NXP Employee

Hi Chris

Find attached the dma.h file

GordyCarlson
NXP Employee

I'm the local Freescale FAE supporting this customer. I went onsite to the customer this week. Here is additional background info.  I sent this to Naoum and Jiri via email; they requested that it be posted within the Community thread for continuity.

- Customer runs its LPDDR at 300MHz.  They see memory failures and system lockup whenever a stack pop/push occurs when executing from LPDDR.

- These failures do not occur when they execute code or tests from internal Vybrid memory.

- We (local Arrow DFAE and I) advised the customer to relocate their stack and re-test.  It fails regardless of whether the stack is internal or external in LPDDR when executing from LPDDR.  It works fine when the stack is internal OR external if executing from internal memory.

- Jiri/Naoum provided LPDDR settings for 200MHz, and also from our 400MHz LPDDR validation board.  Customer is studying those for differences from their settings.

- Is there a difference in how a stack instruction cycle works in LPDDR from a memory system handshake point of view?  That’s the only time it fails.  I don’t think so, but it seems unusual that all other instructions execute fine from LPDDR and moving the stack internal or external doesn’t cause a failure; only the external stack pop/push instruction execution does.

Actions requested

--------------------------
** Customer asks if we can provide the complete codebase for the validation code, i.e., PLL settings/general init files, as perhaps their problem is in a system setting they overlooked that is outside the LPDDR init values.

** Customer asks if we have run Linux from LPDDR memory, or did we just run memory tests (executing from internal memory) on our LPDDR validation board?

** Can you provide schematics for our LPDDR validation board?  I have some from a couple of years ago, not sure if they are the current rev.  We will compare this to the customer's LPDDR schematic.
