We have an iMX287 based platform, heavily inspired by the iMX28 EVK. One DDR2 memory, ISSI, IS43DR16640B-25DBLI.
I have spent the last 4 weeks chasing a problem in this platform. It would be a very, very long story to cover everything that has been done.
This is the summary:
5V only, no battery. Hardware exactly according to the PMU White paper for 5V only operation.
Originally we ran uBoot + Yocto Linux.
We also tested the "latest" LTIB with Freescale bootlets instead of uBoot.
The platform has worked for 1 year during development without any problems at all. Then, when we tested the platform in our temperature chamber at -20 degrees Celsius, we observed the following:
*With uBoot based firmware: SPL boot works fine, but when all the "relocating" stuff runs, the entire power system shuts down. The 4P2 turns off and that's it. Reset doesn't work, only a 5V power cycle and a cold start.
*With bootlets + LTIB: initial startup works OK, uncompressing the kernel works OK, but exactly the same power shutdown occurs just after the "Starting kernel" console message appears. The problem is 100% repeatable at -20 degrees C. At room temperature the kernel boots OK in 99.99999% of all cases.
We have spent days/weeks checking DDR2 settings and tested different DDR2 settings, with no change at all. We have logged ALL supply rails, both the 5V and the iMX-generated ones, with a 50MHz analogue data logger. They look ultra clean, no glitches, ripple or anything. It appears as if the PMU simply turns off power and dies.
10+ boards tested with exactly the same behaviour, different iMX287 batches also.
I have our boards stripped down to contain only iMX287, DDR2, decoupling caps, console terminal and USB connector for sb-loading. No other stuff on the board.
I have checked the current waveforms on the 5V feeding rail to the iMX with a 50MHz LEM probe. No "strange" spikes or anything before the shutdown. It simply shuts down and dies.
One interesting observation was made last week: if I test the board at room temperature, everything appears to work OK. But when I add a small 51 ohm resistor as a load on the 1.8V DDR2 voltage rail, I get 100% boot failure as described above. The same happens if I add the load on the 3.3V rail.
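For a feel of how small that extra load is, Ohm's law gives the current drawn by a 51 ohm resistor on each rail. This is only a sketch and assumes the resistor was fitted from the rail to ground, which the post does not state; `load_current_ma` is a name made up for illustration.

```c
#include <assert.h>

/* Current in mA drawn by a resistor connected from a rail to ground.
 * ASSUMPTION: the 51 ohm resistor is a plain rail-to-ground load. */
static double load_current_ma(double rail_volts, double resistor_ohms)
{
    assert(resistor_ohms > 0.0);
    return rail_volts / resistor_ohms * 1000.0;
}
```

With 51 ohm this works out to about 35 mA on the 1.8V rail and about 65 mA on the 3.3V rail.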
I believe I have tried disabling all brownouts, maxing out current limiters etc., but with no improvement. We have been all over the place with theories, memory failures and PCB issues, but everything appears to check out OK.
I can't understand what is causing this. How can I narrow down what causes the PMU to make the decision to shut down? I'm completely running out of ideas.
All feedback is more than welcome.
Thanks,
Martin Voss
Hi Martin and Lorenzo,
Did you get any further with your investigations? I am looking into a rare situation where the iMX28 may be going into an odd reset state (PMU in its reset defaults from what I can tell).
Any info you can share on your issue (and if you have resolved it) would be useful.
Thank you in advance, Mark
Hi Martin & Igor,
We have exactly the same problem with a platform based on the i.MX287.
It is unstable, and at low temperature it does not even boot.
Did you reach any conclusion from your tests?
We have changed all the DDR2 parameters following the Freescale configuration spreadsheet, changed the NAND type and adjusted the capacitance on the rails, but without any success.
Thanks and regards
Lorenzo
I have attached the simplest case possible showing the problem we struggle with on the EVK.
Basically, an infinite loop that writes to the debug UART.
If we set the POWER bit in the PLL0 Control register, the CPU starts to misbehave and will stop executing code after a few 100ms.
If we move the UART write infinite loop before the write to PLL0 Control registers, it executes perfectly and without any issues, although at a very reduced speed I believe.
Everything tested on official iMX287 EVK.
I can't understand how this can happen, and have very few ideas left to test. And since our original problem is a mystery stability issue, I suspect this has something to do with it. That's why I am so eager to solve this...
All ideas welcome...
Regards,
Martin
Hi Martin
I am not too knowledgeable in this area but I noticed the RM mentions this :
"The requirement is that the roots of the clock are configured and stable before the
elements higher up in the tree are programmed. This will allow the roots to stabilize
before selected as a valid source to drive a clock trunk/tree. If this sequence is not
honored, unpredictable frequencies can occur...."
Likewise, later in setting the power bit :
"HW_CLKCTRL_PLL0CTRL0_WR(BF_CLKCTRL_PLL0CTRL0_POWER(1));
// enable PLL and wait 10 us to let PLL0 lock before using it"
and for locking with HW_CLKCTRL_PLL0CTRL1 :
"The lock count is driven off of xtal. So after the PLL0 is powered on, the PLL0 Lock
should be asserted after 50 us."
So I wonder if you added the lock and the required delay, would this make any
difference in your test results.
Best regards
Sinan Akman
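A minimal sketch of the sequence suggested above: set POWER, wait the quoted 50 us, then poll the lock status with a bounded timeout before any clock is switched to the PLL. The POWER bit (17) and the 0x80040000 address come from this thread. The HW_CLKCTRL_PLL0CTRL1 address (0x80040010) and the LOCK bit position (31) are assumptions to check against the RM. `pll0_hw_t`, `pll0_power_up` and `delay_us_stub` are hypothetical names, not part of mem_test or any Freescale header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PLL0CTRL0_POWER (1u << 17) /* POWER bit, 0x00020000 as written in the thread */
#define PLL0CTRL1_LOCK  (1u << 31) /* ASSUMPTION: LOCK status bit, verify in the RM */

/* Hypothetical bundle of register pointers and a delay hook, so the same
 * sequence can run on the target or against plain variables on a host. */
typedef struct {
    volatile uint32_t *pll0ctrl0;   /* 0x80040000 on target, per the thread */
    volatile uint32_t *pll0ctrl1;   /* ASSUMPTION: 0x80040010 on target */
    void (*delay_us)(uint32_t us);  /* board-specific microsecond delay */
} pll0_hw_t;

/* Power up PLL0, wait 50 us, then poll LOCK up to max_polls times.
 * Returns true once LOCK is seen; only then should a clock be moved to the PLL. */
static bool pll0_power_up(const pll0_hw_t *hw, uint32_t max_polls)
{
    assert(hw && hw->pll0ctrl0 && hw->pll0ctrl1 && hw->delay_us);

    *hw->pll0ctrl0 |= PLL0CTRL0_POWER;
    hw->delay_us(50);

    for (uint32_t i = 0; i < max_polls; i++) {
        if (*hw->pll0ctrl1 & PLL0CTRL1_LOCK)
            return true;
        hw->delay_us(1);
    }
    return false;
}

/* Placeholder delay for host-side experiments only; it is NOT calibrated.
 * Replace it with a timer-based delay on the target. */
static void delay_us_stub(uint32_t us)
{
    for (volatile uint32_t i = 0; i < us; i++) { }
}
```

Returning false instead of spinning forever means the caller can print a message on the debug UART if the PLL never reports lock, which would itself be a useful data point here.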
Hi Martin
There is an OBDS bare-metal test that was tested on the EVK; please try it.
Lab and Test Software (1)
On-Board Diagnostic Suit for the i.MX28
http://www.nxp.com/products/software-and-tools/software-development-tools/i.mx-software-and-tools/i....
Note that LI-ION_BATTERY is imitated by BATT_REG, produced from 5V, in the
i.MX28 EVK schematic (SCH-26241).
For a "5V only" configuration, use the attached example schematic together with the
Linux patch for i.MX28 SDK 2010.12 that adds the VDD5V-only configuration.
The programming sequence to go from a clock referenced from the xtal clock to the PLL is outlined in
sect. 10.6 CPU and EMI Clock Programming, i.MX28 RM:
http://cache.freescale.com/files/dsp/doc/ref_manual/MCIMX28RM.pdf
Regarding the "POWER bit in the PLL0" issue, one can try running the test from internal RAM to check
whether it is caused by a brown-out condition or a power supply issue. It may also be useful to run in the "Battery only"
configuration (power provided from the battery).
Best regards
igor
Hi,
I will test the Linux images on the EVK. But I'm sure they will boot.
But still, the stability issue we see with the simple bare metal code must be understood. Why does the iMX crash and burn early in boot, immediately after this statement:
//****************************
// CLOCK set up
//****************************
// Power up PLL0 HW_CLKCTRL_PLL0CTRL0
(*(R32)( 0x80040000)) = 0x00020000 ;
I think it will be important to understand that.
Regards,
Martin
For the moment we focus only on getting the most basic program running on the EVK. We can skip DDR2 test completely. As mentioned above, the EVK becomes unstable and crashes when the following statement is executed:
//****************************
// CLOCK set up
//****************************
// Power up PLL0 HW_CLKCTRL_PLL0CTRL0
(*(R32)( 0x80040000)) = 0x00020000 ;
I use a very simple UART write test to debug; DDR2 memory is not involved in any way.
/Martin
Could you try the i.MX28 EVK Demo Images?
~igor
Hi,
Sorry, I forgot to mention: room temperature only, until we get that stable.
- I.e. standard EVK
- Wall adapter supply shipped with EVK
- 21 degrees ambient
Do you know which platform the mem_test has been successfully run on when it was developed?
Thanks,
Martin
Hi Martin
One can start with the simpler memory test included in the i.MX28 EVK OBDS (drivers/ddr/ddr_test.c)
Lab and Test Software (1)
On-Board Diagnostic Suit for the i.MX28
http://www.nxp.com/products/software-and-tools/software-development-tools/i.mx-software-and-tools/i....
then add additional tests one by one from mem_test.7z.zip
Best regards
igor
Update after today's work 2017-05-30:
All focus is still on the EVK + mem_test application. No work done on our own hardware until EVK is stable in all aspects.
Tried changing drive strength on the iMX side, no improvement at all.
I have reduced the test to a VERY simple main loop:
int main ()
{
prep() ;
while(1)
{
outbyte(0x55);
}
.
.
.
I then hooked up an oscilloscope to the debug UART TX pin. Since outbyte() blocks until the UART is ready, I would expect a continuous back-to-back stream of 0x55 on the scope.
I can see lots of 0x55 transmitted on the TX pin for a certain amount of time, perhaps a few milliseconds, then it gets more and more sporadic until all activity stops on the TX pin and the iMX seems dead in the water. Happens EVERY time. No idea what the CPU does when this happens.
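As a side note on why 0x55 is a convenient scope pattern: assuming the usual 8N1 framing (the UART settings are not given in the post), every bit of the frame differs from the one before it, so a true back-to-back stream is a clean square wave at half the baud rate and any gap between bytes stands out immediately. A small sketch that builds the frame and checks this; `uart_frame_8n1` and `frame_alternates` are names made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FRAME_BITS 10 /* 1 start + 8 data + 1 stop, ASSUMING 8N1 framing */

/* Fill bits[] with the line levels of one 8N1 frame in transmit order:
 * start bit (0), data bits LSB first, stop bit (1). */
static void uart_frame_8n1(uint8_t byte, uint8_t bits[FRAME_BITS])
{
    assert(bits != 0);
    bits[0] = 0;
    for (int i = 0; i < 8; i++)
        bits[1 + i] = (uint8_t)((byte >> i) & 1u);
    bits[9] = 1;
}

/* Return 1 if every bit in the frame differs from the previous one. */
static int frame_alternates(const uint8_t bits[FRAME_BITS])
{
    for (int i = 1; i < FRAME_BITS; i++)
        if (bits[i] == bits[i - 1])
            return 0;
    return 1;
}
```

For 0x55 the frame is 0,1,0,1,0,1,0,1,0,1, and the stop bit (1) is followed by the next start bit (0), so the alternation continues across byte boundaries.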
Further testing shows that if I modify prep() so I return back to main BEFORE the below statement that writes to 0x80040000 is executed
void prep()
{
//****************************
// VDDD setting
//****************************
//set VDDD =1.55V =(0.8v + TRIG x 0.025v), TRIG=0x1e
(*(R32)( 0x80044010)) = 0x0003F503 ;
(*(R32)( 0x80044040)) = 0x0002041E ;
//****************************
// CLOCK set up
//****************************
// Power up PLL0 HW_CLKCTRL_PLL0CTRL0
----> return; <---- ADDED THIS RETURN HERE EVERYTHING WORKS
(*(R32)( 0x80040000)) = 0x00020000 ;
// Set up fractional dividers for CPU and EMI - HW_CLKCTRL_FRAC0
// EMI - first set DIV_EMI to div-by-3 before programming frac divider
(*(R32)( 0x800400F0)) = 0x80000003 ;
the TX output is PERFECTLY stable, all the time, just as I would expect. But I assume the CPU runs at 24MHz then.
If I put the return statement AFTER the write to 0x80040000 everything goes haywire as described above, the iMX spits and stumbles until it dies in some strange way.
void prep()
{
//****************************
// VDDD setting
//****************************
//set VDDD =1.55V =(0.8v + TRIG x 0.025v), TRIG=0x1e
(*(R32)( 0x80044010)) = 0x0003F503 ;
(*(R32)( 0x80044040)) = 0x0002041E ;
//****************************
// CLOCK set up
//****************************
// Power up PLL0 HW_CLKCTRL_PLL0CTRL0
(*(R32)( 0x80040000)) = 0x00020000 ;
----> return; <---- ADDED THIS RETURN HERE CAUSES CRASH after some time
// Set up fractional dividers for CPU and EMI - HW_CLKCTRL_FRAC0
// EMI - first set DIV_EMI to div-by-3 before programming frac divider
(*(R32)( 0x800400F0)) = 0x80000003 ;
So, something is very wrong when the POWER bit (17) is set. This causes a fatal system instability that leads to a crash.
As mentioned earlier, stock EVK and stock mem_test application code.
Regards,
Martin
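The VDDD comment in prep() can be checked with a couple of lines. This sketch only uses the formula from that comment (0.8 V + TRIG x 0.025 V) and assumes that TRIG occupies the low 5 bits of the value written to 0x80044040, which matches the trailing 0x1E but should be confirmed in the RM. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define VDDD_TRG_MASK 0x1Fu /* ASSUMPTION: TRIG is the low 5 bits of the register value */
#define VDDD_BASE_MV  800u  /* 0.8 V base, from the comment in prep() */
#define VDDD_STEP_MV  25u   /* 0.025 V per step, from the comment in prep() */

/* Decode the VDDD target in millivolts from a register value. */
static uint32_t vddd_target_mv(uint32_t reg_value)
{
    uint32_t trg = reg_value & VDDD_TRG_MASK;
    return VDDD_BASE_MV + trg * VDDD_STEP_MV;
}

/* Inverse: TRIG code for a requested voltage that lies exactly on a 25 mV step. */
static uint32_t vddd_trg_for_mv(uint32_t millivolts)
{
    assert(millivolts >= VDDD_BASE_MV);
    assert((millivolts - VDDD_BASE_MV) % VDDD_STEP_MV == 0);
    uint32_t trg = (millivolts - VDDD_BASE_MV) / VDDD_STEP_MV;
    assert(trg <= VDDD_TRG_MASK);
    return trg;
}
```

For the value 0x0002041E used in prep(), this gives 800 + 30 x 25 = 1550 mV, in line with the 1.55 V stated in the comment.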
What temperature is used for testing the i.MX28 EVK? I am not sure it was tested at lower temperatures.
mem_test.7z.zip is not an official test; it was referenced just as a starting point for developing custom DDR tests.
For testing the EVK, please try the Demo Images at room temperature.
Documentation included in
For help bringing up non-standard designs, such as -40C operation, it may be advisable to contact NXP Professional Services:
http://www.nxp.com/support/nxp-professional-services:PROFESSIONAL-SERVICE
Best regards
igor
Update 2017-05-29:
Still working hard on this issue. To revert to what we think is a KNOWN working solution, we have 2 pcs. of iMX28 EVK Rev. D boards from Freescale in the lab, running from an external wall adapter.
When I sb-load the prebuilt sb file found in mem_test.7z.zip we unfortunately see the same console output as on our own hardware, i.e. simple memory test pass, and then it hangs forever inside the t0: memcpy11 SSN test.
I installed the toolchain to be able to modify the mem test application and added printf() debug printouts: one printf("Start") just after the printf("Program the source memory \n"); statement in the DDRtest_memcpy11_SSN.c file, i.e. just before entering the for() loop that writes walking 1's, and one printf("End") immediately AFTER the for() loop that writes the walking 1's. Then I hang the CPU with a while(1);
When I test this (on the official Freescale EVK) I see the following:
- Out of 20 tests, I only get "End" printed out once. I.e. in 19 cases I never reach the last printf. No idea why. But it explains why the t0 test never completes.
- If I remove the actual writes, i.e. the *ps=.... statements, from the loop, the "End" printf() is reached every time, but in 19 times out of 20 it takes around 4-5 seconds, which I think is illogical for a 454MHz CPU running a simple for() loop. In 1 out of 20 times, the "End" is printed out instantly after the "Start", which is more what I would expect. It's as if the CPU is falling out of PLL lock and running at 24MHz only, or something like that, in the other cases. Very strange.
- To add to the fragile feeling this platform has, I have noticed that if I sb-load and run my test once until I reach the while(1); and then spray the 24MHz crystal on the EVK with just the SMALLEST TINY amount of cold spray, the CPU does a reset and waits to be sb-loaded again. I must stress a TINY amount, like a 1ms squirt. Not a good sign.
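The 24MHz suspicion can be sanity-checked with simple scaling. This is only a rough sketch that assumes the loop is purely CPU-bound, so run time is inversely proportional to the CPU clock. It ignores bus and memory clocks, and `scale_runtime_s` is a name made up for illustration.

```c
#include <assert.h>

/* Scale a measured run time from one CPU clock to another.
 * ASSUMPTION: the loop is CPU-bound, so time scales as 1/frequency. */
static double scale_runtime_s(double measured_s, double measured_mhz, double target_mhz)
{
    assert(measured_mhz > 0.0 && target_mhz > 0.0);
    return measured_s * measured_mhz / target_mhz;
}
```

Under that assumption, a loop that takes 4-5 seconds at 24MHz would take roughly a quarter of a second at 454MHz (a factor of about 19), which would look close to "instant" on a console. So the two observed timings are at least consistent with the CPU sometimes running from the crystal instead of the PLL.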
So, this is where I am right now. I really do not know what to test or do next. I will get a brand new EVK in tomorrow or on Wednesday to rule out having two bad EVKs, but this is an ultra long shot. I bet the new EVK will behave identical.
Our own hardware platform will have to wait until we can run a reliable memory test REPEATABLY on the official Freescale iMX287 platform, both at -40 degrees in our climate chamber and at elevated temperature. I want to see this working 1000 times in a row without one single fail. Until that is stable, I think we are wasting our time chasing things in our own platform, which is heavily influenced by the EVK.
All ideas are very, very much appreciated.
Regards,
Martin
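The instrumentation described in the 2017-05-29 update (a print before and after the loop that writes walking 1's) can be sketched roughly as below. This is NOT the code from DDRtest_memcpy11_SSN.c, which is not shown in the thread; it is a generic walking-1 write and read-back with hypothetical function names, and printf() stands in for whatever debug UART print the target build uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Write a walking-1 pattern (1, 2, 4, ..., 0x80000000, 1, ...) into buf,
 * with a marker printed before and after the loop. */
static void write_walking_ones(volatile uint32_t *buf, size_t words)
{
    assert(buf != NULL);
    printf("Start\n");
    for (size_t i = 0; i < words; i++)
        buf[i] = (uint32_t)1 << (i % 32);
    printf("End\n");
}

/* Read the pattern back; return the index of the first mismatch,
 * or words if everything matches. */
static size_t verify_walking_ones(const volatile uint32_t *buf, size_t words)
{
    assert(buf != NULL);
    for (size_t i = 0; i < words; i++)
        if (buf[i] != ((uint32_t)1 << (i % 32)))
            return i;
    return words;
}
```

Running the same loop first on a small buffer in on-chip RAM and then on DDR2 would help separate a memory problem from a clock or power problem, since the loop itself is identical in both cases.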
Hi Martin
Have you tried changing the drive strength? It can be changed on both the
i.MX28 and the DDR side. Please check sect. 9.2.2.1 Pin Drive Strength Selection,
i.MX28 Reference Manual
http://cache.freescale.com/files/dsp/doc/ref_manual/MCIMX28RM.pdf
Best regards
igor
Hi Martin
Drive strength increases at low temperatures, so the memory may show
faults; the increased noise may also cause a brown-out condition.
It may be worth testing the memory at -20 degrees with the memory test (mem_test.7z.zip)
provided on
Best regards
igor
Hi,
Thanks. I prepared a couple of our boards with the required hardware configuration and a 1-cell LiIon battery and sb-loaded the test.sb file. I get the following console output:
"simple test
DDR test passed
t0: memcpyll SSN test"
And then nothing more. I have tested several boards and aborted the test after 4 hours. I guess it should not take longer than that?
Any ideas why the t0 test gets stuck?
I measured voltages with following results:
Vbatt = 3.64V
VDDIO = 3.17V
VDDA=1.81V
VDDD = 1.55V
Regards,
Martin
The reason may be found by running that test under JTAG
(mem_test.7z.zip has full sources).
Have you used the tool MX28_mDDR_register_programming_aid_v0.4.zip
from the link below to find optimal DDR settings for your custom board?
https://community.freescale.com/docs/DOC-1455
Best regards
igor
Hi Igor,
Unfortunately I don't have access to the JTAG pins on my platform.
I have worked with the MX28_mDDR.... spreadsheet a lot over the last month to test different settings in our original setup using uBoot. But I have not tested this in the DDR2 stress test application yet. I must try that.
What toolchain do you recommend to build the test.sb file from scratch?
Once I can rebuild the tool I can probably figure out what happens and where.
Regards,
Martin