Hi All,
We're having some very confusing problems with an i.MX537 board. Note that we've built thousands of these, most are fine, but every now and then we get a batch that doesn't seem to behave. The one I'm using now is nice because at least it fails pretty consistently; more common is that they fail about one time in twenty, which makes picking up the fault pretty challenging.
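About 90% of the time (from over a thousand test runs on this unit), the behaviour matches the boot log below: the first attempt to start the kernel ends in a data abort, U-Boot resets the CPU, and the second attempt boots normally.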
The remaining 10% of the time, either it locks up when trying to start the kernel the first time, or it does actually boot up on the first attempt. Note that the second attempt to start the kernel has a 100% success rate; it has never taken a third U-Boot attempt.
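For anyone wanting to tally outcomes like these over many captured serial logs, a rough sketch follows. It is not the script we used; the function name and outcome labels are made up for illustration, and it assumes every boot attempt starts with a line beginning "U-Boot " and that a successful kernel start prints "Linux version", as in the log in this post.

```python
from collections import Counter


def classify_boot_log(text: str) -> str:
    """Classify one captured power-cycle log (hypothetical helper).

    Assumes each U-Boot attempt begins with a line starting "U-Boot "
    and that a booted kernel prints a line containing "Linux version".
    """
    attempts = []
    for line in text.splitlines():
        if line.startswith("U-Boot "):
            attempts.append([])        # a new boot attempt begins here
        elif attempts:
            attempts[-1].append(line)  # line belongs to the current attempt
    # The kernel only counts as booted if the last attempt reached it.
    if not attempts or not any("Linux version" in l for l in attempts[-1]):
        return "no_boot"
    return f"boot_on_attempt_{len(attempts)}"


# Two shortened example captures: abort then boot, and a clean first boot.
abort_then_boot = "\n".join([
    "U-Boot 2017.01", "Starting kernel ...", "data abort", "resetting ...",
    "U-Boot 2017.01", "Starting kernel ...", "[ 0.000000] Linux version 2.6.35.3",
])
clean_boot = "\n".join([
    "U-Boot 2017.01", "Starting kernel ...", "[ 0.000000] Linux version 2.6.35.3",
])

tally = Counter(classify_boot_log(log) for log in [abort_then_boot, clean_boot])
print(tally)
```

A lock-up with no reset shows up as "no_boot", since the last attempt never reaches the kernel banner.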
Typical boot log below.
U-Boot 2017.01-00008-g2e9b9a3-dirty (Jul 02 2021 - 14:33:52 +1000)
Board: MX53 LOCO
I2C: ready
DRAM: 1 GiB
i2c: I2C2 SDA is low, start i2c recovery...
I2C2 Recovery success
MMC: FSL_SDHC: 0
In: serial
Out: serial
Err: serial
Net: FEC
Booting in 5 sec. Type #load to abort
Booting from mmc ...
switch to partitions #0, OK
mmc0 is current device
MMC read: dev # 0, block # 2048, count 6144 ... 6144 blocks read: OK
## Booting kernel from Legacy Image at 72000000 ...
Image Name: Linux-2.6.35.3-1129-g691c08a-svn
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2884468 Bytes = 2.8 MiB
Load Address: 70008000
Entry Point: 70008000
Verifying Checksum ... OK
Loading Kernel Image ... OK
Starting kernel ...
data abort
pc : [<aff9a3a4>] lr : [<aff559d4>]
reloc pc : [<778473a4>] lr : [<778029d4>]
sp : af550b30 ip : 00000000 fp : 72000040
r10: 00000000 r9 : af550ed0 r8 : af5531ac
r7 : affa47b0 r6 : 00000000 r5 : aff9a3a7 r4 : 6964616f
r3 : 00000000 r2 : aff5f568 r1 : 72000040 r0 : 0a000023
Flags: Nzcv IRQs off FIQs off Mode SVC_32
Resetting CPU ...
resetting ...
U-Boot 2017.01-00008-g2e9b9a3-dirty (Jul 02 2021 - 14:33:52 +1000)
Board: MX53 LOCO
I2C: ready
DRAM: 1 GiB
MMC: FSL_SDHC: 0
In: serial
Out: serial
Err: serial
Net: FEC
Booting in 5 sec. Type #load to abort
Booting from mmc ...
switch to partitions #0, OK
mmc0 is current device
MMC read: dev # 0, block # 2048, count 6144 ... 6144 blocks read: OK
## Booting kernel from Legacy Image at 72000000 ...
Image Name: Linux-2.6.35.3-1129-g691c08a-svn
Image Type: ARM Linux Kernel Image (uncompressed)
Data Size: 2884468 Bytes = 2.8 MiB
Load Address: 70008000
Entry Point: 70008000
Verifying Checksum ... OK
Loading Kernel Image ... OK
Starting kernel ...
[ 0.000000] Linux version 2.6.35.3-1129-g691c08a-svn27886 (trusty@freescaledev) (gcc version 4.4.4 (4.4.4_09.06.2010) ) #8 PREEMPT Fri May 10 14:08:50 AEST 2019
[ 0.000000] CPU: ARMv7 Processor [412fc085] revision 5 (ARMv7), cr=10c53c7f
... and it works fine from here.
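A note on reading the data abort dump: as far as I understand it, the "reloc pc" and "reloc lr" values are the runtime addresses with U-Boot's relocation offset subtracted, so those are the ones to look up in the u-boot ELF or map file from the same build. The sketch below just does that arithmetic on the values from the log above; the helper names are made up for illustration, and it assumes the other registers of interest point into relocated U-Boot as well.

```python
MASK32 = 0xFFFFFFFF


def reloc_offset(runtime_addr: int, link_addr: int) -> int:
    """Relocation offset implied by one runtime/link-time address pair."""
    return (runtime_addr - link_addr) & MASK32


def to_link_address(runtime_addr: int, offset: int) -> int:
    """Map a runtime address back to its link-time address (hypothetical helper)."""
    return (runtime_addr - offset) & MASK32


# Values copied from the data abort dump in the log.
pc, reloc_pc = 0xAFF9A3A4, 0x778473A4
lr, reloc_lr = 0xAFF559D4, 0x778029D4

offset = reloc_offset(pc, reloc_pc)
# Both pairs should imply the same offset if the dump is self-consistent.
assert offset == reloc_offset(lr, reloc_lr)
print(f"relocation offset: {offset:#010x}")

# Other registers can be mapped back the same way,
# assuming they point into relocated U-Boot (e.g. r7 and r2).
for name, value in [("r7", 0xAFFA47B0), ("r2", 0xAFF5F568)]:
    print(f"{name}: {value:#010x} -> {to_link_address(value, offset):#010x}")
```

The resulting link-time addresses can then be fed to a tool such as addr2line or objdump against the matching u-boot ELF to see which function was executing when the abort occurred.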
Testing we've done, none of which made any difference to the behaviour:
At this point I am at a loss to explain what is going on!
Any ideas for what might be wrong, or ideas for further useful testing, would be very much appreciated.
Thanks!
Hi Evan
For the bad board, you can update the calibration coefficients using the link below,
then run an overnight DDR test (preferably at different temperatures).
also may be useful
If this does not help, I would suggest (just as a test) resoldering the chip to check whether the problem is caused by poor soldering.
Best regards
igor