HDMI connection causes crash on i.MX6 Solo

allanmatthew · ‎02-19-2015

On 50% or so of our custom i.MX6 Solo boards we're seeing a crash when an HDMI monitor is connected. The crash occurs once the system is fully booted and the HDMI cable is inserted. No useful information is printed on the console port, and the crash seems to be very low-level, as the user LED (tied to a heartbeat trigger) stops flashing. Additionally, if an HDMI cable/monitor is already connected at boot, the kernel halts during boot with no useful output (see attached boot log).

Furthermore, I've been able to determine that it's not just an HDMI monitor (of which I've tried many different types) being connected that causes the crash. I can cause the crash simply by pulling the HDMI HPD pin to 5 V.

Lastly, and hopefully this is the smoking gun: Normally we use a kernel with a bundled initramfs which mounts/loads a squashfs and an aufs overlay rootfs. The crash does not occur on boards that are known to fail if I use a kernel that does not have a bundled initramfs, even though the version and defconfig are identical (obviously the boot does not complete, but it fails at the init as expected and I get an image on the HDMI).

My bundled uImage is ~9MB in size. The uImage is loaded to 0x12000000, dtb loaded to 0x18000000. Linux is based on the 3.10.17 GA release, as is the rootfs and initramfs (built in Yocto).
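
As a side check on the load addresses above, the arithmetic can be sketched in a few lines of Python. This assumes "~9MB" means roughly 9 MiB, and `regions_overlap` is just an illustrative helper name. It only looks at the uImage as loaded by U-Boot; where the kernel ends up after relocation or decompression is not covered.

```python
# Sanity check: does the uImage, as loaded by U-Boot, reach the dtb?
# Addresses come from the post; the size is an approximation.

UIMAGE_ADDR = 0x12000000
DTB_ADDR = 0x18000000
UIMAGE_SIZE = 9 * 1024 * 1024  # assumption: "~9MB" taken as 9 MiB


def regions_overlap(start_a, size_a, start_b):
    """Return True if [start_a, start_a + size_a) extends past start_b."""
    return start_a + size_a > start_b


gap = DTB_ADDR - UIMAGE_ADDR
print(f"space between load addresses: {gap // (1024 * 1024)} MiB")
print(f"uImage end: {UIMAGE_ADDR + UIMAGE_SIZE:#x}")
print(f"overlap with dtb: {regions_overlap(UIMAGE_ADDR, UIMAGE_SIZE, DTB_ADDR)}")
```

With these numbers there is plenty of room between the two load addresses, so a plain overlap of the loaded image and the dtb does not look like the explanation.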

So, my questions are:

1.) Why would a bundled initramfs cause this failure? When no HDMI is connected it seems to operate just fine.

2.) Why does this only occur on 50% of our boards? Boards from the same batch will either work or fail; there does not seem to be a common hardware issue.

Thanks,

-Allan

fabio_estevam · ‎02-20-2015

Allan,

I remember seeing hang issues when connecting an HDMI cable with 3.10.17.

Can you try a mainline kernel, such as 3.19? I never saw such an HDMI hang in mainline.

If I recall correctly, the hang did not happen if the IPU was not used in U-Boot. Do you have the splash screen enabled in U-Boot?

fabio_estevam · ‎02-20-2015

Please check this thread:

https://lists.yoctoproject.org/pipermail/meta-freescale/2014-July/009434.html

sinanakman · ‎02-22-2015

Hi Fabio

Thanks for bringing your experience to this and helping to set the right direction. Allan, I am sorry if I misled you towards potential memory problems. I am glad we now have a solution, but it would be nice to know the real reason and cause of the problem.

Best regards

Sinan Akman

dlschaeffer · ‎07-17-2015

Agreed, I've worked around this issue on some of our boards but it would be nice to know why it doesn't affect all of them.

The root cause appears to be an HDMI PHY Frame Composer Overflow interrupt storm in Linux when HDMI has already been enabled by u-boot. The rate of the interrupt seems to vary with temperature on some of my boards. If the rate is slow enough, the kernel will recover, fix the interrupt and boot.
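
To illustrate why the interrupt rate would decide between a recoverable slowdown and a hard hang, here is a toy model in Python. The handler cost and the rates are made-up numbers, not measurements from the i.MX6, and `cpu_share` is a hypothetical helper. The only point is that once rate × handler time approaches 1, a single core (as on the Solo) has no time left for anything else, including a heartbeat LED.

```python
# Toy model of an interrupt storm on a single-core CPU.
# All numbers are illustrative assumptions, not measured values.

def cpu_share(irq_rate_hz, handler_time_s):
    """Fraction of one core spent in the interrupt handler (capped at 1.0)."""
    return min(1.0, irq_rate_hz * handler_time_s)


HANDLER_TIME_S = 5e-6  # assumed cost of one handler invocation: 5 microseconds

for rate in (10_000, 100_000, 500_000):  # hypothetical interrupt rates in Hz
    share = cpu_share(rate, HANDLER_TIME_S)
    state = "starved, looks like a hang" if share >= 1.0 else "still makes progress"
    print(f"{rate:>7} irq/s -> {share:.0%} of the core in the handler ({state})")
```

Under this simplified view, a small change in rate near the limit (for example with temperature) would be enough to move a board from "boots slowly" to "never recovers".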

rostyslavkhudol · ‎07-12-2018

Hi Daniel,

We're facing the same issue. Could you please share the workarounds you had to apply?

allanmatthew · ‎02-20-2015

Wow, that seemed to do it! Thanks Fabio!

Any word on why having HDMI enabled in u-boot would cause a crash on some boards but not others?

igorpadykov · ‎02-22-2015

Hi Allan

https://bugzilla.yoctoproject.org/show_bug.cgi?id=6703

Best regards

igor

sinanakman · ‎02-19-2015

Hi Allan

This might seem unrelated, but perhaps you have a RAM issue that shows itself when you stress it with your ramdisk usage. I never saw what you are reporting, but on the boards I worked on that had DDR issues (either setup or signal integrity), the problem would only show up with an NFS root, as this was triggering burst mode. So I wonder whether, instead of your ramdisk, you could test those boards (or some others that look healthier) with an NFS root file system. If you then see similar results, I would recommend focusing on a potential memory issue. Again, this is perhaps not related, but I would recommend giving it a try.

Hope this helps

Sinan Akman

allanmatthew · ‎02-19-2015

Hi Sinan-

Thanks for the suggestion. I tested the offending board with the DDR3 Stress Tester using the calibration values I previously obtained and put in U-boot, and the board passed 100% over a couple hours of testing.

Also, mtest in U-boot seemed to pass easily over a range of memory values.

Do you think that's sufficient for testing memory, or do I need to run something like stressapptest? Unfortunately running from an NFS root is tough as we don't have an Ethernet connection, just WiFi.

-Allan

sinanakman · ‎02-19-2015

Hi Allan

There was at least one case on a customer board where the memory tester didn't catch the problem, but this was a couple of years ago and some tester programs might have improved since. What matters is whether the test triggers burst mode, and AFAIK the U-Boot mtest is a simple pattern write/read-back test. I don't remember if the Stress Tester does anything in that direction; perhaps it also only writes/reads circulating data and address patterns. This is something you can maybe verify.

As for only having WiFi, you probably already thought of this, but if your WiFi is PCIe based, would it be possible to use a PCIe Ethernet card instead?

Also, if we were to consider a possible memory issue, I wonder if you could go over the design layout recommendations and verify them against your board. Likewise, does running your memory at a slower speed or relaxing some of the calibration values make any difference? This would at least confirm whether this is the right direction to take.
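
For reference, the kind of simple pattern write/read-back test attributed to mtest above boils down to something like the following Python sketch. It runs against a plain bytearray rather than real DDR and is not U-Boot's actual mtest code; `pattern_test` and the pattern values are chosen purely for illustration. It only shows the shape of such a test, which touches one word at a time and so says little about behaviour under sustained burst traffic.

```python
# Sketch of a pattern write/read-back memory test over a buffer.
# Illustration only: operates on a bytearray, not on physical DDR.
import struct


def pattern_test(buf, patterns=(0x00000000, 0xFFFFFFFF, 0xAAAAAAAA, 0x55555555)):
    """Write each 32-bit pattern to every word, read it back, return mismatches."""
    errors = []
    words = len(buf) // 4
    for pattern in patterns:
        # Write pass: fill the whole buffer with the current pattern.
        for i in range(words):
            struct.pack_into("<I", buf, i * 4, pattern)
        # Read-back pass: record (offset, expected, actual) for any mismatch.
        for i in range(words):
            (value,) = struct.unpack_from("<I", buf, i * 4)
            if value != pattern:
                errors.append((i * 4, pattern, value))
    return errors


memory = bytearray(4096)  # stand-in for a region of RAM
print("mismatches:", len(pattern_test(memory)))
```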

Regards

Sinan Akman

allanmatthew · ‎02-20-2015

Hi Sinan-

I ran memtester and was unable to generate any errors after a few iterations of testing. I'm currently working on getting stressapptest into my build and up and running; hopefully that yields something different.

Regarding the layout, we did have FSL verify it and I know it meets all the design recommendations. In terms of speed, I'm currently running at 400 MHz.

Unfortunately the NFS mount isn't much of an option in the near future, as I'll need to order the adapter. In the meantime, can you think of other ways I might be able to trigger burst mode?

-Allan

sinanakman · ‎02-20-2015

Hi Allan

Unfortunately I don't know of any other practical setup that would trigger burst mode, but if you take a look at the datasheet of your chip it might explain how and when this happens. I understand 400 MHz is the lowest your chip supports? Can you change some of the timing values for longer delays etc., according to the ranges defined in the datasheet? FSL definitely did a good job reviewing your design, but in the past I did see memory problems on designs that had passed a basic layout review. If you have the time and resources, I'd suggest going over the design considerations once yourself. Between this and perhaps modifying controller values towards the relaxed side of the specs, you can identify whether this is the right direction to chase further.

I felt your case might be memory related, but if we step back a moment, you originally mentioned that the problem occurs when you unplug the HDMI cable. Does plugging or unplugging any other interface cause this error? Also, can you scope the HDMI lines to see what is happening when you pull the HDMI cable? Is there any unusual spike or any pattern that you don't see on the working boards? If you do a high-speed capture you might find a hint of an anomaly. You could also scope the DDR lines while you are removing the HDMI cable. When the system freezes, are the memory lines still at a sane value?

Sorry I couldn't help much, but please let me know if there is anything else I can help with.

Regards

Sinan Akman
