Hello,
We are developing a custom board based on the iMX6Q CPU, but we have run into some issues with GPU support in X11 (at least, everything points in that direction in our opinion).
Our setup is:
Problem description:
We see a CPU hang within a few minutes when doing the following things at the same time:
After the hang, JTAG is not available and we see no kernel dump or any other kind of logging, although clocks are still emitted. Additionally, the image on the display slowly fades away and is replaced by a strange random pattern. This should point to some kind of AXI bus hang, right? When doing the two things separately (PCIex usage and displaying the webpage), the issue does not occur.
Our findings so far:
Summary:
Based on those facts, we concluded that displaying said website (using an FF-based web browser) creates some specific series of events in the EXA driver which could trigger buggy behavior. We do realize this might not be 100% accurate, but it is what we have concluded so far. We know that this could be caused by some bug in our software, but we would not expect userspace tooling to be able to crash the device in such a way. Being able to reproduce it on SabreSD should also indicate that this is not really a HW issue. We are currently busy creating a webpage similar to the one on which we see the issue, and we will post it here once it's ready so that other people can reproduce it as well (we cannot share parts of our customer's SW here).
Maybe Freescale has seen such behavior previously and could point us in some direction to look in, or suggest some debugging actions we should take?
Thanks in advance.
Is it possible to reproduce without PCIe?
Can you provide the rootfs so that we can debug the issue? The following environment may help
Hi Prabhu,
Thanks for looking at this issue. It is kind of possible to reproduce without PCIex usage: we saw a similar crash on some devices which had been running for 48+ hours (3 out of 56 devices crashed in the same way), so we assume PCIex usage greatly decreases the time needed to reproduce the issue. I am not 100% sure whether I can provide FSL with our software at the moment; I need to check with my colleagues. As I wrote in my previous post, we are currently working on reproducing the issue with a more generic website which does not contain our customer's software.
Hi,
Let's check whether 2D acceleration causes this hang:
Edit /etc/X11/xorg.conf
In Section "Device"
Add
Option "NoAccel" "true"
If this avoids the HW failure, it will be easier to identify which 2D API causes the issue.
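For reference, the edited Device section might look roughly like the sketch below. Only the NoAccel line comes from the steps above; the Identifier, Driver and fbdev values are assumptions based on a typical i.MX6 BSP xorg.conf and should be left as they already are in your file.

```
Section "Device"
    # Identifier, Driver and fbdev are assumed values; keep whatever your xorg.conf already has
    Identifier  "i.MX Accelerated Framebuffer Device"
    Driver      "vivante"
    Option      "fbdev"     "/dev/fb0"
    # The only added line: turn off 2D acceleration in the X driver
    Option      "NoAccel"   "true"
EndSection
```

Restart X after the change so that the option is picked up.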
Hi Zhenyong,
Thanks for looking at our issue. With NoAccel enabled, the issue no longer seems to occur (tested for only an hour so far, but that is still much more uptime than we see without it). Additionally, we saw that the issue does not seem to occur when X is set to 16-bit mode instead of 24-bit mode. Maybe this points to something as well.
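For anyone who wants to try the 16-bit comparison, one generic way to request a depth in X.Org is DefaultDepth in the Screen section, sketched below. This is not necessarily how the depth was switched in the test above: the Identifier and Device names are hypothetical, and on some setups the framebuffer depth itself may also have to be changed for the setting to take effect.

```
Section "Screen"
    # Hypothetical names; Device must match the Identifier of your Device section
    Identifier    "Default Screen"
    Device        "i.MX Accelerated Framebuffer Device"
    # Ask X for 16 bits per pixel instead of 24
    DefaultDepth  16
EndSection
```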