Hi community,
We are using a custom board based on the SabreSD design, with an i.MX6 processor.
Kernel: 4.1.15
BSP: Yocto Krogoth
Since we received a new batch of PCBs, we have often been seeing various kernel oops faults (examples attached). They seem to be triggered by a different process each time; the common factor is that every crash happens during virtual memory access / page reallocation.
That would normally point to a DDR fault. However, we have extensively tested the DDR on all of those boards with the NXP DDR stress test tool, including under various thermal conditions, and all of them passed.
To rule out memory leaks in the system, I added kmemleak to monitor RAM usage, and I haven't seen anything suspicious. (kmemleak only ever reports a kworker thread (pid 0) as a potential leak, but I'm guessing that is just because that process allocated some memory and holds it until shutdown.)
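For reference, a kmemleak check like the one described above can be scripted roughly as follows. This is only a sketch: it assumes debugfs is mounted at /sys/kernel/debug, that Python is available on the target, and that it runs as root. The function names `scan_and_read` and `summarise` are made up for illustration.

```python
import re
from collections import Counter
from pathlib import Path

# kmemleak control/report file; requires CONFIG_DEBUG_KMEMLEAK and debugfs.
KMEMLEAK = Path("/sys/kernel/debug/kmemleak")

# Second line of each record, e.g.:  comm "kworker/0:1", pid 17, jiffies ...
COMM_RE = re.compile(r'^\s*comm "([^"]*)", pid (\d+)', re.MULTILINE)


def scan_and_read(path=KMEMLEAK):
    """Trigger an immediate scan, then return the accumulated report text."""
    path = Path(path)
    path.write_text("scan")
    return path.read_text()


def summarise(report):
    """Count suspected leaks per (comm, pid) pair found in a report."""
    return Counter((comm, int(pid)) for comm, pid in COMM_RE.findall(report))
```

Calling `summarise(scan_and_read())` a few minutes apart shows whether the count for any one task keeps growing, which is more telling than a single report.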
I'm struggling a little with what to do next to determine the possible cause.
If anyone has any ideas or pointers it would be much appreciated.
Thanks.
Hi Stas
You can try the Linux memtester utility: besides the memory itself, it also
exercises the board's power supplies well. Try running it at high and low temperatures.
As a test, you can also try disabling the busfreq driver as described in
Chapter 24, "Dynamic Bus Frequency Driver", of the attached Linux manual.
Similar issues can also be caused by poor soldering.
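The busfreq test mentioned above could be scripted roughly like this. Treat it as a sketch under assumptions: the driver directory `/sys/bus/platform/drivers/imx_busfreq` and the `enable` attribute are what I would expect on this BSP, but the device directory name differs between releases (hence the wildcard), so check the path against the manual. The helper names are hypothetical, and it must run as root.

```python
from pathlib import Path

# Assumed sysfs location of the busfreq driver; verify against the BSP manual.
BUSFREQ_DRIVER = Path("/sys/bus/platform/drivers/imx_busfreq")


def find_busfreq_enable(driver_dir=BUSFREQ_DRIVER):
    """Return the 'enable' attribute files found one level below the driver directory."""
    return sorted(Path(driver_dir).glob("*/enable"))


def set_busfreq(enabled, driver_dir=BUSFREQ_DRIVER):
    """Write 1 or 0 to every matching 'enable' file and return the paths written."""
    paths = find_busfreq_enable(driver_dir)
    if not paths:
        raise FileNotFoundError(f"no busfreq enable attribute under {driver_dir}")
    for p in paths:
        p.write_text("1" if enabled else "0")
    return paths
```

Running `set_busfreq(False)` before a long soak test, and comparing the oops rate with and without it, would show whether bus frequency scaling is involved.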
Best regards
igor
Thanks Igor.
The problem was fixed by disabling the dynamic frequency governors in the cpufreq driver. We are using a Crank-based GUI, and it seems that at the lowest frequency setting, or during a frequency change, there were problems accessing the RAM. After I set the conservative governor as the default and removed 2 of the 3 frequency operating points from the device tree, the system appears to be more stable.
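For anyone who wants to check the governor and the available operating points at runtime before changing the device tree, here is a minimal sketch using the standard cpufreq sysfs files. It is not the exact change described above, only a way to inspect and pin the governor. The function names and the `cpufreq_dir` parameter are illustrative, and it assumes Python on the target and root privileges for writing.

```python
from pathlib import Path

# Standard cpufreq sysfs directory for CPU0.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def read_cpufreq(cpufreq_dir=CPUFREQ_DIR):
    """Return the current governor plus the available governors and frequencies."""
    d = Path(cpufreq_dir)

    def words(name):
        # Some drivers do not expose every file, so a missing one yields [].
        f = d / name
        return f.read_text().split() if f.exists() else []

    return {
        "governor": (d / "scaling_governor").read_text().strip(),
        "governors": words("scaling_available_governors"),
        "frequencies_khz": [int(x) for x in words("scaling_available_frequencies")],
    }


def set_governor(name, cpufreq_dir=CPUFREQ_DIR):
    """Select a governor, refusing names the kernel does not list as available."""
    d = Path(cpufreq_dir)
    available = (d / "scaling_available_governors").read_text().split()
    if name not in available:
        raise ValueError(f"governor {name!r} not in {available}")
    (d / "scaling_governor").write_text(name)
```

If the `performance` governor is built in, `set_governor("performance")` holds the CPU at its highest operating point, which is a quick way to see whether the crashes disappear once frequency changes stop.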