Various intermittent Kernel Oops faults on boot / shutdown

Solved

stasgil
Contributor III

Hi community,

We are using a custom board based on the SabreSD design, with an i.MX6 processor.

Kernel: 4.1.15

BSP: krogoth

Since receiving a new batch of PCBs, we have often been seeing various kernel oops faults (examples attached). They seem to be triggered by a different process each time; the common factor is that every crash occurs during a virtual memory access or page reallocation.

That would normally point to a DDR fault. However, we have extensively tested the DDR on all of these boards with the NXP DDR stress test tool, including under various thermal conditions, and the tests passed.

To rule out memory leaks in the system, I enabled kmemleak to monitor RAM usage and have not seen anything suspicious. (kmemleak only ever reports a kworker thread (pid 0) as a potential leak, but I'm guessing that is just because the thread allocated some memory and holds it until shutdown.)
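For readers repeating the kmemleak check, here is a minimal Python sketch that triggers a scan and counts the reported objects per task. It assumes the standard debugfs interface at `/sys/kernel/debug/kmemleak` (kernel built with `CONFIG_DEBUG_KMEMLEAK`, debugfs mounted, run as root); the helper names `count_leaks_by_task` and `scan_and_count` are made up for illustration.

```python
from pathlib import Path

# Standard kmemleak debugfs file (assumption: debugfs is mounted at
# /sys/kernel/debug and the kernel has CONFIG_DEBUG_KMEMLEAK=y).
KMEMLEAK = Path("/sys/kernel/debug/kmemleak")


def count_leaks_by_task(report: str) -> dict:
    """Count 'unreferenced object' records per allocating task name."""
    counts = {}
    for line in report.splitlines():
        line = line.strip()
        # Each record carries a line such as:
        #   comm "kworker/0:1", pid 20, jiffies 4294937319 (age 12.340s)
        if line.startswith('comm "'):
            comm = line.split('"')[1]
            counts[comm] = counts.get(comm, 0) + 1
    return counts


def scan_and_count(path: Path = KMEMLEAK) -> dict:
    """Trigger an immediate scan, then summarise the current report."""
    path.write_text("scan\n")  # writing "scan" starts a memory scan
    return count_leaks_by_task(path.read_text())
```

Calling `scan_and_count()` on the target returns a dictionary mapping task names to the number of suspected leaks. A single long-lived entry attributed to one kernel thread, as described above, would show up as one key with a small, stable count across repeated scans.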

I'm struggling a little with what to do next to determine the possible cause.

If anyone has any ideas or pointers it would be much appreciated. 

Thanks.

igorpadykov
NXP Employee

Hi Stas

You can try the Linux memtester utility; besides the memory, it also exercises the board's power supplies well. Try testing at both high and low temperatures.

As a test, you can also try disabling the busfreq driver, as described in Chapter 24, "Dynamic Bus Frequency Driver", of the attached Linux manual.
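As a rough illustration of the busfreq experiment, the Python sketch below looks for the driver's `enable` attribute in sysfs and writes `0` or `1` to it. The location `bus/platform/drivers/imx_busfreq/*/enable` is an assumption about how the i.MX BSP exposes the driver; the device directory name varies between kernel versions, so check the manual chapter for the exact path on your BSP. The function names are hypothetical.

```python
import glob
import os


def find_busfreq_enable(sysfs_root: str = "/sys") -> list:
    """Return candidate busfreq 'enable' files (path layout is an assumption)."""
    pattern = os.path.join(
        sysfs_root, "bus/platform/drivers/imx_busfreq/*/enable"
    )
    return sorted(glob.glob(pattern))


def set_busfreq(enabled: bool, sysfs_root: str = "/sys") -> list:
    """Write 1/0 to every matching 'enable' file; return the paths written."""
    paths = find_busfreq_enable(sysfs_root)
    for path in paths:
        with open(path, "w") as f:
            f.write("1" if enabled else "0")
    return paths
```

Running `set_busfreq(False)` as root on the target disables dynamic bus frequency scaling until the next reboot, if the attribute exists. An empty return list means no matching file was found and nothing was changed.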

Also similar issues may be caused by poor soldering.

Best regards
igor

stasgil
Contributor III

Thanks Igor.

The problem was fixed by disabling the dynamic frequency governors in the cpufreq driver. We are using a Crank-based GUI, and it seems that at the lowest frequency setting, or during a frequency change, there were some issues accessing the RAM. After I set the conservative governor as the default and removed 2 of the 3 frequency operating points from the device tree, the system seems to have become more stable.
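Before editing the device tree, the same hypothesis can be checked at runtime through the standard cpufreq sysfs files. The Python sketch below is not the change described above (which altered the default governor and the operating points in the device tree); it only reads the current cpufreq state and switches the governor on a running system. It assumes the usual `/sys/devices/system/cpu/cpu0/cpufreq` layout, and `read_cpufreq` and `set_governor` are illustrative names.

```python
from pathlib import Path

# Standard cpufreq policy directory for CPU 0 (assumption: cpufreq is enabled).
CPUFREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq")


def _read_words(path: Path) -> list:
    """Return whitespace-separated tokens, or [] if the file is absent."""
    return path.read_text().split() if path.exists() else []


def read_cpufreq(base: Path = CPUFREQ) -> dict:
    """Collect the active governor and the available governors/frequencies."""
    base = Path(base)
    return {
        "governor": (base / "scaling_governor").read_text().strip(),
        "available_governors": _read_words(base / "scaling_available_governors"),
        # Not every cpufreq driver exposes this file; values are in kHz.
        "available_khz": [
            int(f) for f in _read_words(base / "scaling_available_frequencies")
        ],
    }


def set_governor(name: str, base: Path = CPUFREQ) -> None:
    """Switch governor after checking that the kernel offers it."""
    base = Path(base)
    available = _read_words(base / "scaling_available_governors")
    if name not in available:
        raise ValueError(f"governor {name!r} not in {available}")
    (base / "scaling_governor").write_text(name)
```

For example, calling `set_governor("performance")` as root holds the CPU at its highest operating point. If the oopses stop while the frequency is held fixed, that supports the theory that the lowest operating point or the frequency transitions are at fault.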
