Custom IMX6Q system Hang-up Problem

Solved

ko-hey
Senior Contributor II

Hi all,

I'm developing a custom i.MX6Q board based on the SABRE-SD.

My system sometimes hangs. The symptoms are described below.

Please tell me how to resolve it.

Incidentally, I have already checked the following community thread, but the problem has not been resolved.

IMX6Q system hang-up problem / linux kernel(3.0.35)

Here are the symptoms:

---------------------------------------------------------

  • Our Linux kernel version:
    Timesys LinuxLink 3.0.35-ts-armv71
    * based on [ L3.0.35_4.0.0 ]
    * we changed the kernel setting CONFIG_RCU_CPU_STALL_TIMEOUT from 30 sec to 10 sec
  • CPU: i.MX6Q
  • Frequency: with 15 boards running continuously for two days, the hang occurs on 1-2 boards.
    I checked a few things regarding individual hardware differences.

    When the system hangs, it may or may not log an RCU stall detection message. The logs follow.

    a)
    [Thu Mar 05 08:14:11.496 2015] INFO: rcu_preempt_state detected stalls on CPUs/tasks: { 2 3} (detected by 0, t=1002 jiffies)

    b)
    Mar 5 06:16:09 (none) user.err kernel: INFO: rcu_preempt_state detected stalls on CPUs/tasks: { 1 3} (detected by 0, t=1002 jiffies)

    c)
    Mar 5 10:32:47 (none) user.err kernel: INFO: rcu_preempt_state detected stall on CPU 3 (t=1001 jiffies)
    Mar 5 10:32:47 (none) user.err kernel: INFO: rcu_preempt_state detected stalls on CPUs/tasks: { 3} (detected by 2, t=1002 jiffies)
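
As a reading aid for the logs above: the `t=` field in each message is a jiffies count. The following is a minimal sketch in C of extracting that field and converting it to seconds. The helper names `parse_stall_jiffies` and `jiffies_to_seconds` are hypothetical, and the tick rate of 100 Hz used in the comment is an assumption, since the post does not state CONFIG_HZ. Under that assumption, 1002 jiffies is about 10 seconds, which would be consistent with the 10-second CONFIG_RCU_CPU_STALL_TIMEOUT setting.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Extract the number that follows "t=" in an RCU stall message.
 * Returns -1 if no "t=<number>" field is found. */
long parse_stall_jiffies(const char *line)
{
    const char *p;
    long jiffies;

    assert(line != NULL);

    /* Try every "t=" occurrence until one is followed by a number. */
    for (p = strstr(line, "t="); p != NULL; p = strstr(p + 1, "t=")) {
        if (sscanf(p, "t=%ld", &jiffies) == 1)
            return jiffies;
    }
    return -1;
}

/* Convert a jiffies count to seconds.
 * hz is the kernel tick rate; 100 is an ASSUMED value for this board,
 * e.g. jiffies_to_seconds(1002, 100) is roughly 10 seconds. */
double jiffies_to_seconds(long jiffies, long hz)
{
    assert(hz > 0);
    return (double)jiffies / (double)hz;
}
```

If the real CONFIG_HZ differs, the same `t=` value maps to a different duration, so it is worth checking the kernel configuration before drawing conclusions from the conversion.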


    In our application, the main thread creates nine child threads and monitors their activity.

    When the system hangs, either the main thread or a child thread may hang.

    If the main thread hangs, the watchdog timer triggers a power-on reset,
    so I cannot capture a useful log.

    I traced a child thread during a hang: the thread did not wake up after the sleep function call.
    I set the sleep to 1 sec or 100 ms, but the thread woke up 10 sec later.

    I implemented logic that sets a global stop flag when there is no reply for more than 10 seconds.
    I suspect that the wakeup and the change of the global variable are linked.

    Question:
    Could you tell me what you think might cause this problem?
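
To quantify the late wakeups described in the post, a thread could timestamp each sleep with a monotonic clock and record how far the wakeup overshoots the requested time. The following is a minimal sketch in C, not the poster's code: `measure_sleep_overshoot_ms` and `timespec_diff_ns` are hypothetical helpers, and the sketch assumes a POSIX system that provides `clock_gettime` and `nanosleep`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <time.h>

/* Difference (b - a) in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *a,
                                const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (int64_t)(b->tv_nsec - a->tv_nsec);
}

/* Sleep for requested_ms milliseconds and return how late the wakeup
 * was, in milliseconds. Returns -1 if the clock cannot be read. */
int64_t measure_sleep_overshoot_ms(long requested_ms)
{
    struct timespec req = {
        .tv_sec = requested_ms / 1000,
        .tv_nsec = (requested_ms % 1000) * 1000000L
    };
    struct timespec rem, start, end;

    assert(requested_ms >= 0);

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1;

    /* If a signal interrupts the sleep, continue with the remainder. */
    while (nanosleep(&req, &rem) == -1 && errno == EINTR)
        req = rem;

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1;

    /* Elapsed time minus requested time is the overshoot. */
    return timespec_diff_ns(&start, &end) / 1000000 - requested_ms;
}
```

Calling `measure_sleep_overshoot_ms(100)` in each child thread and logging any result above a chosen threshold (for example 1000 ms) would record which threads woke late and by how much. On glibc versions before 2.17, `clock_gettime` requires linking with `-lrt`.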

1 Solution
timesyssupport
Senior Contributor II

Hello ko-hey,

Freescale maintains a git repository of their kernel releases here: http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/

That thread looks to point to the JB 4.3-1.1.1-ga release, which is tagged here: http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/tag/?id=jb4.3_1.1.1-ga

That release ports patches from the imx_3.0.35_4.1.0 branch of the kernel, here: http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/log/?h=imx_3.0.35_4.1.0

Reviewing the thread you mention, I see that the issue still persists there, and those reporting it are also still running 3.0.35-4.0.0, despite applying the patches for the Android kernel that supposedly addressed it. Those patches, it seems, are broken out in a later reply by PeterChan, and are contained in the rel_imx_3.0.101 branch here: http://git.freescale.com/git/cgit.cgi/imx/linux-2.6-imx.git/log/?h=imx_3.0.101

In that branch's history, you will see the various ENGR patches committed.

I don't know how customized your kernel is at this point, but it may be possible to cherry-pick the patches from the rel_imx_3.0.101 kernel branch and apply them to your own.

I should note that the 3.0.35-4.1.0 release was the last of the 3.0.x kernels before the move to 3.10.17 and, now, 3.10.53.

3.0.35-4.0.0 is considered legacy at this point, given the move to the 3.10.xx series of kernels. Most of our users migrated to the 3.0.35-4.1.0 kernel release when that was still the current, supported version, so the 4.0.0 release was deprecated. If the patches described in that thread do not resolve your issue, we would need to engage in a dedicated services project to address it.

Regards,

Timesys Support

7 Replies
ko-hey
Senior Contributor II

Hi Timesys Support

I understand the current situation of 3.0.35-4.0.0.

I'll talk with my team and decide what we will do next.

ko-hey

wanglifang
Contributor I

Hi ko-hey,

Have you resolved this hang-up problem? I see the same CPU hang in my project too.

The common point is that many kernel threads are forked and CPU1 sometimes fails to schedule them, which causes the system to hang.

My board is an i.MX6D and the kernel is 3.0.35. I have applied the patches provided by PeterChan in another thread, but the problem still exists.

timesyssupport
Senior Contributor II

Within the 3.0.35 series, the i.MX6Q kernel has moved on from the 4.0.0 release to the 4.1.0 release; it has since been superseded by the 3.10.17 kernel, and soon by 3.10.53. ko-hey, if you are unable to move forward in kernel releases, we will likely need to review and possibly address this as a dedicated services project. Without reviewing the subsequent kernel releases, we cannot say immediately whether this was resolved in a later release.

Regards,

Timesys support

ko-hey
Senior Contributor II

Hi igorpadykov, Karina Valencia Aguilar & Timesys Support.

Thank you for your replies.

Do you think updating the kernel would improve this?

Changing the kernel has a big impact on development,

so I would prefer not to change the kernel version if I can avoid it.

According to the following thread, someone had the same error message and applying the patches improved it.

I think the patches would be effective for this problem, but they are for Android.

Can you provide the same patches for the Linux 3.0.35 kernel?

https://community.freescale.com/thread/337666

ko-hey

igorpadykov
NXP Employee

Hi ko-hey

For Timesys Linux support, please post at the link below:

Timesys Commercial Embedded Linux Support | Timesys Embedded Linux

Alternatively, you can try the FSL BSPs:

http://www.freescale.com/webapp/sps/site/prod_summary.jsp?code=i.MX6Q&fpsp=1&tab=Design_Tools_Tab

Best regards

igor

karina_valencia
NXP Apps Support

timesyssupport, can you help review this case?
