Galcore issue

sam_clark
Contributor I

Hi, 

I recently started getting what looks like a stack trace sent over serial from our i.MX6 board. We have also recently been seeing repeated, unexpected GUI crashes on our hardware, and I feel the two are linked. Does this point to a DDR issue?

------------[ cut here ]------------
WARNING: CPU: 0 PID: 431 at /home/plexus/BUILD/yocto/build_ctems/tmp/work-shared/imx6dl-ctems-alpha/kernel-source/drivers/clk/clk.c:521 clk_core_unprepare+0x54/0x7c()
Modules linked in: mxc_dcic galcore(O)
CPU: 0 PID: 431 Comm: galcore daemon Tainted: G W O 4.1.15-2.0.1+ctems-rev3181 #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[<80014f70>] (unwind_backtrace) from [<80011e70>] (show_stack+0x10/0x14)
[<80011e70>] (show_stack) from [<806aab34>] (dump_stack+0x74/0xc0)
[<806aab34>] (dump_stack) from [<8002e8ec>] (warn_slowpath_common+0x80/0xac)
[<8002e8ec>] (warn_slowpath_common) from [<8002e9a8>] (warn_slowpath_null+0x18/0x20)
[<8002e9a8>] (warn_slowpath_null) from [<804e4834>] (clk_core_unprepare+0x54/0x7c)
[<804e4834>] (clk_core_unprepare) from [<804e5d1c>] (clk_unprepare+0x24/0x2c)
[<804e5d1c>] (clk_unprepare) from [<7f0090e0>] (_SetClock+0x130/0x138 [galcore])
[<7f0090e0>] (_SetClock [galcore]) from [<7f004890>] (gckOS_SetGPUPower+0xb4/0x10c [galcore])
[<7f004890>] (gckOS_SetGPUPower [galcore]) from [<7f01ea30>] (gckHARDWARE_SetPowerManagementState+0x83c/0xa6c [galcore])
[<7f01ea30>] (gckHARDWARE_SetPowerManagementState [galcore]) from [<7f00431c>] (gckOS_Broadcast+0x44/0x104 [galcore])
[<7f00431c>] (gckOS_Broadcast [galcore]) from [<7f010978>] (_TryToIdleGPU+0xe4/0x11c [galcore])
[<7f010978>] (_TryToIdleGPU [galcore]) from [<7f01202c>] (gckEVENT_Notify+0x408/0x430 [galcore])
[<7f01202c>] (gckEVENT_Notify [galcore]) from [<7f019e98>] (gckHARDWARE_Interrupt+0x6c/0x74 [galcore])
[<7f019e98>] (gckHARDWARE_Interrupt [galcore]) from [<7f0007b4>] (threadRoutine+0x4c/0x58 [galcore])
[<7f0007b4>] (threadRoutine [galcore]) from [<80044cd8>] (kthread+0xd8/0xec)
[<80044cd8>] (kthread) from [<8000ece8>] (ret_from_fork+0x14/0x2c)
---[ end trace e49e6ae6554eb1f4 ]---
------------[ cut here ]------------
WARNING: CPU: 1 PID: 431 at /home/plexus/BUILD/yocto/build_ctems/tmp/work-shared/imx6dl-ctems-alpha/kernel-source/drivers/clk/clk.c:521 clk_core_unprepare+0x54/0x7c()
Modules linked in: mxc_dcic galcore(O)
CPU: 1 PID: 431 Comm: galcore daemon Tainted: G W O 4.1.15-2.0.1+ctems-rev3181 #1
Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[<80014f70>] (unwind_backtrace) from [<80011e70>] (show_stack+0x10/0x14)
[<80011e70>] (show_stack) from [<806aab34>] (dump_stack+0x74/0xc0)
[<806aab34>] (dump_stack) from [<8002e8ec>] (warn_slowpath_common+0x80/0xac)
[<8002e8ec>] (warn_slowpath_common) from [<8002e9a8>] (warn_slowpath_null+0x18/0x20)
[<8002e9a8>] (warn_slowpath_null) from [<804e4834>] (clk_core_unprepare+0x54/0x7c)
[<804e4834>] (clk_core_unprepare) from [<804e5d1c>] (clk_unprepare+0x24/0x2c)
[<804e5d1c>] (clk_unprepare) from [<7f0090e0>] (_SetClock+0x130/0x138 [galcore])
[<7f0090e0>] (_SetClock [galcore]) from [<7f004890>] (gckOS_SetGPUPower+0xb4/0x10c [galcore])
[<7f004890>] (gckOS_SetGPUPower [galcore]) from [<7f01ea30>] (gckHARDWARE_SetPowerManagementState+0x83c/0xa6c [galcore])
[<7f01ea30>] (gckHARDWARE_SetPowerManagementState [galcore]) from [<7f00431c>] (gckOS_Broadcast+0x44/0x104 [galcore])
[<7f00431c>] (gckOS_Broadcast [galcore]) from [<7f010978>] (_TryToIdleGPU+0xe4/0x11c [galcore])
[<7f010978>] (_TryToIdleGPU [galcore]) from [<7f01202c>] (gckEVENT_Notify+0x408/0x430 [galcore])
[<7f01202c>] (gckEVENT_Notify [galcore]) from [<7f019e98>] (gckHARDWARE_Interrupt+0x6c/0x74 [galcore])
[<7f019e98>] (gckHARDWARE_Interrupt [galcore]) from [<7f0007b4>] (threadRoutine+0x4c/0x58 [galcore])
[<7f0007b4>] (threadRoutine [galcore]) from [<80044cd8>] (kthread+0xd8/0xec)
[<80044cd8>] (kthread) from [<8000ece8>] (ret_from_fork+0x14/0x2c)
---[ end trace e49e6ae6554eb1f5 ]---
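
Since the two blocks above differ only in the CPU number and trace ID, it can help to count how often each warning site repeats in a longer serial capture. The following is a minimal sketch, not part of any NXP tooling; the function name `count_warnings` and the regular expression are assumptions based only on the header format visible in this log.

```python
import re
from collections import Counter

# Matches the header line of a kernel WARNING block as it appears in the log above, e.g.
# "WARNING: CPU: 0 PID: 431 at <path>/drivers/clk/clk.c:521 clk_core_unprepare+0x54/0x7c()"
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at (?P<path>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def count_warnings(log_text):
    """Count WARNING blocks per (source file, line number, function) site."""
    sites = Counter()
    for match in WARNING_RE.finditer(log_text):
        # Keep only the file name so different build paths group together.
        filename = match.group("path").rsplit("/", 1)[-1]
        sites[(filename, int(match.group("line")), match.group("func"))] += 1
    return sites
```

Assuming the serial output was saved to a file (for example a hypothetical `serial.log`), `count_warnings(open("serial.log").read())` returns a counter keyed by warning site, which makes it easy to see whether every warning comes from the same `clk_core_unprepare` call or from several places.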

3 Replies

Bio_TICFSL
NXP TechSupport

Hello Sam,

This looks like a DDR issue; you can run the DDR Stress Test tool to verify.

In the meantime, you can test the latest BSP, L4.9.35, which has a new version of the GPU driver, including galcore.

Regards

sam_clark
Contributor I

We have managed to prove there is a DDR issue: tests have been run on 3 boards, all of which fall over.

I have some other questions I would like help with if possible:

  • We had previously been running the stress test tool via U-Boot; we're now testing via the GUI (Windows). The problem is that we are seeing different calibration values depending on the method of testing. I've seen some posts relating to this on the forums but I don't see a valid answer [ imx6 DDR calibration - U-boot or GUI ? ]. Importantly, which method should we be using to test? Which method gives the best values?
  • We need to ensure the script we're using for the stress tool is valid. We had a question regarding one of the parameters in the Excel spreadsheet tool. At the top of the spreadsheet, under 'Device Info', there is this parameter: 'Clock Cycle Freq (MHz)', marked with note 3. We are unsure whether this is the maximum frequency the DDR can handle or the actual frequency used for the DDR. Furthermore, there is confusion over note (3): for example, if we are running at 396MHz, what should be input, 396 or 401MHz [or similar]? The note seems to suggest adding a margin. Can you help/advise on what this means?
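
To make the first question concrete, the calibration results from the two methods can be compared register by register. The sketch below is only an illustration: the register names are used as example keys, the values are invented placeholders rather than measurements from our boards, and splitting each 32-bit word into four plain bytes is a simplification that does not reflect the exact field layout of every calibration register.

```python
def byte_lanes(word):
    """Split a 32-bit register value into four bytes, least significant first."""
    return [(word >> shift) & 0xFF for shift in (0, 8, 16, 24)]


def compare_calibration(run_a, run_b):
    """Return {register: [per-byte delta b - a]} for registers whose values differ."""
    diffs = {}
    # Only registers reported by both runs can be compared.
    for name in sorted(run_a.keys() & run_b.keys()):
        if run_a[name] != run_b[name]:
            diffs[name] = [
                b - a for a, b in zip(byte_lanes(run_a[name]), byte_lanes(run_b[name]))
            ]
    return diffs


# Placeholder values for illustration only (not real calibration results).
uboot_run = {"MPRDDLCTRL": 0x40404040, "MPWRDLCTRL": 0x3A3A3A3A}
gui_run = {"MPRDDLCTRL": 0x42403E40, "MPWRDLCTRL": 0x3A3A3A3A}

print(compare_calibration(uboot_run, gui_run))
# prints {'MPRDDLCTRL': [0, -2, 0, 2]}
```

A report like this shows at a glance whether the two methods disagree by a step or two in a few byte lanes or by a large amount across the board.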

Regards,

sam_clark
Contributor I

Hi,

Thanks for the response. We have assumed it is a DDR issue; however, we are finding it nearly impossible to verify.

Last night I ran an overnight test at two different clock frequencies (380 and 396MHz). The tests ran for more than 14 hours and no errors were reported. It is worth noting that we are using the Windows command line tool (V1.0.3) to run the tests. I've checked the release notes to see if there are noticeable changes; from what I can see, the newer releases seem to just add GUI/JTAG/U-Boot support?

I know it's probably good practice to update the tool version and run again, but would you not still expect the V1.0.3 tool to be able to uncover issues?

We've had DDR issues in the past and thought we had fixed them, as repeated stress tests have passed on multiple boards. The problem now is that we are struggling to get boards to fail in order to prove there is a DDR issue, even though all the evidence points to one.

Regards,

Sam
