Hello All,
I'm working on a custom board using an i.MX6Q processor and a JB4.2.2 build. After upgrading the graphics drivers to the P13 release using the patches provided here (GPU upgrade to latest p13 for JB4.2.2_1.1.0 release), the system crashes with a "DMA appears to be stuck" message, as shown below.
I've looked around the community forums and found only one person with a somewhat similar problem (Q&A: How to make 16bit DDR + OpenGL/OpenVG working on imx6 solo?). I tried adjusting PMU_REG_CORE[REG1_TARG], with no luck.
Does anyone have any suggestions? Any help would be greatly appreciated.
Attached is my kernel startup log in case it helps.
**************************
*** GPU STATE DUMP ***
**************************
axi = 0x00000051
idle = 0x7FFFFFFF
DMA appears to be stuck at this address:
0x19FB1858
dmaLow = 0x15400800
dmaHigh = 0x01FC0A40
dmaState = 0x00000000
command state = 0 (PAR_IDLE_ST)
command DMA state = 0 (CMD_IDLE_ST)
command fetch state = 0 (FET_IDLE_ST)
DMA request state = 0 (REQ_IDLE_ST)
cal state = 0 (CAL_IDLE_ST)
VE request state = 0 (VER_IDLE_ST)
Debug registers of pipe[0]:
RA debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0x0BFF0000
[0x05] 0xA0045000
[0x06] 0x81BA8000
[0x07] 0x00000000
[0x08] 0x00000000
[0x09] 0x00000000
[0x0A] 0x00000000
[0x0B] 0x00000000
[0x0C] 0x12344321
[0x0D] 0x12344321
[0x0E] 0x12344321
[0x0F] 0x12344321
signature = 0x12344321 (1 read attempt(s))
TX debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0x00000000
[0x05] 0x00000000
[0x06] 0x00000000
[0x07] 0x00000000
[0x08] 0x00000000
[0x09] 0x00000000
[0x0A] 0x00000000
[0x0B] 0x00000000
[0x0C] 0x00000000
[0x0D] 0x00000000
[0x0E] 0x00000000
[0x0F] 0x00000000
failed to obtain the signature (read 0x00000000).
FE debug registers:
[0x00] 0x19FB1858
[0x01] 0x15400800
[0x02] 0x01FC0A40
[0x03] 0x00000256
[0x04] 0x0408074C
[0x05] 0x00000000
[0x06] 0x00009571
[0x07] 0x00007645
[0x08] 0x0000000E
[0x09] 0x00000000
[0x0A] 0x00000000
[0x0B] 0x00000000
[0x0C] 0x00000000
[0x0D] 0xA3010000
[0x0E] 0x00000096
[0x0F] 0xBABEF00D
signature = 0xBABEF00D (1 read attempt(s))
PE debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0xA0000000
[0x05] 0xABC00000
[0x06] 0xBC000000
[0x07] 0xCDE00000
[0x08] 0xD04045C0
[0x09] 0x204045C0
[0x0A] 0x0D863284
[0x0B] 0x00000000
[0x0C] 0xBABEF00D
[0x0D] 0xBABEF00D
[0x0E] 0xBABEF00D
[0x0F] 0xBABEF00D
signature = 0xBABEF00D (1 read attempt(s))
DE debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0x00000000
[0x05] 0x00000000
[0x06] 0x00000000
[0x07] 0x00000000
[0x08] 0x00000000
[0x09] 0x00000000
[0x0A] 0x00000000
[0x0B] 0x00000000
[0x0C] 0x00000000
[0x0D] 0x00000000
[0x0E] 0x00000000
[0x0F] 0x00000000
failed to obtain the signature (read 0x00000000).
SH debug registers:
[0x00] 0x80FFEAAB
[0x01] 0x555F0000
[0x02] 0x0001FF05
[0x03] 0x00010AAA
[0x04] 0x00000000
[0x05] 0x000064FA
[0x06] 0x000064FA
[0x07] 0x001F6FB0
[0x08] 0x001F6322
[0x09] 0x000000F0
[0x0A] 0x0000002C
[0x0B] 0x00000000
[0x0C] 0x00000000
[0x0D] 0x00000000
[0x0E] 0x00040A90
[0x0F] 0xDEADBEEF
signature = 0xDEADBEEF (1 read attempt(s))
PA debug registers:
[0x00] 0x800003FF
[0x01] 0x26280000
[0x02] 0x00000800
[0x03] 0x00000000
[0x04] 0x00000000
[0x05] 0x00000000
[0x06] 0x00000000
[0x07] 0x00000000
[0x08] 0x00000000
[0x09] 0x0000AAAA
[0x0A] 0x0000AAAA
[0x0B] 0x0000AAAA
[0x0C] 0x0000AAAA
[0x0D] 0x0000AAAA
[0x0E] 0x0000AAAA
[0x0F] 0x0000AAAA
signature = 0x0000AAAA (1 read attempt(s))
SE debug registers:
[0x00] 0x00000003
[0x01] 0x00000003
[0x02] 0x00000003
[0x03] 0x00000003
[0x04] 0x00000003
[0x05] 0x00000003
[0x06] 0x00000003
[0x07] 0x00000003
[0x08] 0x00000003
[0x09] 0x00000003
[0x0A] 0x00000003
[0x0B] 0x00000003
[0x0C] 0x00000003
[0x0D] 0x00000003
[0x0E] 0x00000003
[0x0F] 0x00000003
failed to obtain the signature (read 0x00000003).
MC debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0x12345678
[0x05] 0x12345678
[0x06] 0x12345678
[0x07] 0x12345678
[0x08] 0x12345678
[0x09] 0x12345678
[0x0A] 0x12345678
[0x0B] 0x12345678
[0x0C] 0x12345678
[0x0D] 0x12345678
[0x0E] 0x12345678
[0x0F] 0x12345678
signature = 0x12345678 (1 read attempt(s))
HI debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0xAAAAAAAA
[0x04] 0xAAAAAAAA
[0x05] 0xAAAAAAAA
[0x06] 0xAAAAAAAA
[0x07] 0xAAAAAAAA
[0x08] 0xAAAAAAAA
[0x09] 0xAAAAAAAA
[0x0A] 0xAAAAAAAA
[0x0B] 0xAAAAAAAA
[0x0C] 0xAAAAAAAA
[0x0D] 0xAAAAAAAA
[0x0E] 0xAAAAAAAA
[0x0F] 0xAAAAAAAA
signature = 0xAAAAAAAA (1 read attempt(s))
Other Registers:
[0x0040] 0x001205B9
[0x0044] 0x0038C0A0
[0x004C] 0x0038C0A0
[0x0050] 0x00071814
[0x0054] 0x00071814
[0x0058] 0x001205B9
[0x005C] 0x000276FA
[0x0060] 0x000276FA
[0x043C] 0x00000000
[0x0440] 0x00000000
[0x0444] 0x00000000
[0x0414] 0x3C000000
Debug registers of pipe[1]:
RA debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0x0BFF0000
[0x05] 0xA0045000
[0x06] 0x81BA8000
[0x07] 0x00000000
[0x08] 0x00000000
[0x09] 0x00000000
[0x0A] 0x00000000
[0x0B] 0x00000000
[0x0C] 0x12344321
[0x0D] 0x12344321
[0x0E] 0x12344321
[0x0F] 0x12344321
signature = 0x12344321 (1 read attempt(s))
TX debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0x00000000
[0x05] 0x00000000
[0x06] 0x00000000
[0x07] 0x00000000
[0x08] 0x00000000
[0x09] 0x00000000
[0x0A] 0x00000000
[0x0B] 0x00000000
[0x0C] 0x00000000
[0x0D] 0x00000000
[0x0E] 0x00000000
[0x0F] 0x00000000
failed to obtain the signature (read 0x00000000).
FE debug registers:
[0x00] 0x19FB1858
[0x01] 0x15400800
[0x02] 0x01FC0A40
[0x03] 0x00000256
[0x04] 0x0408074C
[0x05] 0x00000000
[0x06] 0x00009571
[0x07] 0x00007645
[0x08] 0x0000000E
[0x09] 0x00000000
[0x0A] 0x00000000
[0x0B] 0x00000000
[0x0C] 0x00000000
[0x0D] 0xA3010000
[0x0E] 0x00000097
[0x0F] 0xBABEF00D
signature = 0xBABEF00D (1 read attempt(s))
PE debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0xA0000000
[0x05] 0xABC00000
[0x06] 0xBC000000
[0x07] 0xCDE00000
[0x08] 0xD04045C0
[0x09] 0x204045C0
[0x0A] 0x0D863284
[0x0B] 0x00000000
[0x0C] 0xBABEF00D
[0x0D] 0xBABEF00D
[0x0E] 0xBABEF00D
[0x0F] 0xBABEF00D
signature = 0xBABEF00D (1 read attempt(s))
DE debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0x00000000
[0x05] 0x00000000
[0x06] 0x00000000
[0x07] 0x00000000
[0x08] 0x00000000
[0x09] 0x00000000
[0x0A] 0x00000000
[0x0B] 0x00000000
[0x0C] 0x00000000
[0x0D] 0x00000000
[0x0E] 0x00000000
[0x0F] 0x00000000
failed to obtain the signature (read 0x00000000).
SH debug registers:
[0x00] 0x80FFEAAB
[0x01] 0x555F0000
[0x02] 0x0001FF05
[0x03] 0x00010AAA
[0x04] 0x00000000
[0x05] 0x000064FA
[0x06] 0x000064FA
[0x07] 0x001F6FB0
[0x08] 0x001F6322
[0x09] 0x000000F0
[0x0A] 0x0000002C
[0x0B] 0x00000000
[0x0C] 0x00000000
[0x0D] 0x00000000
[0x0E] 0x00040A90
[0x0F] 0xDEADBEEF
signature = 0xDEADBEEF (1 read attempt(s))
PA debug registers:
[0x00] 0x800003FF
[0x01] 0x26280000
[0x02] 0x00000800
[0x03] 0x00000000
[0x04] 0x00000000
[0x05] 0x00000000
[0x06] 0x00000000
[0x07] 0x00000000
[0x08] 0x00000000
[0x09] 0x0000AAAA
[0x0A] 0x0000AAAA
[0x0B] 0x0000AAAA
[0x0C] 0x0000AAAA
[0x0D] 0x0000AAAA
[0x0E] 0x0000AAAA
[0x0F] 0x0000AAAA
signature = 0x0000AAAA (1 read attempt(s))
SE debug registers:
[0x00] 0x00000003
[0x01] 0x00000003
[0x02] 0x00000003
[0x03] 0x00000003
[0x04] 0x00000003
[0x05] 0x00000003
[0x06] 0x00000003
[0x07] 0x00000003
[0x08] 0x00000003
[0x09] 0x00000003
[0x0A] 0x00000003
[0x0B] 0x00000003
[0x0C] 0x00000003
[0x0D] 0x00000003
[0x0E] 0x00000003
[0x0F] 0x00000003
failed to obtain the signature (read 0x00000003).
MC debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0x00000000
[0x04] 0x12345678
[0x05] 0x12345678
[0x06] 0x12345678
[0x07] 0x12345678
[0x08] 0x12345678
[0x09] 0x12345678
[0x0A] 0x12345678
[0x0B] 0x12345678
[0x0C] 0x12345678
[0x0D] 0x12345678
[0x0E] 0x12345678
[0x0F] 0x12345678
signature = 0x12345678 (1 read attempt(s))
HI debug registers:
[0x00] 0x00000000
[0x01] 0x00000000
[0x02] 0x00000000
[0x03] 0xAAAAAAAA
[0x04] 0xAAAAAAAA
[0x05] 0xAAAAAAAA
[0x06] 0xAAAAAAAA
[0x07] 0xAAAAAAAA
[0x08] 0xAAAAAAAA
[0x09] 0xAAAAAAAA
[0x0A] 0xAAAAAAAA
[0x0B] 0xAAAAAAAA
[0x0C] 0xAAAAAAAA
[0x0D] 0xAAAAAAAA
[0x0E] 0xAAAAAAAA
[0x0F] 0xAAAAAAAA
signature = 0xAAAAAAAA (1 read attempt(s))
Other Registers:
[0x0040] 0x00129E3F
[0x0044] 0x0038C8D8
[0x004C] 0x0038C8D8
[0x0050] 0x0007191B
[0x0054] 0x0007191B
[0x0058] 0x00129E3F
[0x005C] 0x00028A0A
[0x0060] 0x00028A0A
[0x043C] 0x00000000
[0x0440] 0x00000000
[0x0444] 0x00000000
[0x0414] 0x3C000000
[<c0053fc4>] (unwind_backtrace+0x0/0x138) from [<c0471928>] (gckOS_DumpCallStack+0x8/0x10)
[<c0471928>] (gckOS_DumpCallStack+0x8/0x10) from [<c04844f0>] (gckHARDWARE_DumpGPUState+0x63c/0x834)
[<c04844f0>] (gckHARDWARE_DumpGPUState+0x63c/0x834) from [<c04708c8>] (gckOS_Broadcast+0x38/0xe8)
[<c04708c8>] (gckOS_Broadcast+0x38/0xe8) from [<c04744d0>] (gckKERNEL_Dispatch+0x1020/0x1228)
[<c04744d0>] (gckKERNEL_Dispatch+0x1020/0x1228) from [<c046ccbc>] (drv_ioctl+0x120/0x270)
[<c046ccbc>] (drv_ioctl+0x120/0x270) from [<c0140108>] (do_vfs_ioctl+0x80/0x54c)
[<c0140108>] (do_vfs_ioctl+0x80/0x54c) from [<c014060c>] (sys_ioctl+0x38/0x5c)
[<c014060c>] (sys_ioctl+0x38/0x5c) from [<c004c900>] (ret_fast_syscall+0x0/0x30)
Original Attachment has been moved to: startup_log.txt.zip
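For anyone reading the dump above, here is a minimal sketch of how the "idle" value could be decoded. The bit layout is an assumption borrowed from the open-source etnaviv register headers (FE at bit 0 through TS at bit 11, a set bit meaning "idle"); it is not confirmed anywhere in this thread for the P13 driver. The helper names gpu_busy_mask and gpu_busy_names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: idle-register bit layout as in the etnaviv headers,
 * not taken from this thread or from the P13 driver sources. */
struct idle_bit {
    uint32_t mask;
    const char *name;
};

static const struct idle_bit idle_bits[] = {
    { 0x001u, "FE" }, { 0x002u, "DE" }, { 0x004u, "PE" }, { 0x008u, "SH" },
    { 0x010u, "PA" }, { 0x020u, "SE" }, { 0x040u, "RA" }, { 0x080u, "TX" },
    { 0x100u, "VG" }, { 0x200u, "IM" }, { 0x400u, "FP" }, { 0x800u, "TS" },
};

#define IDLE_BIT_COUNT (sizeof(idle_bits) / sizeof(idle_bits[0]))

/* A set bit means "idle", so busy modules are the cleared bits
 * among the ones listed in the table above. */
uint32_t gpu_busy_mask(uint32_t idle)
{
    uint32_t known = 0;
    for (size_t i = 0; i < IDLE_BIT_COUNT; i++)
        known |= idle_bits[i].mask;
    return ~idle & known;
}

/* Stores the names of busy modules in names[] (up to max_names)
 * and returns how many were stored. */
size_t gpu_busy_names(uint32_t idle, const char *names[], size_t max_names)
{
    assert(names != NULL);
    uint32_t busy = gpu_busy_mask(idle);
    size_t n = 0;
    for (size_t i = 0; i < IDLE_BIT_COUNT && n < max_names; i++)
        if (busy & idle_bits[i].mask)
            names[n++] = idle_bits[i].name;
    return n;
}
```

If that layout holds, gpu_busy_mask(0x7FFFFFFF) is 0, i.e. every listed module reports idle, which is at least consistent with the all-idle DMA states printed in the dump.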
Hi Sheldon,
You can check whether it is caused by the GPU supply;
the PU_CAP pin can be monitored. You can also try
different GPU memory settings; 128M gpumem is recommended
with 1G of system memory.
Best regards
igor
Hello Igor,
Thank you for the suggestions.
We monitored VDDPU_CAP_1 -> VDDPU_CAP_7 (as they're connected together) and found that they cycled between 1.18V and 1.25V. The voltage was initially at 1.18V, but when the "DMA appears to be stuck" message was displayed it increased to 1.25V. I'm not sure, but it looks like the voltage increased as the GPU started to perform some operations.
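To relate those measurements to the PMU_REG_CORE[REG1_TARG] setting mentioned earlier, here is a rough sketch of the conversion. It assumes the layout I recall from the i.MX6 reference manual: REG1_TARG in bits 13:9, code 0x01 = 0.725V, 25mV per step, 0x00 = power gated and 0x1F = regulator bypassed. Please check it against the manual before relying on it; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: field position and step size as recalled from the
 * i.MX6 reference manual; verify before use. */
#define REG1_TARG_SHIFT 9
#define REG1_TARG_MASK  0x1Fu

/* Returns the PU target in millivolts, 0 if power gated,
 * or -1 if the regulator is bypassed (FET fully on). */
int reg1_targ_to_mv(uint32_t pmu_reg_core)
{
    uint32_t targ = (pmu_reg_core >> REG1_TARG_SHIFT) & REG1_TARG_MASK;

    if (targ == 0)
        return 0;
    if (targ == REG1_TARG_MASK)
        return -1;
    return 700 + 25 * (int)targ;
}

/* Returns a copy of the register value with REG1_TARG set for the
 * requested voltage; all other fields are left unchanged. */
uint32_t set_reg1_targ_mv(uint32_t pmu_reg_core, int mv)
{
    assert(mv >= 725 && mv <= 1450 && mv % 25 == 0);

    uint32_t targ = (uint32_t)((mv - 700) / 25);
    uint32_t cleared = pmu_reg_core & ~(REG1_TARG_MASK << REG1_TARG_SHIFT);
    return cleared | (targ << REG1_TARG_SHIFT);
}
```

Under these assumptions, the measured 1.18V is close to code 0x13 (1.175V) and 1.25V corresponds to code 0x16.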
We've got 4GB of RAM on our system so I tried setting the gpumem to 128M, 256M, 512M, and 1G. In all cases the unit continued to display the "DMA appears to be stuck" message.
Thank you,
Sheldon
Hi Sheldon,
To narrow the issue down further, you can
run with the kernel parameter maxcpus=1 (no SMP) to rule out ARM errata,
try the kernel parameters enable_wait_mode=off and ldo_active=off/on, and check the kernel option
CONFIG_MX6_VPU_352M (it increases the SOC/PU voltage for running the VPU at 352MHz).
Best regards
igor
Hello Igor,
Our board currently uses a modified version of the sabrelite kernel board file (board-mx6q_sabrelite.c). I found that if we switch to the sabresd version (board-mx6q_sabresd.c), the DMA hang error goes away. To narrow down the issue, I modified board-mx6q_sabresd.c, replacing its functionality piece by piece with functionality from board-mx6q_sabrelite.c. With this I was able to trace the difference that appears to cause the DMA hang error to the following lines.
static struct ipuv3_fb_platform_data sabrelite_fb_data[] = {
	{ /*fb0*/
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-XGA",
		.default_bpp = 16,
		.int_clk = false,
	}, {
		.disp_dev = "lcd",
		.interface_pix_fmt = IPU_PIX_FMT_RGB565,
		.mode_str = "CLAA-WVGA",
		.default_bpp = 16,
		.int_clk = false,
	}, {
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-SVGA",
		.default_bpp = 16,
		.int_clk = false,
	}, {
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-VGA",
		.default_bpp = 16,
		.int_clk = false,
	},
};
Above we see that the sabrelite version uses four sets of frame buffer data. The sabresd version (shown below) only uses three sets.
static struct ipuv3_fb_platform_data sabresd_fb_data[] = {
	{ /*fb0*/
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-XGA",
		.default_bpp = 16,
		.int_clk = false,
		.late_init = false,
	}, {
		.disp_dev = "hdmi",
		.interface_pix_fmt = IPU_PIX_FMT_RGB24,
		.mode_str = "1920x1080M@60",
		.default_bpp = 32,
		.int_clk = false,
		.late_init = false,
	}, {
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-XGA",
		.default_bpp = 16,
		.int_clk = false,
		.late_init = false,
	},
};
If I comment out any one of the four sets of frame buffer data in the sabrelite version, the DMA hang no longer occurs. Our system does not require four displays, so this will work as a fix. However, I'm perplexed as to why the DMA hang error appears with four sets of frame buffer data and not with three.
My initial thought is that it's a memory issue where allocating the resources for all four displays causes the GPU to not get enough memory. However, I took the fbmem down as low as it could go and found that the DMA hang error was still present. Any thoughts?
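As a rough sanity check on the memory theory, the frame buffer footprint of the four sabrelite entries can be estimated as in the sketch below. The resolutions are assumptions inferred from the mode names (XGA 1024x768, WVGA 800x480, SVGA 800x600, VGA 640x480), the bpp values come from .default_bpp, and the number of buffers per display is a free parameter because I don't know what the driver actually allocates. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct fb_mode {
    const char *mode_str;
    uint32_t width;   /* ASSUMPTION: inferred from the mode name */
    uint32_t height;  /* ASSUMPTION: inferred from the mode name */
    uint32_t bpp;     /* taken from .default_bpp in the board file */
};

static const struct fb_mode sabrelite_modes[] = {
    { "LDB-XGA",   1024, 768, 16 },
    { "CLAA-WVGA",  800, 480, 16 },
    { "LDB-SVGA",   800, 600, 16 },
    { "LDB-VGA",    640, 480, 16 },
};

/* Bytes needed for one display with the given number of buffers. */
uint64_t fb_bytes(const struct fb_mode *m, unsigned buffers)
{
    assert(m != NULL && m->bpp % 8 == 0);
    return (uint64_t)m->width * m->height * (m->bpp / 8) * buffers;
}

/* Bytes needed for the first count displays in modes[]. */
uint64_t fb_total_bytes(const struct fb_mode *modes, size_t count,
                        unsigned buffers)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += fb_bytes(&modes[i], buffers);
    return total;
}
```

With these assumed resolutions, all four displays need about 3.9 MB single-buffered (about 11.7 MB with three buffers each), and dropping the last entry saves only about 0.6 MB per buffer. If that estimate is anywhere near right, it is small next to the gpumem sizes tried earlier, which fits with the hang not responding to fbmem changes.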
Thank you,
Sheldon
Sheldon,
This is an interesting question, but the sabrelite software is written
by Boundary Devices, so you can post the question on their forum.
You can also try to debug it; there is plenty of literature on that
topic, for example at the link below:
Kernel crash - using_objdump to disasm_debug_bug.pdf
https://community.freescale.com/message/465364#465364
~igor
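As a starting point for that kind of debugging, here is a small sketch that splits one frame of the backtrace in the first post into its address, symbol, offset and function size, so the offset can be looked up in a disassembly of the kernel image. The line format is taken from the log above; struct bt_frame and the function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

struct bt_frame {
    unsigned long addr;    /* address inside the function */
    char symbol[64];       /* function name */
    unsigned long offset;  /* offset of addr from the function start */
    unsigned long size;    /* total size of the function */
};

/* Parses the first frame on a line such as
 * "[<c04744d0>] (gckKERNEL_Dispatch+0x1020/0x1228) from ...".
 * Returns 1 on success, 0 if the line does not match. */
int parse_bt_frame(const char *line, struct bt_frame *out)
{
    assert(line != NULL && out != NULL);

    int n = sscanf(line, "[<%lx>] (%63[^+]+%lx/%lx)",
                   &out->addr, out->symbol, &out->offset, &out->size);
    return n == 4;
}

/* Start address of the function the frame belongs to. */
unsigned long bt_function_start(const struct bt_frame *f)
{
    return f->addr - f->offset;
}
```

For the gckKERNEL_Dispatch frame this gives a function start of 0xc04734b0, so the instruction of interest sits 0x1020 bytes into that function in the disassembly.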