DMA appears to be stuck


sheldonrucker
Contributor III

Hello All,

 

I'm working on a custom board using an i.MX6Q processor and a JB4.2.2 build. After upgrading the graphics drivers to the P13 release using the patches provided here (GPU upgrade to latest p13 for JB4.2.2_1.1.0 release), the system crashes with a "DMA appears to be stuck" message, as shown below.

I've looked around the community forums and found only one person with a somewhat similar problem (Q&A: How to make 16bit DDR + OpenGL/OpenVG working on imx6 solo?). I tried adjusting PMU_REG_CORE[REG1_TARG], with no luck.

Does anyone have any suggestions? Any help would be greatly appreciated.

Attached is my kernel startup log in case it helps.

 

**************************

***   GPU STATE DUMP   ***

**************************

  axi      = 0x00000051

  idle     = 0x7FFFFFFF

  DMA appears to be stuck at this address:

    0x19FB1858

  dmaLow   = 0x15400800

  dmaHigh  = 0x01FC0A40

  dmaState = 0x00000000

    command state       = 0 (PAR_IDLE_ST)

    command DMA state   = 0 (CMD_IDLE_ST)

    command fetch state = 0 (FET_IDLE_ST)

    DMA request state   = 0 (REQ_IDLE_ST)

    cal state           = 0 (CAL_IDLE_ST)

    VE request state    = 0 (VER_IDLE_ST)

  Debug registers of pipe[0]:

    RA debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0x0BFF0000

      [0x05] 0xA0045000

      [0x06] 0x81BA8000

      [0x07] 0x00000000

      [0x08] 0x00000000

      [0x09] 0x00000000

      [0x0A] 0x00000000

      [0x0B] 0x00000000

      [0x0C] 0x12344321

      [0x0D] 0x12344321

      [0x0E] 0x12344321

      [0x0F] 0x12344321

      signature = 0x12344321 (1 read attempt(s))

    TX debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0x00000000

      [0x05] 0x00000000

      [0x06] 0x00000000

      [0x07] 0x00000000

      [0x08] 0x00000000

      [0x09] 0x00000000

      [0x0A] 0x00000000

      [0x0B] 0x00000000

      [0x0C] 0x00000000

      [0x0D] 0x00000000

      [0x0E] 0x00000000

      [0x0F] 0x00000000

      failed to obtain the signature (read 0x00000000).

    FE debug registers:

      [0x00] 0x19FB1858

      [0x01] 0x15400800

      [0x02] 0x01FC0A40

      [0x03] 0x00000256

      [0x04] 0x0408074C

      [0x05] 0x00000000

      [0x06] 0x00009571

      [0x07] 0x00007645

      [0x08] 0x0000000E

      [0x09] 0x00000000

      [0x0A] 0x00000000

      [0x0B] 0x00000000

      [0x0C] 0x00000000

      [0x0D] 0xA3010000

      [0x0E] 0x00000096

      [0x0F] 0xBABEF00D

      signature = 0xBABEF00D (1 read attempt(s))

    PE debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0xA0000000

      [0x05] 0xABC00000

      [0x06] 0xBC000000

      [0x07] 0xCDE00000

      [0x08] 0xD04045C0

      [0x09] 0x204045C0

      [0x0A] 0x0D863284

      [0x0B] 0x00000000

      [0x0C] 0xBABEF00D

      [0x0D] 0xBABEF00D

      [0x0E] 0xBABEF00D

      [0x0F] 0xBABEF00D

      signature = 0xBABEF00D (1 read attempt(s))

    DE debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0x00000000

      [0x05] 0x00000000

      [0x06] 0x00000000

      [0x07] 0x00000000

      [0x08] 0x00000000

      [0x09] 0x00000000

      [0x0A] 0x00000000

      [0x0B] 0x00000000

      [0x0C] 0x00000000

      [0x0D] 0x00000000

      [0x0E] 0x00000000

      [0x0F] 0x00000000

      failed to obtain the signature (read 0x00000000).

    SH debug registers:

      [0x00] 0x80FFEAAB

      [0x01] 0x555F0000

      [0x02] 0x0001FF05

      [0x03] 0x00010AAA

      [0x04] 0x00000000

      [0x05] 0x000064FA

      [0x06] 0x000064FA

      [0x07] 0x001F6FB0

      [0x08] 0x001F6322

      [0x09] 0x000000F0

      [0x0A] 0x0000002C

      [0x0B] 0x00000000

      [0x0C] 0x00000000

      [0x0D] 0x00000000

      [0x0E] 0x00040A90

      [0x0F] 0xDEADBEEF

      signature = 0xDEADBEEF (1 read attempt(s))

    PA debug registers:

      [0x00] 0x800003FF

      [0x01] 0x26280000

      [0x02] 0x00000800

      [0x03] 0x00000000

      [0x04] 0x00000000

      [0x05] 0x00000000

      [0x06] 0x00000000

      [0x07] 0x00000000

      [0x08] 0x00000000

      [0x09] 0x0000AAAA

      [0x0A] 0x0000AAAA

      [0x0B] 0x0000AAAA

      [0x0C] 0x0000AAAA

      [0x0D] 0x0000AAAA

      [0x0E] 0x0000AAAA

      [0x0F] 0x0000AAAA

      signature = 0x0000AAAA (1 read attempt(s))

    SE debug registers:

      [0x00] 0x00000003

      [0x01] 0x00000003

      [0x02] 0x00000003

      [0x03] 0x00000003

      [0x04] 0x00000003

      [0x05] 0x00000003

      [0x06] 0x00000003

      [0x07] 0x00000003

      [0x08] 0x00000003

      [0x09] 0x00000003

      [0x0A] 0x00000003

      [0x0B] 0x00000003

      [0x0C] 0x00000003

      [0x0D] 0x00000003

      [0x0E] 0x00000003

      [0x0F] 0x00000003

      failed to obtain the signature (read 0x00000003).

    MC debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0x12345678

      [0x05] 0x12345678

      [0x06] 0x12345678

      [0x07] 0x12345678

      [0x08] 0x12345678

      [0x09] 0x12345678

      [0x0A] 0x12345678

      [0x0B] 0x12345678

      [0x0C] 0x12345678

      [0x0D] 0x12345678

      [0x0E] 0x12345678

      [0x0F] 0x12345678

      signature = 0x12345678 (1 read attempt(s))

    HI debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0xAAAAAAAA

      [0x04] 0xAAAAAAAA

      [0x05] 0xAAAAAAAA

      [0x06] 0xAAAAAAAA

      [0x07] 0xAAAAAAAA

      [0x08] 0xAAAAAAAA

      [0x09] 0xAAAAAAAA

      [0x0A] 0xAAAAAAAA

      [0x0B] 0xAAAAAAAA

      [0x0C] 0xAAAAAAAA

      [0x0D] 0xAAAAAAAA

      [0x0E] 0xAAAAAAAA

      [0x0F] 0xAAAAAAAA

      signature = 0xAAAAAAAA (1 read attempt(s))

    Other Registers:

      [0x0040] 0x001205B9

      [0x0044] 0x0038C0A0

      [0x004C] 0x0038C0A0

      [0x0050] 0x00071814

      [0x0054] 0x00071814

      [0x0058] 0x001205B9

      [0x005C] 0x000276FA

      [0x0060] 0x000276FA

      [0x043C] 0x00000000

      [0x0440] 0x00000000

      [0x0444] 0x00000000

      [0x0414] 0x3C000000

  Debug registers of pipe[1]:

    RA debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0x0BFF0000

      [0x05] 0xA0045000

      [0x06] 0x81BA8000

      [0x07] 0x00000000

      [0x08] 0x00000000

      [0x09] 0x00000000

      [0x0A] 0x00000000

      [0x0B] 0x00000000

      [0x0C] 0x12344321

      [0x0D] 0x12344321

      [0x0E] 0x12344321

      [0x0F] 0x12344321

      signature = 0x12344321 (1 read attempt(s))

    TX debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0x00000000

      [0x05] 0x00000000

      [0x06] 0x00000000

      [0x07] 0x00000000

      [0x08] 0x00000000

      [0x09] 0x00000000

      [0x0A] 0x00000000

      [0x0B] 0x00000000

      [0x0C] 0x00000000

      [0x0D] 0x00000000

      [0x0E] 0x00000000

      [0x0F] 0x00000000

      failed to obtain the signature (read 0x00000000).

    FE debug registers:

      [0x00] 0x19FB1858

      [0x01] 0x15400800

      [0x02] 0x01FC0A40

      [0x03] 0x00000256

      [0x04] 0x0408074C

      [0x05] 0x00000000

      [0x06] 0x00009571

      [0x07] 0x00007645

      [0x08] 0x0000000E

      [0x09] 0x00000000

      [0x0A] 0x00000000

      [0x0B] 0x00000000

      [0x0C] 0x00000000

      [0x0D] 0xA3010000

      [0x0E] 0x00000097

      [0x0F] 0xBABEF00D

      signature = 0xBABEF00D (1 read attempt(s))

    PE debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0xA0000000

      [0x05] 0xABC00000

      [0x06] 0xBC000000

      [0x07] 0xCDE00000

      [0x08] 0xD04045C0

      [0x09] 0x204045C0

      [0x0A] 0x0D863284

      [0x0B] 0x00000000

      [0x0C] 0xBABEF00D

      [0x0D] 0xBABEF00D

      [0x0E] 0xBABEF00D

      [0x0F] 0xBABEF00D

      signature = 0xBABEF00D (1 read attempt(s))

    DE debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0x00000000

      [0x05] 0x00000000

      [0x06] 0x00000000

      [0x07] 0x00000000

      [0x08] 0x00000000

      [0x09] 0x00000000

      [0x0A] 0x00000000

      [0x0B] 0x00000000

      [0x0C] 0x00000000

      [0x0D] 0x00000000

      [0x0E] 0x00000000

      [0x0F] 0x00000000

      failed to obtain the signature (read 0x00000000).

    SH debug registers:

      [0x00] 0x80FFEAAB

      [0x01] 0x555F0000

      [0x02] 0x0001FF05

      [0x03] 0x00010AAA

      [0x04] 0x00000000

      [0x05] 0x000064FA

      [0x06] 0x000064FA

      [0x07] 0x001F6FB0

      [0x08] 0x001F6322

      [0x09] 0x000000F0

      [0x0A] 0x0000002C

      [0x0B] 0x00000000

      [0x0C] 0x00000000

      [0x0D] 0x00000000

      [0x0E] 0x00040A90

      [0x0F] 0xDEADBEEF

      signature = 0xDEADBEEF (1 read attempt(s))

    PA debug registers:

      [0x00] 0x800003FF

      [0x01] 0x26280000

      [0x02] 0x00000800

      [0x03] 0x00000000

      [0x04] 0x00000000

      [0x05] 0x00000000

      [0x06] 0x00000000

      [0x07] 0x00000000

      [0x08] 0x00000000

      [0x09] 0x0000AAAA

      [0x0A] 0x0000AAAA

      [0x0B] 0x0000AAAA

      [0x0C] 0x0000AAAA

      [0x0D] 0x0000AAAA

      [0x0E] 0x0000AAAA

      [0x0F] 0x0000AAAA

      signature = 0x0000AAAA (1 read attempt(s))

    SE debug registers:

      [0x00] 0x00000003

      [0x01] 0x00000003

      [0x02] 0x00000003

      [0x03] 0x00000003

      [0x04] 0x00000003

      [0x05] 0x00000003

      [0x06] 0x00000003

      [0x07] 0x00000003

      [0x08] 0x00000003

      [0x09] 0x00000003

      [0x0A] 0x00000003

      [0x0B] 0x00000003

      [0x0C] 0x00000003

      [0x0D] 0x00000003

      [0x0E] 0x00000003

      [0x0F] 0x00000003

      failed to obtain the signature (read 0x00000003).

    MC debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0x00000000

      [0x04] 0x12345678

      [0x05] 0x12345678

      [0x06] 0x12345678

      [0x07] 0x12345678

      [0x08] 0x12345678

      [0x09] 0x12345678

      [0x0A] 0x12345678

      [0x0B] 0x12345678

      [0x0C] 0x12345678

      [0x0D] 0x12345678

      [0x0E] 0x12345678

      [0x0F] 0x12345678

      signature = 0x12345678 (1 read attempt(s))

    HI debug registers:

      [0x00] 0x00000000

      [0x01] 0x00000000

      [0x02] 0x00000000

      [0x03] 0xAAAAAAAA

      [0x04] 0xAAAAAAAA

      [0x05] 0xAAAAAAAA

      [0x06] 0xAAAAAAAA

      [0x07] 0xAAAAAAAA

      [0x08] 0xAAAAAAAA

      [0x09] 0xAAAAAAAA

      [0x0A] 0xAAAAAAAA

      [0x0B] 0xAAAAAAAA

      [0x0C] 0xAAAAAAAA

      [0x0D] 0xAAAAAAAA

      [0x0E] 0xAAAAAAAA

      [0x0F] 0xAAAAAAAA

      signature = 0xAAAAAAAA (1 read attempt(s))

    Other Registers:

      [0x0040] 0x00129E3F

      [0x0044] 0x0038C8D8

      [0x004C] 0x0038C8D8

      [0x0050] 0x0007191B

      [0x0054] 0x0007191B

      [0x0058] 0x00129E3F

      [0x005C] 0x00028A0A

      [0x0060] 0x00028A0A

      [0x043C] 0x00000000

      [0x0440] 0x00000000

      [0x0444] 0x00000000

      [0x0414] 0x3C000000

[<c0053fc4>] (unwind_backtrace+0x0/0x138) from [<c0471928>] (gckOS_DumpCallStack+0x8/0x10)

[<c0471928>] (gckOS_DumpCallStack+0x8/0x10) from [<c04844f0>] (gckHARDWARE_DumpGPUState+0x63c/0x834)

[<c04844f0>] (gckHARDWARE_DumpGPUState+0x63c/0x834) from [<c04708c8>] (gckOS_Broadcast+0x38/0xe8)

[<c04708c8>] (gckOS_Broadcast+0x38/0xe8) from [<c04744d0>] (gckKERNEL_Dispatch+0x1020/0x1228)

[<c04744d0>] (gckKERNEL_Dispatch+0x1020/0x1228) from [<c046ccbc>] (drv_ioctl+0x120/0x270)

[<c046ccbc>] (drv_ioctl+0x120/0x270) from [<c0140108>] (do_vfs_ioctl+0x80/0x54c)

[<c0140108>] (do_vfs_ioctl+0x80/0x54c) from [<c014060c>] (sys_ioctl+0x38/0x5c)

[<c014060c>] (sys_ioctl+0x38/0x5c) from [<c004c900>] (ret_fast_syscall+0x0/0x30)

Original Attachment has been moved to: startup_log.txt.zip

5 Replies

igorpadykov
NXP TechSupport

Hi Sheldon,

You can check whether the problem is caused by the GPU supply: the PU_CAP pin can be monitored. You can also try different GPU memory settings; 128M of gpumem is recommended with 1G of system memory.

Best regards,

igor

sheldonrucker
Contributor III

Hello Igor,

Thank you for the suggestions. 

We monitored VDDPU_CAP_1 through VDDPU_CAP_7 (they are connected together) and found that the voltage varied between 1.18V and 1.25V. It was initially at 1.18V, but when the "DMA appears to be stuck" message was displayed it rose to 1.25V. I'm not sure, but it looks like the voltage increased as the GPU started to perform some operations.

We have 4GB of RAM on our system, so I tried setting gpumem to 128M, 256M, 512M, and 1G. In all cases the unit continued to display the "DMA appears to be stuck" message.

Thank you,

Sheldon

igorpadykov
NXP TechSupport

Hi Sheldon,

To further narrow down the issue, you can run with the kernel parameter maxcpus=1 (no SMP) to exclude ARM errata, try the kernel parameters enable_wait_mode=off and ldo_active=off/on, and check the kernel option CONFIG_MX6_VPU_352M (which increases the SOC/PU voltage for the VPU at 352MHz).

Best regards,

igor

sheldonrucker
Contributor III

Hello Igor,

Our board currently uses a modified version of the sabrelite kernel board file (board-mx6q_sabrelite.c). I found that if we switch to the sabresd version (board-mx6q_sabresd.c), the DMA hang error goes away. To narrow down the issue, I modified board-mx6q_sabresd.c, replacing its functionality piece by piece with functionality from board-mx6q_sabrelite.c. This let me track the difference that appears to cause the DMA hang error down to the following lines.

static struct ipuv3_fb_platform_data sabrelite_fb_data[] = {
	{ /*fb0*/
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-XGA",
		.default_bpp = 16,
		.int_clk = false,
	}, {
		.disp_dev = "lcd",
		.interface_pix_fmt = IPU_PIX_FMT_RGB565,
		.mode_str = "CLAA-WVGA",
		.default_bpp = 16,
		.int_clk = false,
	}, {
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-SVGA",
		.default_bpp = 16,
		.int_clk = false,
	}, {
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-VGA",
		.default_bpp = 16,
		.int_clk = false,
	},
};

Above we see that the sabrelite version uses four sets of frame buffer data.  The sabresd version (shown below) only uses three sets.

static struct ipuv3_fb_platform_data sabresd_fb_data[] = {
	{ /*fb0*/
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-XGA",
		.default_bpp = 16,
		.int_clk = false,
		.late_init = false,
	}, {
		.disp_dev = "hdmi",
		.interface_pix_fmt = IPU_PIX_FMT_RGB24,
		.mode_str = "1920x1080M@60",
		.default_bpp = 32,
		.int_clk = false,
		.late_init = false,
	}, {
		.disp_dev = "ldb",
		.interface_pix_fmt = IPU_PIX_FMT_RGB666,
		.mode_str = "LDB-XGA",
		.default_bpp = 16,
		.int_clk = false,
		.late_init = false,
	},
};

If I comment out any one of the four sets of frame buffer data in the sabrelite version, the DMA hang no longer occurs. Our system does not require four displays, so this will work as a fix. However, I'm perplexed as to why the DMA hang error appears with four sets of frame buffer data and not with three.

My initial thought is that it's a memory issue, where allocating the resources for all four displays leaves the GPU without enough memory. However, I took fbmem down as low as it could go and found that the DMA hang error was still present. Any thoughts?

Thank you,

Sheldon

igorpadykov
NXP TechSupport

Sheldon,

This is an interesting question, but the sabrelite software is written by Boundary Devices, so you can post the question on their forum.

You can also try to debug it yourself; there is plenty of literature on that topic, for example at the link below:

Kernel crash - using_objdump to disasm_debug_bug.pdf

https://community.freescale.com/message/465364#465364

~igor
