gdb: SIGTRAP running any program

caryo_brien
Contributor III

We have a custom P1022 board and are using QorIQ-SDK-V1.6 to build the software. U-Boot, the kernel, and the rootfs all run.

However, if we try to use gdbserver or even gdb on the board, then immediately upon starting the program (i.e. r) we get

warning: [   68.801449] Oops: Exception in kernel mode, sig: 5 [#2]
[   68.807255] SMP NR_CPUS=2 CTI NED
[   68.810564] Modules linked in: cy14b101_nvram(O)
[   68.815183] CPU: 0 PID: 1958 Comm: cp_client Tainted: G      D    O 3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a #1
[   68.824915] task: b78c9f80 ti: cfff8000 task.ti: b7858000
[   68.830306] NIP: b000f86c LR: b000f8f4 CTR: b0066bbc
[   68.835262] REGS: cfff9f10 TRAP: 2002   Tainted: G      D    O  (3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a)
[   68.844556] MSR: 00021000 <CE,ME>  CR: 22000a22  XER: 00000000
[   68.850396]
GPR00: b000f8f4 b7859f40 b78c9f80 00000000 00000002 00000000 00000000 00000100
GPR08: b7858060 b7858000 00021202 00021000 0020103c 00000000 00000000 10000000
GPR16: 0fff248c 0fff217c fffff000 00000000 0fff0f40 9f8b9668 00000000 0fff20e8
GPR24: 0ffdd678 0fff1d38 0fff2190 0fff0c78 0fff0cd8 0fff1820 0fff1810 9f8b9650
[   68.880134] NIP [b000f86c] recheck+0x10/0x24
[   68.884399] LR [b000f8f4] do_user_signal+0x74/0xc4
[   68.889179] Call Trace:
[   68.891621] [b7859f40] [b000f8f4] do_user_signal+0x74/0xc4 (unreliable)
[   68.898234] --- Exception: 0 at 0xffcfd3c
[   68.898234]     LR = 0xffc3b24
[   68.905274] Instruction dump:
[   68.908235] 3960ffff 7d704ba6 4e800020 7120000c 41820034 614a8000 7d400124 484e12d1
[   68.915993] 3d400002 614a1202 7d400124 54290024 <81290060> 7120000c 40a2ffdc 7120600e
[   68.923929] ---[ end trace 94924a23f071f21e ]---
[   68.928537]
Could not load shared library symbols for linux-vdso32.so.1.
Do you need "set solib-search-path" or "set sysroot"?

Program terminated with signal SIGTRAP, Trace/breakpoint trap.
The program no longer exists.

Google searching indicates the message about linux-vdso32.so.1 can be ignored.

Note that we had to change CONFIG_LOWMEM_SIZE, CONFIG_PAGE_OFFSET, and CONFIG_KERNEL_START in the defconfig to make space for NOR flash. Could this be a problem?

Ideas?

Thanks,

Cary O'Brien

1 Solution
caryo_brien
Contributor III

I finally found the problem. I had CONFIG_DEBUG_CW set in the kernel configuration. It is described in arch/powerpc/Kconfig.debug simply as "Include CodeWarrior kernel debugging". It seems to change things in arch/powerpc/include/asm/reg_booke.h, arch/powerpc/kernel/idle.c, arch/powerpc/kernel/fsl_booke_entry_mapping.S, and arch/powerpc/kernel/head_fsl_booke.S. One of those must be interfering with how gdb sets breakpoints or how the kernel handles breakpoints.

So if this happens, edit your defconfig, find the DEBUG_CW line, and set it to

# CONFIG_DEBUG_CW is not set
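
To confirm which state an option actually ended up in after a rebuild, a small check like the following may help. It is only a sketch: the helper name kconfig_state and the sample text are made up for illustration, and it assumes the usual .config line formats ("CONFIG_FOO=value" and "# CONFIG_FOO is not set").

```python
import re

def kconfig_state(config_text, option):
    """Return the value if the option is set, 'n' if it is explicitly
    marked "is not set", or None if it does not appear at all."""
    set_re = re.compile(r"^%s=(.*)$" % re.escape(option))
    unset_line = "# %s is not set" % option
    for line in config_text.splitlines():
        line = line.strip()
        if line == unset_line:
            return "n"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return None

# Hypothetical two-line config, before and after the change described above.
sample = "CONFIG_PPC_ADV_DEBUG_REGS=y\nCONFIG_DEBUG_CW=y\n"
fixed = sample.replace("CONFIG_DEBUG_CW=y", "# CONFIG_DEBUG_CW is not set")

print(kconfig_state(sample, "CONFIG_DEBUG_CW"))  # y
print(kconfig_state(fixed, "CONFIG_DEBUG_CW"))   # n
```

In practice config_text would be read from the .config in the kernel build directory, since that file (not the defconfig) is what the build really used.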

Cary O'Brien

scottwood
NXP Employee

What values did you set those CONFIG symbols to?  Can you reproduce this problem without that change (e.g. by leaving flash unmapped and running from a ram or network filesystem)?  Was there any output prior to the "Exception in kernel mode" line?  Is CONFIG_PPC_ADV_DEBUG_REGS enabled?

The taint line shows that this kernel has already had crash output, and has an out-of-tree module loaded -- could you try without the out-of-tree module, and be sure to report the first crash output and context that led up to it?  Have you made any other changes to the kernel?

caryo_brien
Contributor III

I tried it without the loadable module; gdb still terminates with

  Program terminated with signal SIGTRAP, Trace/breakpoint trap.

I backed out the kernel memory layout changes; gdb still terminates with the SIGTRAP.

In the defconfig I have CONFIG_PPC_ADV_DEBUG_REGS=y

One other observation: it is a P1022 with 2 cores, but currently only the first core is running. U-Boot initialization fails for the second core.

Here is the error message from dmesg.

[   76.409265] Oops: Exception in kernel mode, sig: 5 [#1]
[   76.414490] SMP NR_CPUS=2 CTI NED
[   76.417799] Modules linked in:
[   76.420852] CPU: 0 PID: 1626 Comm: cp_client Not tainted 3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a #1
[   76.429629] task: c78fea00 ti: efff8000 task.ti: c7a32000
[   76.435019] NIP: c000f86c LR: c000f8f4 CTR: c0066bbc
[   76.439975] REGS: efff9f10 TRAP: 2002   Not tainted  (3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a)
[   76.448314] MSR: 00021000 <CE,ME>  CR: 22000a22  XER: 00000000
[   76.454153]
[   76.454153] GPR00: c000f8f4 c7a33f40 c78fea00 00000000 00000002 00000000 00000000 00000100
[   76.454153] GPR08: c7a32060 c7a32000 00021202 00021000 0020303c 00000000 00000000 10000000
[   76.454153] GPR16: 0fff248c 0fff217c fffff000 00000000 0fff0f40 bfe64a78 00000000 0fff20e8
[   76.454153] GPR24: 0ffdd678 0fff1d38 0fff2190 0fff0c78 0fff0cd8 0fff1820 0fff1810 bfe64a60
[   76.483895] NIP [c000f86c] recheck+0x10/0x24
[   76.488159] LR [c000f8f4] do_user_signal+0x74/0xc4
[   76.492940] Call Trace:
[   76.495383] [c7a33f40] [c000f8f4] do_user_signal+0x74/0xc4 (unreliable)
[   76.501993] --- Exception: 0 at 0xffcfd3c
[   76.501993]     LR = 0xffc3b24
[   76.509034] Instruction dump:
[   76.511995] 3960ffff 7d704ba6 4e800020 7120000c 41820034 614a8000 7d400124 484e15b1
[   76.519754] 3d400002 614a1202 7d400124 54290024 <81290060> 7120000c 40a2ffdc 7120600e
[   76.527690] ---[ end trace 5dbb35dc8293f831 ]---
[   76.532297]

Default memory layout settings:

CONFIG_LOWMEM_SIZE=0x30000000
CONFIG_LOWMEM_CAM_NUM=3
CONFIG_PAGE_OFFSET=0xc0000000
CONFIG_KERNEL_START=0xc0000000
CONFIG_PHYSICAL_START=0x00000000
CONFIG_PHYSICAL_ALIGN=0x04000000
CONFIG_TASK_SIZE=0xc0000000

Modified kernel memory layout:

CONFIG_ADVANCED_OPTIONS=y
CONFIG_LOWMEM_SIZE_BOOL=y
CONFIG_LOWMEM_SIZE=0x20000000
CONFIG_LOWMEM_CAM_NUM=3
# CONFIG_DYNAMIC_MEMSTART is not set
CONFIG_PAGE_OFFSET_BOOL=y
CONFIG_PAGE_OFFSET=0xB0000000
# CONFIG_KERNEL_START_BOOL is not set
CONFIG_KERNEL_START=0xB0000000
# CONFIG_PHYSICAL_START_BOOL is not set
CONFIG_PHYSICAL_START=0x00000000
CONFIG_PHYSICAL_ALIGN=0x04000000
CONFIG_TASK_SIZE_BOOL=y
CONFIG_TASK_SIZE=0xa0000000
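
For reference, the effect of the two layouts on the 32-bit kernel address space can be worked out with a few lines of arithmetic. This is a rough sketch that assumes lowmem is mapped linearly starting at PAGE_OFFSET; the function name lowmem_window is made up, and what the kernel does with the space above lowmem is not modeled here.

```python
def lowmem_window(page_offset, lowmem_size):
    """Return (start, end, bytes_above) for a linear lowmem mapping
    in a 32-bit virtual address space."""
    start = page_offset
    end = page_offset + lowmem_size
    bytes_above = (1 << 32) - end
    return start, end, bytes_above

# Values copied from the two config listings above.
layouts = {
    "default": (0xC0000000, 0x30000000),
    "modified": (0xB0000000, 0x20000000),
}

for name, (page_offset, lowmem_size) in layouts.items():
    start, end, above = lowmem_window(page_offset, lowmem_size)
    print("%-8s lowmem 0x%08x-0x%08x, %d MiB of virtual space above"
          % (name, start, end, above >> 20))
```

Under that assumption the default layout leaves 256 MiB above lowmem and the modified one leaves 768 MiB, which is the extra room for the flash mapping. The move of PAGE_OFFSET also matches the kernel addresses in the two oopses (c000f86c versus b000f86c).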

Thanks,

Cary O'Brien

scottwood
NXP Employee

I tried running gdb on an e500v2 with an SDK 1.6 kernel, and could not reproduce this.  Could you provide your kernel config?

Again, was there any kernel output before the "Exception in kernel mode" line?

Could you try adding a WARN_ON(1); before the call to die() in _exception() in arch/powerpc/kernel/traps.c?  This should produce a stack trace to see where the SIGTRAP is coming from.

caryo_brien
Contributor III

I have attached defconfig.

I added the WARN_ON(1) and when I run the program under gdb, in dmesg all I get is

[  616.251677] ------------[ cut here ]------------
[  616.256874] WARNING: at b000aa08 [verbose debug info unavailable]
[  616.262956] Modules linked in:
[  616.266011] CPU: 0 PID: 1838 Comm: cp_client Tainted: G      D W    3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a #1
[  616.275743] task: b786b480 ti: cfff8000 task.ti: b79be000
[  616.281132] NIP: b000aa08 LR: b000a928 CTR: 00000000
[  616.286089] REGS: cfff9d60 TRAP: 0700   Tainted: G      D W     (3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a)
[  616.295384] MSR: 00000000 <>  CR: 24000a28  XER: 00000000
[  616.300781]
[  616.300781] GPR00: 00000000 cfff9e10 b786b480 cfff9e98 b05d4ea6 00000003 cfff9eda 7820636f
[  616.300781] GPR08: 64652025 00021000 00000000 b000bae0 44000a20 00000000 00000000 10000000
[  616.300781] GPR16: 0fff248c 0fff217c fffff000 00000000 0fff0f40 9faa3728 00000000 0fff20e8
[  616.300781] GPR24: 0ffdd678 0fff1d38 0fff2190 0fff0c78 b000f874 00030002 cfff9f10 00000005
[  616.330526] NIP [b000aa08] _exception+0x114/0x140
[  616.335224] LR [b000a928] _exception+0x34/0x140
[  616.339745] Call Trace:
[  616.342181] Instruction dump:
[  616.345143] 419eff6c 80a2011c 38610088 811e0080 388201f8 813e0090 7fe6fb78 7f87e378
[  616.352901] 7faaeb78 4cc63182 484e99c5 4bffff40 <0fe00000> 3c60b05d 7fc4f378 7fe5fb78
[  616.360833] ---[ end trace cb933b4c6c6e909a ]---
[  616.365444] Oops: Exception in kernel mode, sig: 5 [#2]
[  616.370658] SMP NR_CPUS=2 CTI NED
[  616.373966] Modules linked in:
[  616.377016] CPU: 0 PID: 1838 Comm: cp_client Tainted: G      D W    3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a #1
[  616.386747] task: b786b480 ti: cfff8000 task.ti: b79be000
[  616.392137] NIP: b000f874 LR: b000f8fc CTR: b0066bbc
[  616.397094] REGS: cfff9f10 TRAP: 2002   Tainted: G      D W     (3.12.19-rt30-QorIQ-SDK-V1.6+gc29fe1a)
[  616.406388] MSR: 00021000 <CE,ME>  CR: 24000a22  XER: 00000000
[  616.412226]
[  616.412226] GPR00: b000f8fc b79bff40 b786b480 00000000 00000002 00000000 00000000 00000100
[  616.412226] GPR08: b79be060 b79be000 00021202 00021000 0020103c 00000000 00000000 10000000
[  616.412226] GPR16: 0fff248c 0fff217c fffff000 00000000 0fff0f40 9faa3728 00000000 0fff20e8
[  616.412226] GPR24: 0ffdd678 0fff1d38 0fff2190 0fff0c78 0fff0cd8 0fff1820 0fff1810 9faa3710
[  616.441965] NIP [b000f874] recheck+0x10/0x24
[  616.446228] LR [b000f8fc] do_user_signal+0x74/0xc4
[  616.451008] Call Trace:
[  616.453450] [b79bff40] [b000f8fc] do_user_signal+0x74/0xc4 (unreliable)
[  616.460063] --- Exception: 0 at 0xffcfd3c
[  616.460063]     LR = 0xffc3b24
[  616.467102] Instruction dump:
[  616.470063] 3960ffff 7d704ba6 4e800020 7120000c 41820034 614a8000 7d400124 484e15a9
[  616.477822] 3d400002 614a1202 7d400124 54290024 <81290060> 7120000c 40a2ffdc 7120600e
[  616.485762] ---[ end trace cb933b4c6c6e909b ]---

I am working on building a kernel with the p1022rdb config from Yocto to run on the board, but this will take a while.

Thanks,

Cary O'Brien

scottwood
NXP Employee

Could you try cherry-picking 6cecf76b47ba6bea3c81d170afc2e0b244e5849c "powerpc/booke64: Fix kernel hangs at kernel_dbg_exc"?  Don't worry about the 64-bit stuff; the change to arch/powerpc/kernel/process.c is what I'm interested in.

If that doesn't help, could you try the latest upstream kernel to see if it's reproducible there?

Based on the above output I believe the SIGTRAP is coming from the _exception() call in DebugException(), which means that somehow single-step interrupts are being enabled inside kernel code.
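
The MSR values in the dumps can be decoded mechanically, which makes it easier to see whether the debug-enable bit (MSR_DE, the bit cleared by the process.c change mentioned above) is involved. The sketch below assumes the Book-E bit positions as I recall them from the kernel's powerpc headers (CE bit 17, EE 15, PR 14, ME 12, DE 9, IS 5, DS 4, RI 1); please check them against reg.h and reg_booke.h in your tree. The names MSR_BITS and decode_msr are made up for this example.

```python
# Assumed Book-E MSR bit positions (bit 0 = least significant bit).
# These are from memory of the Linux powerpc headers, not from this thread.
MSR_BITS = [
    (17, "CE"),  # critical interrupt enable
    (15, "EE"),  # external interrupt enable
    (14, "PR"),  # problem state (user mode)
    (12, "ME"),  # machine check enable
    (9, "DE"),   # debug interrupt enable
    (5, "IS"),   # instruction address space
    (4, "DS"),   # data address space
    (1, "RI"),   # name used by the Linux headers for bit 1
]

def decode_msr(value):
    """Return the names of the known MSR bits set in value."""
    return [name for bit, name in MSR_BITS if value & (1 << bit)]

print(decode_msr(0x00021000))  # MSR reported in the oops
print(decode_msr(0x00021202))  # value held in GPR10 in the same dump
```

The first value decodes to CE and ME, which agrees with the "<CE,ME>" the kernel printed. The second, if the bit table is right, additionally has DE set. Whether that register value is ever written to the MSR is not established in this thread, so treat it as a hint for where to look rather than a conclusion.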

caryo_brien
Contributor III

a) I compared the diff from kernel.googlesource.com with the code from 1.6; the comment and the "mtmsr(mfmsr() & ~MSR_DE);" are already there.

b) I'm not sure exactly where I would get the latest upstream kernel. Is this from Freescale or kernel.org?

c) I'm going to see if I have any luck tracking this down with CW and the USB TAP.

Hopefully we can come up with a fix; our app development guys are not happy to have lost gdb.

Thanks for your help,

Cary O'Brien

scottwood
NXP Employee

As a short-term workaround, you could also try disabling CONFIG_PPC_ADV_DEBUG_REGS.

caryo_brien
Contributor III

I can't seem to change CONFIG_PPC_ADV_DEBUG_REGS (and related values). I change my defconfig and rebuild the kernel, but they revert in .config to

CONFIG_PPC_ADV_DEBUG_REGS=y
CONFIG_PPC_ADV_DEBUG_IACS=2
CONFIG_PPC_ADV_DEBUG_DACS=2
CONFIG_PPC_ADV_DEBUG_DVCS=0

I'm not sure how I should proceed here.

scottwood
NXP Employee

Sorry... I thought it was user-selectable, but apparently not -- and disabling it would enable support for DABR which this hardware doesn't have.
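
When an option is not user-selectable, the value requested in the defconfig is silently replaced during configuration, so it can help to list every option where the resulting .config disagrees with what was asked for. A minimal sketch follows; the helper names and the one-line defconfig are invented for illustration, and only the two standard .config line formats are handled.

```python
def parse_kconfig(text):
    """Map option name -> value, using 'n' for '# CONFIG_X is not set'."""
    suffix = " is not set"
    opts = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            opts[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            opts[name] = value
    return opts

def overridden(requested_text, resulting_text):
    """Return {option: (requested, resulting)} for every requested option
    whose value differs in the resulting config ('absent' if missing)."""
    requested = parse_kconfig(requested_text)
    resulting = parse_kconfig(resulting_text)
    diffs = {}
    for name, want in sorted(requested.items()):
        got = resulting.get(name, "absent")
        if got != want:
            diffs[name] = (want, got)
    return diffs

# Hypothetical defconfig request versus the .config values quoted earlier.
defconfig = "# CONFIG_PPC_ADV_DEBUG_REGS is not set\n"
dot_config = (
    "CONFIG_PPC_ADV_DEBUG_REGS=y\n"
    "CONFIG_PPC_ADV_DEBUG_IACS=2\n"
    "CONFIG_PPC_ADV_DEBUG_DACS=2\n"
    "CONFIG_PPC_ADV_DEBUG_DVCS=0\n"
)

for name, (want, got) in overridden(defconfig, dot_config).items():
    print("%s: requested %s, got %s" % (name, want, got))
```

Any option that shows up in this list is being forced by Kconfig, and the place to look is its definition and dependencies in the Kconfig files rather than the defconfig.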

scottwood
NXP Employee

Upstream would be from kernel.org.
