How do I debug Kernel Exceptions?


peterreichert
Contributor I

I am seeing multiple kernel exceptions in tcp_transmit_skb(); two are shown below. We have 44 boards running, and only a couple are failing. The boards are new instances of an old design that has been working for several years. We recently changed U-Boot, but I can't say what changed because we don't have the source code for the previous U-Boot. I haven't touched the kernel.

I have got to believe there is information in these logs that would help us debug the problem, but I can't find any documentation. For example, what does "TRAP: 0700" mean?

The problem only occurs when the board is being heavily used and the ambient air temperature is around 40C.

Jan  3 21:51:56 freescale user.emerg kernel: skb_under_panic: text:c02190d0 len:74 put:32 head:d8aaf400 data:d8aaf3e0 tail:0xd8aaf42a end:0xd8aaf4a0 dev:<NULL>

Jan  3 21:51:56 freescale user.emerg kernel: ------------[ cut here ]------------

Jan  3 21:51:56 freescale user.crit kernel: Kernel BUG at c01d51a0 [verbose debug info unavailable]

Jan  3 21:51:56 freescale user.warn kernel: Oops: Exception in kernel mode, sig: 5 [#1]

Jan  3 21:51:56 freescale user.warn kernel: SMP NR_CPUS=2 P2020 DS

Jan  3 21:51:56 freescale user.warn kernel: Modules linked in:

Jan  3 21:51:56 freescale user.warn kernel: NIP: c01d51a0 LR: c01d51a0 CTR: c0188888

Jan  3 21:51:56 freescale user.warn kernel: REGS: d7e4fb50 TRAP: 0700   Not tainted  (2.6.32-svn5914)

Jan  3 21:51:56 freescale user.warn kernel: MSR: 00029000 <EE,ME,CE>  CR: 22442424  XER: 20000000

Jan  3 21:51:56 freescale user.warn kernel: TASK = dbe9a600[599] 'python' THREAD: d7e4e000 CPU: 1

Jan  3 21:51:56 freescale user.warn kernel: GPR00: c01d51a0 d7e4fc00 dbe9a600 00000079 00021000 ffffffff c0189440 00000000

Jan  3 21:51:56 freescale user.warn kernel: GPR08: 000028b6 c03149ac 00000073 00465000 22442422 101a3bfc 00000000 00000000

Jan  3 21:51:56 freescale user.warn kernel: GPR16:

Jan  3 21:51:56 freescale user.info kernel: 00000003

Jan  3 21:51:56 freescale user.info kernel: 00010000 00000000 00004000 d7e4fce8 dbe7a7dc c03464e4 00000000

Jan  3 21:51:56 freescale user.warn kernel: GPR24: dbe7a7dc 00000020 c03390a4 00000020 d7e4fc30 d7592fec d7592fc8 d7592fc8

Jan  3 21:51:56 freescale user.warn kernel: NIP [c01d51a0] skb_under_panic+0x48/0x5c

Jan  3 21:51:56 freescale user.warn kernel: LR [c01d51a0] skb_under_panic+0x48/0x5c

Jan  3 21:51:56 freescale user.warn kernel: Call Trace:

Jan  3 21:51:56 freescale user.warn kernel: [d7e4fc00] [c01d51a0] skb_under_panic+0x48/0x5c (unreliable)

Jan  3 21:51:56 freescale user.warn kernel: [d7e4fc10] [c01d71a4] skb_push+0x58/0x60

Jan  3 21:51:56 freescale user.warn kernel: [d7e4fc20] [c02190d0] tcp_transmit_skb+0xdc/0x760

Jan  3 21:51:56 freescale user.warn kernel: [d7e4fc80] [c021bfb8] tcp_write_xmit+0x1fc/0x480

Jan  3 21:51:56 freescale user.warn kernel: [d7e4fcd0] [c021c2a8] __tcp_push_pending_frames+0x38/0xb8

Jan  3 21:51:56 freescale user.warn kernel: [d7e4fce0] [c020e474] tcp_sendmsg+0x1bc/0xc04

Jan  3 21:51:56 freescale user.warn kernel: [d7e4fd60] [c01d0178] sock_sendmsg+0xb4/0xec

Jan  3 21:51:56 freescale user.warn kernel: [d7e4fe40] [c01d050c] sys_sendto+0xbc/0xf0

Jan  3 21:51:56 freescale user.warn kernel: [d7e4ff10] [c01d1090] sys_socketcall+0x1c0/0x238

Jan  3 21:51:56 freescale user.warn kernel: [d7e4ff40] [c000facc] ret_from_syscall+0x0/0x3c

Jan  3 21:51:56 freescale user.warn kernel: Instruction dump:

Jan  3 21:51:56 freescale user.warn kernel: 2f800000 80e30098 8103009c 81230090 81430094 419e0024 3c60c02c 90010008

Jan  3 21:51:56 freescale user.warn kernel: 7ca42b78 38637a98 7d655b78 480819b1 <0fe00000> 48000000 3c80c02a 3804131c

Jan  3 21:51:56 freescale user.warn kernel: ---[ end trace d0a44476c96c002e ]---

Jan  4 00:53:19 freescale auth.info login[662]: root login on 'pts/0'

Jan  4 01:39:22 freescale user.emerg kernel: skb_under_panic: text:c02190d0 len:95 put:32 head:d8975a00 data:d89759e0 tail:0xd8975a3f end:0xd8975aa0 dev:<NULL>

Jan  4 01:39:22 freescale user.emerg kernel: ------------[ cut here ]------------

Jan  4 01:39:23 freescale user.crit kernel: Kernel BUG at c01d51a0 [verbose debug info unavailable]

Jan  4 01:39:23 freescale user.warn kernel: Oops: Exception in kernel mode, sig: 5 [#2]

Jan  4 01:39:23 freescale user.warn kernel: SMP NR_CPUS=2 P2020 DS

Jan  4 01:39:23 freescale user.warn kernel: Modules linked in:

Jan  4 01:39:23 freescale user.warn kernel: NIP: c01d51a0 LR: c01d51a0 CTR: c0188888

Jan  4 01:39:23 freescale user.warn kernel: REGS: dbee3800 TRAP: 0700   Tainted: G      D     (2.6.32-svn5914)

Jan  4 01:39:23 freescale user.warn kernel: MSR: 00029000 <EE,ME,CE>  CR: 24422424  XER: 20000000

Jan  4 01:39:23 freescale user.warn kernel: TASK = dbe9af80[160] 'SIO3_SuperAppli' THREAD: dbee2000 CPU: 0

Jan  4 01:39:23 freescale user.warn kernel: GPR00: c01d51a0 dbee38b0 dbe9af80 00000079 00021000 ffffffff c0189440 00000000

Jan  4 01:39:23 freescale user.warn kernel: GPR08: 00002f5a c03149ac 00000073 00455000 24422422 1010c278 00000000 00000000

Jan  4 01:39:23 freescale user.warn kernel: GPR16: 00000003 00010000 00000000 000005a8 dbee3998 dbe7893c c03464e4 00000000

Jan  4 01:39:23 freescale user.warn kernel: GPR24: dbe7893c 00000020 c03390a4 00000020 dbee38e0 d47d07ac d47d0788 d47d0788

Jan  4 01:39:23 freescale user.warn kernel: NIP [c01d51a0] skb_under_panic+0x48/0x5c

Jan  4 01:39:23 freescale user.warn kernel: LR [c01d51a0] skb_under_panic+0x48/0x5c

Jan  4 01:39:23 freescale user.warn kernel: Call Trace:

Jan  4 01:39:23 freescale user.warn kernel: [dbee38b0] [c01d51a0] skb_under_panic+0x48/0x5c (unreliable)

Jan  4 01:39:23 freescale user.warn kernel: [dbee38c0] [c01d71a4] skb_push+0x58/0x60

Jan  4 01:39:23 freescale user.warn kernel: [dbee38d0] [c02190d0] tcp_transmit_skb+0xdc/0x760

Jan  4 01:39:23 freescale user.warn kernel: [dbee3930] [c021bfb8] tcp_write_xmit+0x1fc/0x480

Jan  4 01:39:23 freescale user.warn kernel: [dbee3980] [c021c2a8] __tcp_push_pending_frames+0x38/0xb8

Jan  4 01:39:23 freescale user.warn kernel: [dbee3990] [c020e474] tcp_sendmsg+0x1bc/0xc04

Jan  4 01:39:23 freescale user.warn kernel: [dbee3a10] [c01d0178] sock_sendmsg+0xb4/0xec

Jan  4 01:39:23 freescale user.warn kernel: [dbee3af0] [c01d0578] kernel_sendmsg+0x2c/0x44

Jan  4 01:39:23 freescale user.warn kernel: [dbee3b00] [c010ad38] smb_sendv+0x104/0x304

Jan  4 01:39:23 freescale user.warn kernel: [dbee3b80] [c010b000] SendReceive2+0xc8/0x4f4

Jan  4 01:39:23 freescale user.warn kernel: [dbee3bc0] [c00f6fb0] CIFSSMBRead+0x16c/0x320

Jan  4 01:39:23 freescale user.warn kernel: [dbee3c10] [c010356c] T.1018+0xf8/0x2ac

Jan  4 01:39:23 freescale user.warn kernel: [dbee3c70] [c01037b4] cifs_readpage_worker+0x94/0x1ec

Jan  4 01:39:23 freescale user.warn kernel: [dbee3ca0] [c0103a9c] cifs_write_begin+0x190/0x210

Jan  4 01:39:23 freescale user.warn kernel: [dbee3ce0] [c0065958] generic_perform_write+0xc0/0x1e8

Jan  4 01:39:23 freescale user.warn kernel: [dbee3d40] [c0067708] generic_file_buffered_write+0x64/0xec

Jan  4 01:39:23 freescale user.warn kernel: [dbee3d80] [c0067d0c] __generic_file_aio_write+0x33c/0x50c

Jan  4 01:39:23 freescale user.warn kernel: [dbee3df0] [c0067f4c] generic_file_aio_write+0x70/0xf0

Jan  4 01:39:23 freescale user.warn kernel: [dbee3e20] [c00ee61c] cifs_file_aio_write+0x20/0x50

Jan  4 01:39:23 freescale user.warn kernel: [dbee3e30] [c00921bc] do_sync_write+0xc4/0x138

Jan  4 01:39:23 freescale user.warn kernel: [dbee3ef0] [c00922e4] vfs_write+0xb4/0x10c

Jan  4 01:39:23 freescale user.warn kernel: [dbee3f10] [c0092424] sys_write+0x4c/0x90

Jan  4 01:39:23 freescale user.warn kernel: [dbee3f40] [c000facc] ret_from_syscall+0x0/0x3c

Jan  4 01:39:23 freescale user.warn kernel: Instruction dump:

Jan  4 01:39:23 freescale user.warn kernel: 2f800000 80e30098 8103009c 81230090 81430094 419e0024 3c60c02c 90010008

Jan  4 01:39:23 freescale user.warn kernel: 7ca42b78 38637a98 7d655b78 480819b1 <0fe00000> 48000000 3c80c02a 3804131c

Jan  4 01:39:23 freescale user.warn kernel: ---[ end trace d0a44476c96c002f ]---

Jan  4 02:37:38 freescale user.debug kernel: prune_queue: c=70c81daf

Jan  4 03:28:45 freescale user.debug kernel: prune_queue: c=70c81daf

Jan  4 04:18:53 freescale user.debug kernel: prune_queue: c=70c81daf


scottwood
NXP Employee

You should turn on CONFIG_DEBUG_BUGVERBOSE to get better reporting of such events (it adds the source file and line number to the "Kernel BUG at" message), but in this case the preceding log line already says that the BUG came from a call to skb_under_panic(). Since the failure is temperature-dependent, it's probably not a software issue.
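The skb_under_panic line itself carries the buffer geometry, so it can be decoded without any extra kernel options. The following is a minimal sketch, not a kernel tool: the function names `parse_skb_panic` and `headroom_report` are made up for illustration, and it assumes the field layout seen in the logs above (`len` and `put` in decimal, the pointers in hex, values printed after skb_push() has already moved `data`).

```python
import re

# Matches the numeric fields of the panic line; "dev:<NULL>" is ignored.
FIELD_RE = re.compile(r"(text|len|put|head|data|tail|end):(0x)?([0-9a-fA-F]+)")


def parse_skb_panic(line):
    """Return the numeric fields of an skb_under_panic log line as a dict."""
    fields = {}
    for name, _prefix, value in FIELD_RE.findall(line):
        # len and put are printed in decimal, everything else is a hex address.
        base = 10 if name in ("len", "put") else 16
        fields[name] = int(value, base)
    return fields


def headroom_report(fields):
    """Summarise how far 'data' was pushed below 'head'."""
    after_push = fields["data"] - fields["head"]   # negative means underflow
    return {
        "headroom_before_push": after_push + fields["put"],
        "underflow_bytes": max(0, -after_push),
        "linear_bytes": fields["tail"] - fields["data"],
    }


if __name__ == "__main__":
    line = ("skb_under_panic: text:c02190d0 len:74 put:32 head:d8aaf400 "
            "data:d8aaf3e0 tail:0xd8aaf42a end:0xd8aaf4a0 dev:<NULL>")
    print(headroom_report(parse_skb_panic(line)))
```

For both oopses in the original post this reports zero headroom before the push and a 32-byte underflow, i.e. tcp_transmit_skb() tried to prepend a 32-byte header to a buffer whose `data` pointer was already at `head`. The `text:` value (c02190d0) matches the tcp_transmit_skb+0xdc entry in the call trace.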

The trap number tells you what sort of exception you took -- in this case, it's a program exception (see "EXC_XFER_STD(0x0700, program_check_exception)").  This program exception was deliberately triggered by BUG() in order to generate a backtrace.
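The oops carries two more hints that agree with this. "sig: 5" is SIGTRAP on Linux, and the word marked with angle brackets in the instruction dump, `<0fe00000>`, is the instruction at NIP. As a rough sketch, it can be decoded by hand. The `TRAP_NAMES` table below is a partial list written from memory of the PowerPC exception vectors (only 0x0700 is confirmed by the reply above), so check it against the exception table in your kernel tree; `decode_twi` is a hypothetical helper name.

```python
# Partial map of PowerPC trap numbers to exception names.
# Assumption: entries other than 0x0700 are from memory, verify in your tree.
TRAP_NAMES = {
    0x0300: "data storage",
    0x0400: "instruction storage",
    0x0600: "alignment",
    0x0700: "program",
    0x0900: "decrementer",
    0x0C00: "system call",
}


def decode_twi(word):
    """Decode a 32-bit PowerPC word as 'twi TO,RA,SIMM'.

    Returns (TO, RA, SIMM), or None if the primary opcode is not 3 (twi).
    """
    if (word >> 26) & 0x3F != 3:
        return None
    to = (word >> 21) & 0x1F
    ra = (word >> 16) & 0x1F
    simm = word & 0xFFFF
    if simm >= 0x8000:          # sign-extend the 16-bit immediate
        simm -= 0x10000
    return to, ra, simm


if __name__ == "__main__":
    print(TRAP_NAMES.get(0x0700, "unknown"))
    print(decode_twi(0x0FE00000))
```

The marked word decodes to `twi 31,0,0`, an unconditional trap, which is consistent with a deliberate BUG() rather than a corrupted instruction fetch.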
