Gianfar skb errors


113 Views
Contributor I

Hello,

CPU:  e300c1, MPC8343A, Rev: 3.0 

I am experiencing kernel panics on kernel 4.19.87 that appear to be related to socket buffer corruption. Several different crashes occur when traffic is sent or received. Sometimes they occur instantly, other times up to a minute or so after everything is up and configured.

1:

queue_mapping=1 skbaddr=cf929000 protocol=0x0800 ip_summed=0 len=303 data_len=0 network_offset=-78 transport_offset_valid=0 transport_offset=65457 tx_flags=43 gso_size=5 gso_segs=2 gso_type=0x0

[  125.564168] Unable to handle kernel paging request for data at address 0x81000000
[  125.571671] Faulting instruction address: 0xc03c42ec
[  125.576656] Oops: Kernel access of bad area, sig: 11 [#1]
[  125.582065] BE PREEMPT eMPC
[  125.584887] CPU: 0 PID: 0 Comm: swapper Not tainted kernel_upgrade #5
[  125.594567] NIP:  c03c42ec LR: c03c3f60 CTR: c02a6084
[  125.599636] REGS: cfff5ca0 TRAP: 0300   Not tainted  (kernel_upgrade )
[  125.609310] MSR:  00009032 <EE,ME,IR,DR,RI>  CR: 24028224  XER: 20000000
[  125.616054] DAR: 81000000 DSISR: 20000000
[  125.616054] GPR00: c03c3f60 cfff5d50 c067b420 0000001c d101c501 00000000 00000aac c06f1c1b
[  125.616054] GPR08: c06cac48 00000000 cfff4000 cfff5d10 44028224 00900000 cf97c878 00480020
[  125.616054] GPR16: 00480020 c0555a98 cf9290a8 00000000 c0607325 c0607313 c0555aa4 00000800
[  125.616054] GPR24: 00000001 cfff5e28 c0555874 c0555850 c0555810 81000000 cf929000 c0607325
[  125.653754] NIP [c03c42ec] skb_copy_ubufs+0x484/0x4d4
[  125.658827] LR [c03c3f60] skb_copy_ubufs+0xf8/0x4d4
[  125.663714] Call Trace:
[  125.666177] [cfff5d50] [c03c3f60] skb_copy_ubufs+0xf8/0x4d4 (unreliable)
[  125.672916] [cfff5da0] [c03d29a4] __netif_receive_skb_core+0x9c0/0xbd8
[  125.679472] [cfff5e20] [c03d2bf0] __netif_receive_skb_one_core+0x34/0x60
[  125.686208] [cfff5e40] [c03d77f0] netif_receive_skb_internal+0x7c/0xec
[  125.692766] [cfff5e50] [c03d8a24] napi_gro_receive+0xf8/0x124
[  125.698547] [cfff5e70] [c0338610] gfar_clean_rx_ring+0x640/0x674
[  125.704581] [cfff5f00] [c0338808] gfar_poll_rx_sq+0x48/0xdc
[  125.710180] [cfff5f20] [c03d92b0] net_rx_action+0x12c/0x308
[  125.715788] [cfff5f80] [c051b6b0] __do_softirq+0x230/0x32c
[  125.721313] [cfff5fe0] [c00237c0] irq_exit+0x80/0xa0
[  125.726309] [cfff5ff0] [c000e3a4] call_do_irq+0x24/0x3c
[  125.731566] [c06cde80] [c00069c0] do_IRQ+0xb8/0xe0

2:

queue_mapping=1 skbaddr=cf9290c0 protocol=0x0800 ip_summed=0 len=78 data_len=0 network_offset=-78 transport_offset_valid=0 transport_offset=65457 tx_flags=2 gso_size=0 gso_segs=0 gso_type=0x0

[ 63.970458] Unable to handle kernel paging request for data at address 0x008c001a
[ 63.977966] Faulting instruction address: 0xc03c32a8
[ 63.982951] Oops: Kernel access of bad area, sig: 11 [#1]
[ 63.988360] BE PREEMPT eMPC
[ 63.991183] CPU: 0 PID: 0 Comm: swapper Not tainted kernel_upgrade #9
[ 64.000862] NIP: c03c32a8 LR: c03c2fb0 CTR: 00000000
[ 64.005933] REGS: cfff5c30 TRAP: 0300 Not tainted (kernel_upgrade)
[ 64.015606] MSR: 00009032 <EE,ME,IR,DR,RI> CR: 44088224 XER: 00000000
[ 64.022350] DAR: 008c001a DSISR: 20000000
[ 64.022350] GPR00: c03c2fb0 cfff5ce0 c067b420 008c001a 000000ff fedac247 41434143 04010000
[ 64.022350] GPR08: 00000000 00000000 00000000 cfff5d00 44044224 00900000 cf97c878 00480020
[ 64.022350] GPR16: 00000054 00000000 00000000 ce30c078 c05c3782 00000003 c0a80019 c0a800ff
[ 64.022350] GPR24: 00000089 c06d3fbc 00000000 c06ba938 cec666a8 00000000 cec66680 cf9290c0
[ 64.060047] NIP [c03c32a8] kfree_skb_list+0x24/0x40
[ 64.064947] LR [c03c2fb0] skb_release_data+0xc8/0x208
[ 64.070009] Call Trace:
[ 64.072474] [cfff5ce0] [c03c59f8] skb_checksum+0x38/0x48 (unreliable)
[ 64.078944] [cfff5cf0] [c03c2fb0] skb_release_data+0xc8/0x208
[ 64.084715] [cfff5d10] [c03c3164] __kfree_skb+0x24/0x3c
[ 64.089973] [cfff5d20] [c044f3d8] __udp4_lib_rcv+0x6b4/0x8a8
[ 64.095665] [cfff5d80] [c041b940] ip_local_deliver_finish+0x118/0x244
[ 64.102132] [cfff5da0] [c041c534] ip_local_deliver+0x68/0xec
[ 64.107816] [cfff5de0] [c041c610] ip_rcv+0x58/0xc0
[ 64.112636] [cfff5e20] [c03d2c98] __netif_receive_skb_one_core+0x58/0x60
[ 64.119372] [cfff5e40] [c03d7874] netif_receive_skb_internal+0x7c/0xec
[ 64.125928] [cfff5e50] [c03d8aa8] napi_gro_receive+0xf8/0x124
[ 64.131707] [cfff5e70] [c033862c] gfar_clean_rx_ring+0x640/0x674
[ 64.137740] [cfff5f00] [c0338824] gfar_poll_rx_sq+0x48/0xdc
[ 64.143339] [cfff5f20] [c03d9334] net_rx_action+0x12c/0x308
[ 64.148945] [cfff5f80] [c051b7c0] __do_softirq+0x230/0x32c
[ 64.154470] [cfff5fe0] [c00237c0] irq_exit+0x80/0xa0
[ 64.159466] [cfff5ff0] [c000e3a4] call_do_irq+0x24/0x3c
[ 64.164721] [c06cde80] [c00069c0] do_IRQ+0xb8/0xe0

3:

queue_mapping=1 skbaddr=cf9290c0 protocol=0x0800 ip_summed=0 len=78 data_len=0 network_offset=-78 transport_offset_valid=0 transport_offset=65457 tx_flags=160 gso_size=53246 gso_segs=55488 gso_type=0xcffed940

[ 159.781432] BUG: Bad page state in process swapper pfn:0fd4e
[ 159.787209] page:cffed9c0 count:0 mapcount:0 mapping:cf57562c index:0x1
[ 159.793845] flags: 0x0()
[ 159.796408] raw: 00000000 00000100 00000200 cf57562c 00000001 00000000 ffffffff 00000000
[ 159.804518] page dumped because: non-NULL mapping
[ 159.809246] CPU: 0 PID: 0 Comm: swapper Not tainted kernel_upgrade #5
[ 159.818921] Call Trace:
[ 159.821400] [cfff5c60] [c00ae17c] bad_page+0x118/0x11c (unreliable)
[ 159.827698] [cfff5c80] [c00ae3c0] free_pcppages_bulk+0x1b8/0x440
[ 159.833734] [cfff5ce0] [c00afaec] free_unref_page+0x60/0x6c
[ 159.839340] [cfff5cf0] [c03c2f0c] skb_release_data+0xa8/0x208
[ 159.845111] [cfff5d10] [c03c30e0] __kfree_skb+0x24/0x3c
[ 159.850370] [cfff5d20] [c044f2c4] __udp4_lib_rcv+0x6b4/0x8a8
[ 159.856063] [cfff5d80] [c041b82c] ip_local_deliver_finish+0x118/0x244
[ 159.862532] [cfff5da0] [c041c420] ip_local_deliver+0x68/0xec
[ 159.868216] [cfff5de0] [c041c4fc] ip_rcv+0x58/0xc0

There seems to be an issue with how packets are being segmented, almost as if the drivers and subsystems involved disagree about whether GSO/TSO is enabled.
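
As a side note for anyone reading the dumps above, the tx_flags values can be decoded bit by bit. The sketch below is in Python and assumes the SKBTX_* bit assignments as they appear in include/linux/skbuff.h for 4.19 (please verify against your own tree). The helper name decode_tx_flags is made up for illustration.

```python
# Assumed SKBTX_* bit values, as recalled from include/linux/skbuff.h in 4.19.
# Verify against your own kernel tree before relying on them.
SKBTX_FLAGS = {
    0x01: "SKBTX_HW_TSTAMP",
    0x02: "SKBTX_SW_TSTAMP",
    0x04: "SKBTX_IN_PROGRESS",
    0x08: "SKBTX_DEV_ZEROCOPY",
    0x10: "SKBTX_WIFI_STATUS",
    0x20: "SKBTX_SHARED_FRAG",
    0x40: "SKBTX_SCHED_TSTAMP",
}


def decode_tx_flags(value):
    """Return the names of the assumed SKBTX_* bits set in value."""
    names = [name for bit, name in sorted(SKBTX_FLAGS.items()) if value & bit]
    # Report any bits that are not in the assumed table.
    unknown = value & ~sum(SKBTX_FLAGS)
    if unknown:
        names.append("unknown bits 0x%02x" % unknown)
    return names


# tx_flags values taken from the three dumps in this post.
for label, flags in (("crash 1", 43), ("crash 2", 2), ("crash 3", 160)):
    print(label, flags, decode_tx_flags(flags))
```

Under those assumptions, 43 (0x2b) includes SKBTX_DEV_ZEROCOPY, which would be consistent with crash 1 ending up in skb_copy_ubufs() on the receive path, and 160 (0xa0) has bit 0x80 set, which is not in the assumed list at all. Neither would be expected on a freshly received frame, so the values may simply be overwritten memory rather than a real GSO/TSO setting.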

Thank you in advance for any help or information. 

Labels (1)
0 Kudos
5 Replies

15 Views
Contributor I

Hello,

 

We have narrowed the issue down to the following commit:

https://github.com/torvalds/linux/commit/75354148ce697266b57c13d051ddffa3bb75fc9e

Without these changes, we experience no crashes.

Specifically, we see the tx_flags, gso_size, gso_segs, and gso_type fields of the skb change after the dma_sync_single_range_for_cpu() call in gfar_get_next_rxbuff(). As the kernel panics above show, these members sometimes contain what appear to be valid values and sometimes corrupted ones.

Do you have an idea about why these changes might result in the kernel panics as described above?
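
One thing that may help when comparing the dumps: assuming gso_size and gso_segs are adjacent 16-bit members of struct skb_shared_info (as in the 4.19 layout, to the best of our knowledge) and the CPU is big-endian (the oops header says BE), the two fields can be read back as a single 32-bit word. A minimal Python sketch, where as_be_word is a hypothetical helper name:

```python
import struct


def as_be_word(gso_size, gso_segs):
    """Reinterpret two adjacent 16-bit fields as one big-endian 32-bit word.

    Assumes gso_size is immediately followed by gso_segs in memory.
    """
    raw = struct.pack(">HH", gso_size, gso_segs)
    return struct.unpack(">I", raw)[0]


# (gso_size, gso_segs, gso_type) as printed in the three dumps above.
dumps = {
    "crash 1": (5, 2, 0x0),
    "crash 2": (0, 0, 0x0),
    "crash 3": (53246, 55488, 0xCFFED940),
}

for label, (size, segs, gso_type) in dumps.items():
    word = as_be_word(size, segs)
    print("%s: gso_size|gso_segs=0x%08x gso_type=0x%08x" % (label, word, gso_type))
```

For crash 3 this gives 0xcffed8c0, which sits in the same address range as gso_type=0xcffed940 and the page:cffed9c0 in the bad page report. That could be a coincidence, but it would fit pointer-sized values having been written over the start of skb_shared_info.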

0 Kudos

45 Views
NXP TechSupport

Is there a problem under U-Boot on your board?

NXP offers an LTIB Linux BSP for the MPC8349 and MPC8343.

Does the problem occur if this BSP is used on your board?

NXP Professional Services may be able to help you.

Use the following page to request testing and code changes for the new kernel version:

https://www.nxp.com/design/engineering-services/professional-engineering-services:PROFESSIONAL-ENGIN...

Have a great day,
Pavel Chubakov

 


0 Kudos

45 Views
Contributor I

Thank you for your response.

No, there is no issue in U-Boot. The kernel was upgraded from 3.14 to 4.19 with the same U-Boot; there are no issues under 3.14, only under 4.19.

0 Kudos

10 Views
NXP TechSupport

NXP Professional Services may be able to help you.

Use the following page to request testing and code changes for the new kernel version:

https://www.nxp.com/design/engineering-services/professional-engineering-services:PROFESSIONAL-ENGIN...

Have a great day,
Pavel Chubakov

0 Kudos