Linux kernel 5.4.47 crash on QorIQ LS1046A: Internal error: Oops: 96000005 [#1] SMP

javier_tia
Contributor III

After upgrading to Linux kernel 5.4 (code from the LSDK-20.12-V5.4 tag), the 5.4.47 kernel crashes intermittently. The crash did not occur with LSDK-19.09-update-311219-V4.19.

Crash details:

[ 292.977269] ------------[ cut here ]------------
[ 292.977279] percpu ref (css_release) <= 0 (-7) after switching to atomic
[ 292.977304] WARNING: CPU: 1 PID: 16 at lib/percpu-refcount.c:160 percpu_ref_switch_to_atomic_rcu+0x180/0x1a8
[ 292.977305] Modules linked in: provision(O) uiodma(O) ntb_hw_provision(O) pvnet(O) pvuart(O) gre xfrm_user xfrm_algo bnep btusb btrtl btbcm btintel veth bluetooth bridge stp ecdh_generic ecc llc uas usb_storage ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp nfsd auth_rpcgss oid_registry lockd grace sunrpc tun pvinterface(O) iptable_filter ip_tables x_tables eop ptp_qoriq hpe_panic(O) hpe_lcsc(PO) ntb_netdev(O) ntb_transport(O) ntb(O) hpe_port_shutdown(O) hpe_pmc_wdg(O) hpe_platform_id(PO) hpe_ndmd(O) hpe_fpga_spi(PO) hpe_util(PO) hpe_fpga_pcie_uio(O) hpe_fpga_i2c_ocores(O) fuse bonding
[ 292.977353] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 292.977356] pc : percpu_ref_switch_to_atomic_rcu+0x180/0x1a8
[ 292.977358] lr : percpu_ref_switch_to_atomic_rcu+0x180/0x1a8
[ 292.977360] sp : ffff800011363c60
[ 292.977361] x29: ffff800011363c60 x28: ffff800010fa9318
[ 292.977364] x27: ffff800010ef9e08 x26: ffff800011363d30
[ 292.977366] x25: ffff000ae033e038 x24: ffff800010ed88b0
[ 292.977368] x23: ffff800010ed92a0 x22: ffff800010ed8a80
[ 292.977370] x21: ffff000ae033e038 x20: 00007df4515d9140
[ 292.977372] x19: 7ffffffffffffff9 x18: 00000000fffffffd
[ 292.977374] x17: 0000000000000000 x16: 0000000000000000
[ 292.977376] x15: 0000000000000001 x14: ffffffffffffffff
[ 292.977378] x13: ffff800010fe6240 x12: ffff800010fe5e80
[ 292.977381] x11: 0101010101010101 x10: ffffff7f7f7f7f7f
[ 292.977383] x9 : 00000000fffffffe x8 : 6968637469777320
[ 292.977385] x7 : 7265746661202937 x6 : ffff800010fe551c
[ 292.977387] x5 : 000000000000003c x4 : 0000000000000000
[ 292.977389] x3 : 0000000000000000 x2 : 00000000ffffffff
[ 292.977391] x1 : ffff800b6e84c000 x0 : 000000000000003c
[ 292.977393] Call trace:
[ 292.977396] percpu_ref_switch_to_atomic_rcu+0x180/0x1a8
[ 292.977399] rcu_core+0x280/0x8c8
[ 292.977401] rcu_core_si+0x14/0x20
[ 292.977405] __do_softirq+0x140/0x33c
[ 292.977409] run_ksoftirqd+0x44/0x60
[ 292.977412] smpboot_thread_fn+0x150/0x198
[ 292.977415] kthread+0x104/0x130
[ 292.977417] ret_from_fork+0x10/0x1c
[ 292.977419] ---[ end trace 096121eef3e43aa8 ]---
[ 293.138868] device mdns left promiscuous mode
[ 293.255020] device cptun left promiscuous mode
[ 293.255063] audit: type=1700 audit(1634818897.553:94): dev=cptun prom=0 old_prom=256 auid=4294967295 uid=0 gid=0 ses=4294967295
[ 293.302291] audit: type=1325 audit(1634818897.601:95): table=filter family=3 entries=0
[ 293.324940] audit: type=1325 audit(1634818897.593:96): table=filter family=3 entries=0
[ 293.325227] audit: type=1325 audit(1634818897.625:97): table=filter family=3 entries=4
[ 293.863959] Unable to handle kernel paging request at virtual address ffff800b6e82d000
[ 293.870667] Mem abort info:
[ 293.872153] ESR = 0x96000005
[ 293.873925] EC = 0x25: DABT (current EL), IL = 32 bits
[ 293.877940] SET = 0, FnV = 0
[ 293.879684] EA = 0, S1PTW = 0
[ 293.881520] Data abort info:
[ 293.883092] ISV = 0, ISS = 0x00000005
[ 293.885622] CM = 0, WnR = 0
[ 293.887280] swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000080d9f000
[ 293.892680] [ffff800b6e82d000] pgd=0000000bff7ff003, pud=0000000000000000
[ 293.898169] Internal error: Oops: 96000005 [#1] SMP
[ 293.901735] Modules linked in: arptable_filter arp_tables uiodma(O) gre xfrm_user xfrm_algo bnep btusb btrtl btbcm btintel veth bluetooth bridge stp ecdh_generic ecc llc uas usb_storage ip6table_filter ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp nfsd auth_rpcgss oid_registry lockd grace sunrpc tun iptable_filter ip_tables x_tables eop ptp_qoriq ntb_netdev(O) ntb_transport(O) ntb(O) fuse bonding
[ 293.956062] CPU: 0 PID: 1 Comm: systemd Kdump: loaded Tainted: P W O 5.4.47-yocto-standard #1
[ 293.964405] Hardware name: 6400 (DT)
[ 293.966667] pstate: 40000005 (nZcv daif -PAN -UAO)
[ 293.970152] pc : cgroup_sk_free+0x6c/0xb8
[ 293.972851] lr : __sk_destruct+0x120/0x1b0
[ 293.975634] sp : ffff80001003bbf0
[ 293.977635] x29: ffff80001003bbf0 x28: ffff000b7f200000
[ 293.981635] x27: 0000000000000000 x26: 0000000000000000
[ 293.985634] x25: 0000000056000000 x24: 0000000000000000
[ 293.989635] x23: ffff000ab5c34080 x22: ffff000ade0f36e0
[ 293.993635] x21: 0000000000000000 x20: ffff000ade0fb400
[ 293.997635] x19: ffff000ade0fb690 x18: 0000000000000000
[ 294.001635] x17: 0000000000000000 x16: 0000000000000000
[ 294.005634] x15: 0000000000000000 x14: 0000000000000000
[ 294.009634] x13: 0000000000000000 x12: 0000000000000000
[ 294.013633] x11: 0000000000000000 x10: 0000000000000000
[ 294.017632] x9 : 0000000000000000 x8 : 0000000000000001
[ 294.021632] x7 : 76dfab7ed96db635 x6 : 0000000000000000
[ 294.025632] x5 : ffff80001003bc48 x4 : 000000000000027b
[ 294.029631] x3 : 000000000000027a x2 : ffffffffffffffff
[ 294.033630] x1 : ffff800b6e82d000 x0 : ffff800b6e82d000
[ 294.037630] Call trace:
[ 294.038766] cgroup_sk_free+0x6c/0xb8
[ 294.041117] __sk_destruct+0x120/0x1b0
[ 294.043554] sk_destruct+0x54/0x70
[ 294.045642] __sk_free+0x40/0xd8
[ 294.047557] sk_free+0x3c/0x48
[ 294.049301] unix_release_sock+0x244/0x318
[ 294.052085] unix_release+0x28/0x40
[ 294.054262] __sock_release+0x4c/0xc8
[ 294.056612] sock_close+0x24/0x38
[ 294.058615] __fput+0x90/0x218
[ 294.060357] ____fput+0x20/0x30
[ 294.062186] task_work_run+0xd0/0x100
[ 294.064536] do_notify_resume+0x2d8/0x318
[ 294.067234] work_pending+0x8/0x10
[ 294.069324] Code: 54000161 92800002 d538d081 8b010000 (c85f7c04)
[ 294.074106] ---[ end trace 096121eef3e43aa9 ]---
[ 294.077411] Kernel panic - not syncing: Fatal exception
[ 294.081323] SMP: stopping secondary CPUs
[ 294.083935] Kernel Offset: disabled
[ 294.086111] CPU features: 0x0002,20002000
[ 294.088807] Memory Limit: none
[ 294.115126] Starting crashdump kernel...
[ 294.117737] Bye!
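
For anyone reading the "Mem abort info" part of the log, here is a minimal sketch of how the ESR value 0x96000005 breaks down into the fields the kernel printed. The function names (decode_esr, describe_dfsc) are made up for illustration, and the bit positions are my reading of the Arm ESR_EL1 layout for data aborts, so treat it as a cross-check against the log rather than a reference decoder.

```python
def decode_esr(esr: int) -> dict:
    """Split an AArch64 ESR_EL1 value into data-abort fields.

    Assumed layout: EC = bits [31:26], IL = bit 25, ISS = bits [24:0].
    The ISS sub-fields below only make sense for data aborts (EC 0x24/0x25).
    """
    iss = esr & 0x1FFFFFF
    return {
        "EC": (esr >> 26) & 0x3F,    # exception class
        "IL": (esr >> 25) & 0x1,     # 1 = 32-bit instruction
        "ISS": iss,
        "ISV": (iss >> 24) & 0x1,
        "SET": (iss >> 11) & 0x3,
        "FnV": (iss >> 10) & 0x1,
        "EA": (iss >> 9) & 0x1,
        "CM": (iss >> 8) & 0x1,
        "S1PTW": (iss >> 7) & 0x1,
        "WnR": (iss >> 6) & 0x1,     # 0 = read, 1 = write
        "DFSC": iss & 0x3F,          # data fault status code
    }


def describe_dfsc(dfsc: int) -> str:
    """Name only the translation-fault codes (0x04..0x07); others are left raw."""
    if 0x04 <= dfsc <= 0x07:
        return f"translation fault, level {dfsc & 0x3}"
    return f"DFSC 0x{dfsc:02x} (not decoded in this sketch)"


if __name__ == "__main__":
    fields = decode_esr(0x96000005)
    print({name: hex(value) for name, value in fields.items()})
    print(describe_dfsc(fields["DFSC"]))
```

With the value from this crash, the sketch gives EC = 0x25, IL = 1, ISS = 0x5 and WnR = 0, the same as the log. DFSC 0x05 is a level 1 translation fault on a read, which fits the "pud=0000000000000000" line: the address in x0/x1 (ffff800b6e82d000) simply has no mapping.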

 

yipingwang
NXP TechSupport

Please download the LSDK2012 boot partition tarball with the following command.

wget https://www.nxp.com/lgfiles/sdk/lsdk2012/bootpartition_LS_arm64_lts_5.4.tgz

Then extract "Image" from this tarball and boot it on your target board to check whether the kernel crash persists.
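
If it helps, the extraction step can be scripted. This is only a sketch: the helper name extract_kernel_image is hypothetical, and because the directory layout inside the tarball is not stated here, it searches for the first regular file whose base name is "Image" instead of assuming a path.

```python
import tarfile
from pathlib import Path


def extract_kernel_image(tarball, dest_dir, name="Image"):
    """Copy the first regular file named `name` out of `tarball` into `dest_dir`.

    The member is located by base name because the archive layout is an unknown.
    Returns the path of the written file.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball, "r:*") as tar:   # "r:*" auto-detects compression
        for member in tar:
            if member.isfile() and Path(member.name).name == name:
                out_path = dest / name
                out_path.write_bytes(tar.extractfile(member).read())
                return out_path
    raise FileNotFoundError(f"no regular file named {name!r} in {tarball}")


if __name__ == "__main__":
    # File name taken from the wget command above; output directory is arbitrary.
    print(extract_kernel_image("bootpartition_LS_arm64_lts_5.4.tgz", "extracted"))
```

Writing only the single file, rather than unpacking the whole archive, keeps the rest of the boot partition contents off the host.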

javier_tia
Contributor III

@yipingwang The crash is the same with that image, but there is a solution. Applying this patch fixes it:

https://github.com/torvalds/linux/commit/ad0f75e5f57ccbceec13274e1e242f2b5a6397ed
