IMX8MP: Yocto hardknott hang on poweroff/reboot


TerryBarnaby1
Contributor IV

We are working on an embedded video processing instrument system that is almost complete.

We are using the NXP hardknott release, which we built ourselves, with the kernel 5.10.52-lts-5.10.y-ds200i-00007-ge17e4a8d84dd.

Generally everything works apart from one issue. When the user powers off the system (software calls "poweroff"), the system intermittently hangs during poweroff, leaving it powered on with a cursor at the top left of the LCD screen. It may power off 50 times in a row without a problem and then fail on every poweroff/reboot for a time. It fails in the same intermittent way if we boot only to a simple LCD console with none of our programs running (no Wayland/Weston etc.) and then run the "reboot" command. To try to catch the issue, we have a systemd service that calls "reboot" 10 seconds after boot.

We have been trying to debug this issue but it is proving elusive. We normally have "quiet" on the kernel command line and see no obvious error messages on the serial console. With "quiet" removed, the system rarely seems to have a problem. Changing things "appears" to affect the problem. When we do have some kernel output on the UART2 serial console, we normally don't see any error messages; however, we have occasionally seen a fragment of serial output like "[    " at the point of the system hang.

A few times we have seen kernel SError panic messages like the one shown below, mentioning imx_uart_readl. We wonder whether there might be an issue in the IMX serial driver?

Has anyone seen this, or have any ideas on what may be causing it or how to debug it?

Terry

 

[ 6.67?[ 6.676472] SError Interrupt on CPU1, code 0xbf000002 -- SError
[ 6.676476] CPU: 1 PID: 1 Comm: systemd Tainted: G C O 5.10.52-lts-5.10.y-ds200i-00007-ge17e4a8d84dd #34
[ 6.676478] Hardware name: Beam IMX8MPlus DS200i board (DT)
[ 6.676479] pstate: 20000085 (nzCv daIf -PAN -UAO -TCO BTYPE=--)
[ 6.676481] pc : imx_uart_readl+0x90/0xb4
[ 6.676482] lr : imx_uart_console_putchar+0x28/0x50
[ 6.676484] sp : ffff800011d6b940
[ 6.676485] x29: ffff800011d6b940 x28: ffff800011cb8f10
[ 6.676491] x27: ffff800011c0e8b0 x26: 0000000000000000
[ 6.676494] x25: 000000000000038c x24: 000000000000005e
[ 6.676497] x23: ffff800011cb8f10 x22: ffff800011cb8f6e
[ 6.676501] x21: ffff00000417f880 x20: ffff80001071a9b0
[ 6.676504] x19: ffff800011cb8f3c x18: 0000000000000030
[ 6.676507] x17: 0000000000000000 x16: 0000000000000000
[ 6.676511] x15: ffff000004088478 x14: 72617453203a5d31
[ 6.676514] x13: 5b646d6574737973 x12: 206e692064656873
[ 6.676517] x11: 696e696620707574 x10: 20296c656e72656b
[ 6.676520] x9 : 2820733534302e31 x8 : 7073726573752820
[ 6.676523] x7 : 733732362e35202b x6 : ffff800011cb8f1f
[ 6.676527] x5 : 00000000fffffffe x4 : 0000000000000020
[ 6.676530] x3 : ffff00000417f880 x2 : ffff00000417f880
[ 6.676534] x1 : ffff8000125b00b4 x0 : 0000000000000060
[ 6.676537] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 6.676540] CPU: 1 PID: 1 Comm: systemd Tainted: G C O 5.10.52-lts-5.10.y-ds200i-00007-ge17e4a8d84dd #34
[ 6.676542] Hardware name: Beam IMX8MPlus DS200i board (DT)
[ 6.676543] Call trace:
[ 6.676544] dump_backtrace+0x0/0x1a0
[ 6.676545] show_stack+0x18/0x70
[ 6.676547] dump_stack+0xd0/0x12c
[ 6.676548] panic+0x16c/0x334
[ 6.676549] nmi_panic+0x8c/0x90
[ 6.676551] arm64_serror_panic+0x78/0x84
[ 6.676552] do_serror+0x64/0x6c
[ 6.676553] el1_error+0x90/0x110
[ 6.676554] imx_uart_readl+0x90/0xb4
[ 6.676556] uart_console_write+0x50/0x70
[ 6.676558] imx_uart_console_write+0xf4/0x1cc
[ 6.676559] console_unlock+0x36c/0x460
[ 6.676561] vprintk_emit+0x134/0x260
[ 6.676562] devkmsg_emit.constprop.0+0x68/0x8c
[ 6.676563] devkmsg_write+0x14c/0x174
[ 6.676565] do_iter_readv_writev+0xf8/0x194
[ 6.676566] do_iter_write+0x90/0x1f0
[ 6.676568] vfs_writev+0xb0/0x17c
[ 6.676569] do_writev+0x74/0x130
[ 6.676570] __arm64_sys_writev+0x20/0x30
[ 6.676572] el0_svc_common.constprop.0+0x78/0x1a0
[ 6.676573] do_el0_svc+0x24/0x90
[ 6.676575] el0_svc+0x14/0x20
[ 6.676576] el0_sync_handler+0x1a4/0x1b0
[ 6.676577] el0_sync+0x180/0x1c0
[ 6.676599] SMP: stopping secondary CPUs
[ 6.676601] Kernel Offset: disabled
[ 6.676602] CPU features: 0x0240002,2000200c
[ 6.676603] Memory Limit: none
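
One detail in the dump that may help with debugging: registers x7 to x14 look like ASCII text rather than addresses. The following is a minimal Python sketch (not part of our system; the helper name decode_reg is made up for illustration) that treats each 64-bit value, copied from the dump above, as 8 little-endian bytes. The fragments it prints appear to belong to systemd's "Startup finished in ..." message, which would suggest the SError was raised while that line was being written to the serial console, though that is only an inference from the register contents.

```python
# Register values copied verbatim from the SError dump above.
REGS = {
    "x14": 0x72617453203A5D31,
    "x13": 0x5B646D6574737973,
    "x12": 0x206E692064656873,
    "x11": 0x696E696620707574,
    "x10": 0x20296C656E72656B,
    "x9": 0x2820733534302E31,
    "x8": 0x7073726573752820,
    "x7": 0x733732362E35202B,
}


def decode_reg(value: int) -> str:
    """Interpret a 64-bit register value as 8 little-endian ASCII bytes.

    Non-printable bytes are shown as '.' so the output stays readable.
    """
    raw = value.to_bytes(8, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    for name, value in REGS.items():
        print(f"{name:>3}: {decode_reg(value)!r}")
```

The registers hold separate 8-byte chunks, so the fragments are not necessarily in message order and have to be pieced together by eye.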

Generally when it has failed

TerryBarnaby1
Contributor IV

Bump: Any ideas on how we can try to debug this kernel panic caused by an SError?
