AnsweredAssumed Answered

RCU stall/hang on imx6q using 4.9.88_2.0.0_ga

Question asked by Todd Blanchard on Jun 19, 2019
Latest reply on Jun 20, 2019 by igorpadykov

We are seeing a periodic rcu stall/hang, where sometimes we can get a backtrace. Most often, the system just hangs. Please advise.

 

RCU-relevant config options:

(CONFIG_PREEMPT_RCU=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_COUNT=y

# CONFIG_RCU_EXPERT is not set
CONFIG_SRCU=y
# CONFIG_TASKS_RCU is not set
CONFIG_RCU_STALL_COMMON=y
# CONFIG_TREE_RCU_TRACE is not set
# CONFIG_RCU_EXPEDITE_BOOT is not set
# RCU Debugging
# CONFIG_PROVE_RCU is not set
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_RCU_PERF_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set)

 

Backtrace 1

(function calls)

0x8010bd48 (+0x18) show_stack
0x80149f3c (+0xb8) sched_show_task
0x80175d1c (+0x998) rcu_check_callbacks
0x80179148 (+0x3c) update_process_times
0x8018a668 (+0x50) tick_sched_handle
0x8018a6bc (+0x50) tick_sched_timer
0x8017a1b0 (+0xc8) __hrtimer_run_queues
0x8017a3e8 (+0xac) hrtimer_interrupt
0x80188904 (+0x34) tick_receive_broadcast
0x8010e15c (+0xf0) handle_IPI
0x801014a0 (+0x90) gic_handle_irq
0x8010c74c (+0x6c) __irq_svc
0x804a51d8 (+0x1c) cpuidle_enter
0x8015e684 (+0x28) call_cpuidle
0x8015e8cc (+0x140) cpu_startup_entry
0x80651c48 (+0x8c) rest_init
0x80800c54 (+0x350) start_kernel

 

(File/line)

/usr/src/kernel/arch/arm/kernel/traps.c:247
/usr/src/kernel/kernel/sched/core.c:5216
/usr/src/kernel/kernel/rcu/tree.c:1403
/usr/src/kernel/arch/arm/include/asm/thread_info.h:94
/usr/src/kernel/kernel/time/tick-sched.c:153
/usr/src/kernel/kernel/time/tick-sched.c:1164
/usr/src/kernel/kernel/time/hrtimer.c:1255
/usr/src/kernel/kernel/time/hrtimer.c:1356
/usr/src/kernel/kernel/time/tick-broadcast.c:252
/usr/src/kernel/arch/arm/kernel/smp.c:612
/usr/src/kernel/drivers/irqchip/irq-gic.c:382
/usr/src/kernel/arch/arm/kernel/entry-armv.S:222
/usr/src/kernel/drivers/cpuidle/cpuidle.c:270
/usr/src/kernel/kernel/sched/idle.c:120
/usr/src/kernel/kernel/sched/idle.c:185
/usr/src/kernel/init/main.c:410
/usr/src/kernel/init/main.c:665

 

 

Backtrace 2:

(function calls)

0x80653a48 (+0x58) schedule
0x80656a84 (+0x158) schedule_timeout
0x80174e68 (+0x518) rcu_gp_kthread
0x8013fc84 (+0x110) kthread
0x80107d70 (+0x14) ret_from_fork

 

(File/line)

/usr/src/kernel/arch/arm/include/asm/thread_info.h:94 (discriminator 1)
/usr/src/kernel/arch/arm/include/asm/thread_info.h:94
/usr/src/kernel/kernel/rcu/tree.c:2227 (discriminator 13)
/usr/src/kernel/kernel/kthread.c:211
/usr/src/kernel/arch/arm/kernel/entry-common.S:119

 

Here is the actual log:

 

May 31 18:13:06 kernel: INFO: rcu_preempt detected stalls on CPUs/tasks:
May 31 18:13:06 kernel: (detected by 0, t=21002 jiffies, g=149858, c=149857, q=596)
May 31 18:13:06 kernel: All QSes seen, last rcu_preempt kthread activity 21002 (677469-656467), jiffies_till_next_fqs=3, root ->qsmask 0x0
May 31 18:13:06 kernel: swapper/0 R running task 0 0 0 0x00000000
May 31 18:13:06 kernel: Backtrace:
May 31 18:13:06 kernel: Function entered at [<8010ba88>] from [<8010bd48>]
May 31 18:13:06 kernel: r7:4e85a000 r6:8084a540 r5:00000000 r4:80907e00
May 31 18:13:06 kernel: Function entered at [<8010bd30>] from [<80149f3c>]
May 31 18:13:06 kernel: Function entered at [<80149e84>] from [<80175d1c>]
May 31 18:13:06 kernel: r5:8090f100 r4:cf0a4540
May 31 18:13:06 kernel: Function entered at [<80175384>] from [<80179148>]
May 31 18:13:06 kernel: r10:00000000 r9:8018a66c r8:cf0a2780 r7:000000e3 r6:00000000 r5:80907e00
May 31 18:13:06 kernel: r4:ffffe000
May 31 18:13:06 kernel: Function entered at [<8017910c>] from [<8018a668>]
May 31 18:13:06 kernel: r7:000000e3 r6:95b16cfa r5:80901ed0 r4:cf0a2998
May 31 18:13:06 kernel: Function entered at [<8018a618>] from [<8018a6bc>]
May 31 18:13:06 kernel: Function entered at [<8018a66c>] from [<8017a1b0>]
May 31 18:13:06 kernel: r7:000000e3 r6:95b16a60 r5:cf0a2998 r4:cf0a2740
May 31 18:13:06 kernel: Function entered at [<8017a0e8>] from [<8017a3e8>]
May 31 18:13:06 kernel: r10:cf0a27b8 r9:cf0a27d8 r8:cf0a2754 r7:cf0a27f8 r6:ffffffff r5:00000003
May 31 18:13:06 kernel: r4:cf0a2740
May 31 18:13:06 kernel: Function entered at [<8017a33c>] from [<80188904>]
May 31 18:13:06 kernel: r10:00000000 r9:f4a01100 r8:80901ed0 r7:f4a00100 r6:00000000 r5:00000000
May 31 18:13:06 kernel: r4:80848f68
May 31 18:13:06 kernel: Function entered at [<801888d0>] from [<8010e15c>]
May 31 18:13:06 kernel: Function entered at [<8010e06c>] from [<801014a0>]
May 31 18:13:06 kernel: r6:f4a0010c r5:80919000 r4:809043e0
May 31 18:13:06 kernel: Function entered at [<80101410>] from [<8010c74c>]
May 31 18:13:06 kernel: Exception stack(0x80901ed0 to 0x80901f18)
May 31 18:13:06 kernel: 1ec0: 00000000 8099ad30 00000001 80900000
May 31 18:13:06 kernel: 1ee0: 95c85feb 000000e3 cf0a3140 00000001 95b955a8 000000e3 00000000 80901f54
May 31 18:13:06 kernel: 1f00: 80901ee0 80901f20 80188c54 804a5030 200c0013 ffffffff
May 31 18:13:06 kernel: r9:80900000 r8:95b955a8 r7:80901f04 r6:ffffffff r5:200c0013 r4:804a5030
May 31 18:13:06 kernel: Function entered at [<804a4ed8>] from [<804a51d8>]
May 31 18:13:06 kernel: r10:80904144 r9:80909b40 r8:cf0a3140 r7:8090413c r6:00000001 r5:809040ec
May 31 18:13:06 kernel: r4:ffffe000
May 31 18:13:06 kernel: Function entered at [<804a51bc>] from [<8015e684>]
May 31 18:13:06 kernel: Function entered at [<8015e65c>] from [<8015e8cc>]
May 31 18:13:06 kernel: Function entered at [<8015e78c>] from [<80651c48>]
May 31 18:13:06 kernel: r7:ffffffff
May 31 18:13:06 kernel: Function entered at [<80651bbc>] from [<80800c54>]
May 31 18:13:06 kernel: r5:80954000 r4:80954050
May 31 18:13:06 kernel: Function entered at [<80800904>] from [<1000807c>]
May 31 18:13:06 kernel: rcu_preempt kthread starved for 21002 jiffies! g149858 c149857 f0x2 RCU_GP_WAIT_FQS(3) ->state=0x100
May 31 18:13:06 kernel: rcu_preempt W 0 7 2 0x00000000
May 31 18:13:06 kernel: Backtrace:
May 31 18:13:06 kernel: Function entered at [<80653574>] from [<80653a48>]
May 31 18:13:06 kernel: r10:8090f29e r9:00000001 r8:80902d00 r7:cf0ae440 r6:cf0ae440 r5:cc0a1ed0
May 31 18:13:06 kernel: r4:ffffe000
May 31 18:13:06 kernel: Function entered at [<806539f0>] from [<80656a84>]
May 31 18:13:06 kernel: r5:cc0a1ed0 r4:000a0456
May 31 18:13:06 kernel: Function entered at [<8065692c>] from [<80174e68>]
May 31 18:13:06 kernel: r8:8090f290 r7:80902d00 r6:8090f29c r5:8090f100 r4:00000000
May 31 18:13:06 kernel: Function entered at [<80174950>] from [<8013fc84>]
May 31 18:13:06 kernel: r7:8090f100
May 31 18:13:06 kernel: Function entered at [<8013fb74>] from [<80107d70>]
May 31 18:13:06 kernel: r8:00000000 r7:00000000 r6:00000000 r5:8013fb74 r4:cc0402c0

Outcomes