DPAA1.x deadlock with TIPC working

SayHello
Contributor I

Hi, I am using a B4860 and trying to run TIPC applications (TIPC is a protocol carried directly over Ethernet frames; it is included in Linux under ./net/tipc).
The uImage is configured with full RT. FMAN is configured to distribute frames by priority: udp/tcp/ipv4/eth. That all works, but TIPC Ethernet frames are all distributed to one core, because they have a constant type field and so the hash is always constant.
TIPC binds to an Ethernet device that is managed by the DPAA driver. However, dpa_poll runs both in softirq context and in hardware IRQ context, such as the QMAN portal IRQ. When it polls a frame, it passes the frame to the kernel through the netif receive path, which invokes L2 frame consumers such as TIPC. When I stress test with a huge number of packets, it seems that the QMAN portal HW IRQ interrupts ktimersoftirq and runs the same polling chain in HW IRQ context. If I am lucky enough, I hit a deadlock in TIPC at spin_trylock_bh(socket).


I made one workaround: in linux-qoriq (SDK 2.0 b1703), I pass the IRQF_NO_SOFTIRQ_CALL flag when requesting the QMAN portal IRQ:

if (request_irq(config->public_cfg.irq, portal_isr, IRQF_NO_SOFTIRQ_CALL,
                portal->irqname, portal)) {
        pr_err("request_irq() failed\n");
        goto fail_irq;
}
It helps, but I am wondering whether this is good practice and whether it causes a performance regression.
Typical kernel output when the deadlock occurs:
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-7):
(detected by 2, t=84007 jiffies, g=839, c=838, q=4071)
All QSes seen, last rcu_preempt kthread activity 3 (4294784707-4294784704), jiffies_till_next_fqs=3, roo 0 521 2 0x00000800
Call Trace:
[c0000000788727c0] [c00000000006edc4] .sched_show_task+0xe4/0x190 (unreliable)
[c000000078872840] [c0000000000a0218] .rcu_check_callbacks+0xb68/0xb70
[c000000078872990] [c0000000000a333c] .update_process_tim8872a10] [c0000000000b8b3c] .tick_sched_handle.isra.17+0x3c/0x50
[c000000078872a80] [c0000000000b8bbc] .tick_sched_timer+0x6c/0xe0
[c000000078872b20] [c0000000000a4818] .__run_hrtimer.isra.33+0xa8/0x210
[c000000078872bb0] [c0000000000a52b4] .hrtimer_inte__timer_interrupt+0xb0/0x1a0
[c000000078872d60] [c0000000000105d8] .timer_interrupt+0x138/0x190
--- interrupt: 901 at 0x22ad2e84
LR = .tipc_sk_lookup+0x74/0x1f0
[c000000078872e00] [c00000000001a054] exc_0x900_common+0x104/0x108 (unreliable)
--- intspin_trylock_bh+0xc/0x70
LR = .tipc_sk_rcv+0x13c/0x600
[c0000000788731e0] [c00000000088b84c] .tipc_node_unlock+0xbc/0x280
[c000000078873290] [c000000000880b9c] .tipc_rcv+0x31c/0x9d0
[c000000078873410] [c00000000087ae58] .tipc_l2_rcv_msg+0x58/0xb0
[_skb_core+0x654/0xa00
[c000000078873590] [c000000000726948] .netif_receive_skb_internal+0x48/0x110
[c000000078873630] [c00000000059ffc0] ._dpa_rx+0x1c0/0x710
[c000000078873760] [c00000000059d480] .priv_rx_default_dqrr+0xb0/0x1d0
[c000000078873810] [c000_p_poll_dqrr+0x1c4/0x300
[c0000000788738e0] [c00000000059d7ec] .dpaa_eth_poll+0x2c/0x80
[c000000078873970] [c000000000726fac] .net_rx_action+0x25c/0x380
[c000000078873a70] [c0000000000422a0] .do_current_softirqs+0x2f0/0x3e0
[c000000078873b60] [c00000000h_enable+0xa4/0xb0
[c000000078873bd0] [c00000000009162c] .irq_forced_thread_fn+0x6c/0xc0
[c000000078873c60] [c000000000091900] .irq_thread+0x170/0x250
[c000000078873d30] [c0000000000615d0] .kthread+0xf0/0x110
[c000000078873e30] [c000000000000998] .ret_fx58/0xc0

 

1 Reply

SayHello
Contributor I
Please move this thread to QorIQ.