Hi,
I am currently facing an Ethernet issue on the i.MX6 Solo (kernel version 4.1.15). When I bring down the Ethernet interface, the following error trace occurs. After that, if I bring the interface up, it comes online but does not connect to the network. This issue occurs randomly.
After further debugging, I found that the problem is related to interrupt handling. The phy_interrupt handler (in phy.c) checks the PHY state and returns IRQ_NONE if phydev->state is PHY_HALTED. In the failure scenario, bringing down the Ethernet interface causes phy_stop to set phydev->state = PHY_HALTED. The PHY interrupt fires after that, so the handler hits this check and returns IRQ_NONE without servicing it. The interrupt then fires thousands of times within seconds. The kernel's spurious-interrupt logic disables the IRQ line once more than 99,900 of the last 100,000 interrupts have gone unhandled, which is what the trace below shows.
[Tue Sep 24 19:21:14 2024] irq 61: nobody cared (try booting with the "irqpoll" option)
[Tue Sep 24 19:21:14 2024] CPU: 0 PID: 1168 Comm: Servicepack Tainted: G O 4.1.15 #1
[Tue Sep 24 19:21:14 2024] Hardware name: Freescale i.MX6 Quad/DualLite (Device Tree)
[Tue Sep 24 19:21:14 2024] [<80017988>] (unwind_backtrace) from [<800135d8>] (show_stack+0x10/0x14)
[Tue Sep 24 19:21:14 2024] [<800135d8>] (show_stack) from [<807c00a8>] (dump_stack+0x84/0xc4)
[Tue Sep 24 19:21:14 2024] [<807c00a8>] (dump_stack) from [<80070b10>] (__report_bad_irq+0x28/0xc4)
[Tue Sep 24 19:21:14 2024] [<80070b10>] (__report_bad_irq) from [<80070f14>] (note_interrupt+0x29c/0x2ec)
[Tue Sep 24 19:21:14 2024] [<80070f14>] (note_interrupt) from [<8006e8e0>] (handle_irq_event_percpu+0xcc/0x134)
[Tue Sep 24 19:21:14 2024] [<8006e8e0>] (handle_irq_event_percpu) from [<8006e984>] (handle_irq_event+0x3c/0x5c)
[Tue Sep 24 19:21:14 2024] [<8006e984>] (handle_irq_event) from [<80071548>] (handle_level_irq+0xc4/0x13c)
[Tue Sep 24 19:21:14 2024] [<80071548>] (handle_level_irq) from [<8006df2c>] (generic_handle_irq+0x2c/0x3c)
[Tue Sep 24 19:21:14 2024] [<8006df2c>] (generic_handle_irq) from [<80311674>] (mxc_gpio_irq_handler+0x3c/0x164)
[Tue Sep 24 19:21:14 2024] [<80311674>] (mxc_gpio_irq_handler) from [<8031181c>] (mx3_gpio_irq_handler+0x80/0xcc)
[Tue Sep 24 19:21:14 2024] [<8031181c>] (mx3_gpio_irq_handler) from [<8006df2c>] (generic_handle_irq+0x2c/0x3c)
[Tue Sep 24 19:21:14 2024] [<8006df2c>] (generic_handle_irq) from [<8006e1e4>] (__handle_domain_irq+0x7c/0xec)
[Tue Sep 24 19:21:14 2024] [<8006e1e4>] (__handle_domain_irq) from [<8000944c>] (gic_handle_irq+0x24/0x5c)
[Tue Sep 24 19:21:14 2024] [<8000944c>] (gic_handle_irq) from [<800140c0>] (__irq_svc+0x40/0x74)
[Tue Sep 24 19:21:14 2024] Exception stack(0xa8a69c38 to 0xa8a69c80)
[Tue Sep 24 19:21:14 2024] 9c20: 00000000 809bf930
[Tue Sep 24 19:21:14 2024] 9c40: 80d413c0 00000000 00000002 00000000 00000010 a8a68000 00000001 80cc6080
[Tue Sep 24 19:21:14 2024] 9c60: a8008000 000002dd 0000001c a8a69c80 80039098 800390a8 20070113 ffffffff
[Tue Sep 24 19:21:14 2024] [<800140c0>] (__irq_svc) from [<800390a8>] (__do_softirq+0xa4/0x238)
[Tue Sep 24 19:21:14 2024] [<800390a8>] (__do_softirq) from [<80039504>] (irq_exit+0xc0/0xfc)
[Tue Sep 24 19:21:14 2024] [<80039504>] (irq_exit) from [<8006e1e8>] (__handle_domain_irq+0x80/0xec)
[Tue Sep 24 19:21:14 2024] [<8006e1e8>] (__handle_domain_irq) from [<8000944c>] (gic_handle_irq+0x24/0x5c)
[Tue Sep 24 19:21:14 2024] [<8000944c>] (gic_handle_irq) from [<800140c0>] (__irq_svc+0x40/0x74)
[Tue Sep 24 19:21:14 2024] Exception stack(0xa8a69d08 to 0xa8a69d50)
[Tue Sep 24 19:21:14 2024] 9d00: 00000001 80070093 00000001 20070013 80d4f6e4 00000003
[Tue Sep 24 19:21:14 2024] 9d20: 00000040 80d81d68 00000000 00000006 00000000 000002dd 00000001 a8a69d50
[Tue Sep 24 19:21:14 2024] 9d40: 807c5210 8006cbac 60070013 ffffffff
[Tue Sep 24 19:21:14 2024] [<800140c0>] (__irq_svc) from [<8006cbac>] (console_unlock+0x334/0x4e8)
[Tue Sep 24 19:21:14 2024] [<8006cbac>] (console_unlock) from [<8006d094>] (vprintk_emit+0x334/0x588)
[Tue Sep 24 19:21:14 2024] [<8006d094>] (vprintk_emit) from [<8006d408>] (vprintk_default+0x20/0x28)
[Tue Sep 24 19:21:14 2024] [<8006d408>] (vprintk_default) from [<807be07c>] (printk+0x74/0x84)
[Tue Sep 24 19:21:14 2024] [<807be07c>] (printk) from [<8047bfb4>] (phy_disable_interrupts+0x1c/0x68)
[Tue Sep 24 19:21:14 2024] [<8047bfb4>] (phy_disable_interrupts) from [<8047c00c>] (phy_stop_interrupts+0xc/0x98)
[Tue Sep 24 19:21:14 2024] [<8047c00c>] (phy_stop_interrupts) from [<8047da98>] (phy_disconnect+0x18/0x34)
[Tue Sep 24 19:21:14 2024] [<8047da98>] (phy_disconnect) from [<8048aaf0>] (fec_enet_close+0x2c/0x128)
[Tue Sep 24 19:21:14 2024] [<8048aaf0>] (fec_enet_close) from [<80674a60>] (__dev_close_many+0x88/0xd0)
[Tue Sep 24 19:21:14 2024] [<80674a60>] (__dev_close_many) from [<80674bb8>] (__dev_close+0x24/0x38)
[Tue Sep 24 19:21:14 2024] [<80674bb8>] (__dev_close) from [<8067c228>] (__dev_change_flags+0x94/0x144)
[Tue Sep 24 19:21:14 2024] [<8067c228>] (__dev_change_flags) from [<8067c2f0>] (dev_change_flags+0x18/0x48)
[Tue Sep 24 19:21:14 2024] [<8067c2f0>] (dev_change_flags) from [<806deda0>] (devinet_ioctl+0x664/0x738)
[Tue Sep 24 19:21:14 2024] [<806deda0>] (devinet_ioctl) from [<80662e64>] (sock_ioctl+0x1bc/0x290)
[Tue Sep 24 19:21:14 2024] [<80662e64>] (sock_ioctl) from [<800fbf04>] (do_vfs_ioctl+0x3e8/0x608)
[Tue Sep 24 19:21:14 2024] [<800fbf04>] (do_vfs_ioctl) from [<800fc158>] (SyS_ioctl+0x34/0x5c)
[Tue Sep 24 19:21:14 2024] [<800fc158>] (SyS_ioctl) from [<80010100>] (ret_fast_syscall+0x0/0x3c)
[Tue Sep 24 19:21:14 2024] handlers:
[Tue Sep 24 19:21:14 2024] [<8047c288>] phy_interrupt
[Tue Sep 24 19:21:14 2024] Disabling IRQ #61
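To make the sequence that ends in "Disabling IRQ #61" concrete, here is a minimal user-space sketch of the counting described above. It is not the kernel code: every name ending in _sim is hypothetical, the handler is reduced to the PHY_HALTED check alone, and the real note_interrupt also resets the unhandled counter when interrupts arrive far apart in time, which this model leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's irqreturn_t and phy state values. */
enum irq_result_sim { IRQ_NONE_SIM, IRQ_HANDLED_SIM };
enum phy_state_sim { PHY_RUNNING_SIM, PHY_HALTED_SIM };

/* Simplified per-IRQ bookkeeping (assumption: no time-based reset). */
struct irq_model_sim {
    unsigned int irq_count;      /* interrupts seen in the current window */
    unsigned int irqs_unhandled; /* of those, how many returned IRQ_NONE */
    bool disabled;               /* set once the line is judged spurious */
};

/* Models only the state check: a halted PHY makes the handler bail out. */
static enum irq_result_sim phy_interrupt_sim(enum phy_state_sim state)
{
    return state == PHY_HALTED_SIM ? IRQ_NONE_SIM : IRQ_HANDLED_SIM;
}

/* Every 100,000 interrupts, disable the line if more than 99,900 of them
 * went unhandled; otherwise start a fresh window. */
static void note_interrupt_sim(struct irq_model_sim *m, enum irq_result_sim r)
{
    assert(m != NULL);
    if (r == IRQ_NONE_SIM)
        m->irqs_unhandled++;
    m->irq_count++;
    if (m->irq_count < 100000)
        return;
    m->irq_count = 0;
    if (m->irqs_unhandled > 99900)
        m->disabled = true;
    m->irqs_unhandled = 0;
}

/* Re-fires the interrupt, as a level-triggered line does while it stays
 * asserted. Returns how many interrupts it took to get the line disabled,
 * or 0 if it was still enabled after max_irqs. */
static unsigned int fire_until_disabled_sim(enum phy_state_sim state,
                                            unsigned int max_irqs)
{
    struct irq_model_sim m = {0};
    unsigned int n;

    for (n = 0; n < max_irqs && !m.disabled; n++)
        note_interrupt_sim(&m, phy_interrupt_sim(state));
    return m.disabled ? n : 0;
}
```

In this model, a line that keeps firing while the PHY is halted is disabled at the end of the first 100,000-interrupt window, whereas a line whose handler services the interrupt is never disabled. That matches the observation that the storm only starts once phy_stop has run.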
I am unsure of the root cause of this issue. Is there any way to resolve it?
Thanks
Hello,
The interrupt may need to be disabled; otherwise the kernel detects it as a spurious interrupt. However, there is no documentation in the BSP on disabling this part, since it is essential for operation. You may try the suggestions on other pages, such as:
https://github.com/scylladb/scylladb/issues/1852
Regards
Hi
The given link does not provide any solution. I have analysed the issue further. Below are the findings.
1. The Interrupt Enable register of the AR8035 PHY chip is set to 0 and remains 0 while the interface is down. The interrupt still occurs even though it is disabled in the PHY.
2. disable_irq is not called during the flow.
3. There are no pending interrupts. Confirmed using a DSO.
4. Tried disabling Wake-on-LAN. The issue still persists.
5. Adding more debug prints lengthens the sequence and leaves more time for the interrupt to occur.