Hello,
I am working on a custom board based on the i.MX8MP, and I am using the EQOS Ethernet interface with a Broadcom BCM54213PE PHY.
Everything works perfectly fine: I am able to configure and bring up the interface, and throughput is as expected (iperf3 benchmarks very close to 1 Gbps).
But... I have a big problem when I try to bring down the interface (via ifdown)...
This is what happens almost always:
root@myboard:~# ifdown eth0
[ 1380.106871] imx-dwmac 30bf0000.ethernet eth0: Link is Down
[ 1380.132249] imx-dwmac 30bf0000.ethernet eth0: FPE workqueue stop
[ 1406.211325] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 1406.217430] rcu: 1-...0: (5 ticks this GP) idle=53a4/1/0x4000000000000000 softirq=6359/6360 fqs=2243
[ 1406.226652] (detected by 0, t=5252 jiffies, g=7889, q=16 ncpus=4)
[ 1406.232833] Task dump for CPU 1:
[ 1406.236060] task:ip state:R running task stack:0 pid:1606 ppid:1592 flags:0x0000000e
[ 1406.245977] Call trace:
[ 1406.248422] __switch_to+0xf0/0x170
[ 1406.251917] 0x0
But sometimes it simply hangs (and I have to reboot)... other times it complains that the system can't set the CPU voltage...
The OS comes from a Yocto build; I basically use the standard imx8mp.dtsi as a base and then I wrote my own dts (attached here).
I really do not know how to debug further or what to attempt...
Any ideas?
Thanks + regards,
/Morix
Hi,
Thank you for your interest in NXP Semiconductor products,
You can prevent RCU stalls with cpuidle.off=1.
Is your issue recurring with ifconfig down?
Regards
Hi Joseph,
thanks for your feedback. I tried what you suggested, but it did not help.
I mean: the RCU stall message went away (which was quite expected), but now bringing the interface down simply hangs with no further messages (and, yes, it also happens with ifconfig down, as you can see below):
root@myboard:~# cat /proc/cmdline
console=ttymxc1,115200 root=/dev/mmcblk1p2 rootwait rw cpuidle.off=1
root@myboard:~# ifconfig eth0 up
root@myboard:~# ifconfig
eth0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 6a:59:80:50:a6:a4 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 218
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 12 bytes 1740 (1.6 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 12 bytes 1740 (1.6 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
root@myboard:~# ifconfig eth0 down
--- THE SYSTEM HANGS HERE ---
Any other ideas?
Thanks + regards.
Ok, I've narrowed down the issue... the problem only occurs if I use the specific broadcom driver for the PHY by configuring the kernel with:
CONFIG_BROADCOM_PHY=y
CONFIG_BCM54140_PHY=y
CONFIG_BCM_NET_PHYLIB=y
If I remove those configs (thus using the generic PHY driver) then the problem goes away.
So, it must be something in the Broadcom driver. I am going to dig into it and will keep you updated.
Anyway, if anybody has ever experienced such a problem with the Broadcom driver, please let me know.
Regards.
Ok: found (and fixed) the problem.
It was because I was using a "modified" version of the stock Broadcom driver which initializes the PHY LED configuration by reading it from the device tree. The problem was that during ifdown the EQOS driver (under some circumstances) re-initializes the PHY... but in that context the device tree references were no longer valid, causing a memory access violation.
I then modified the PHY driver again so that it accesses the device tree only during the first bring-up and caches the relevant data in a private structure, which is then used during further PHY initializations.
It worked like a charm.
I am sharing my work by attaching the patches to be applied to drivers/net/phy/broadcom.c (and related files).
Thanks for the support.
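
For readers who cannot open the attached patches, the caching idea described above can be sketched roughly as follows. This is not the attached patch and not kernel code: it is a small userspace model in C, and every name in it (fake_dt_node, phy_led_cache, fake_phy, fake_phy_probe, fake_phy_config_init, the led_mode field) is hypothetical. In a real phylib driver the one-time read would presumably live in the driver's probe callback and the cached values would hang off phydev->priv, while config_init would only consume the cache.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device-tree node holding the LED setup. */
struct fake_dt_node {
    uint32_t led_mode;
};

/* Hypothetical private data: the values read once from the device tree. */
struct phy_led_cache {
    bool     valid;
    uint32_t led_mode;
};

/* Hypothetical PHY object; led_reg models the PHY's LED register. */
struct fake_phy {
    const struct fake_dt_node *dt;  /* only trusted during probe */
    struct phy_led_cache cache;
    uint32_t led_reg;
};

/* Runs once at first bring-up: the only place that dereferences dt. */
int fake_phy_probe(struct fake_phy *phy)
{
    assert(phy != NULL);
    if (phy->dt == NULL)
        return -ENODEV;
    phy->cache.led_mode = phy->dt->led_mode;
    phy->cache.valid = true;
    return 0;
}

/* May run many times (e.g. on re-init): uses only the cached copy. */
int fake_phy_config_init(struct fake_phy *phy)
{
    assert(phy != NULL);
    if (!phy->cache.valid)
        return -EINVAL;
    phy->led_reg = phy->cache.led_mode;
    return 0;
}
```

The point of the split is that a later re-initialization never touches the device-tree pointer, so it does not matter whether that reference is still valid at that moment.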