imx8mp router loopback, causing NETDEV WATCHDOG to reset and unable to restore the mac


1,390 Views
yrj
Contributor III

Hello. Our boards A and B are both connected to the same switch and ping each other. When a loop is accidentally created on the switch (both ends of one network cable plugged into the switch), the kernel runs into trouble: the board restarts outright, or the debug console hangs, etc. The detailed log is in the attachment.

[ 314.850236] ------------[ cut here ]------------
[ 314.854871] NETDEV WATCHDOG: ens3 (fec): transmit queue 0 timed out 3384 ms
[ 314.861910] WARNING: CPU: 0 PID: 15 at net/sched/sch_generic.c:525 dev_watchdog+0x234/0x23c
[ 314.870271] Modules linked in:
[ 314.873329] CPU: 0 PID: 15 Comm: ksoftirqd/0 Not tainted 6.6.3-g9d3450dbcab9-dirty #71
[ 314.881246] Hardware name: NXP i.MX8MPlus EVK board (DT)
[ 314.886555] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 314.893516] pc : dev_watchdog+0x234/0x23c
[ 314.897528] lr : dev_watchdog+0x234/0x23c
[ 314.901538] sp : ffff8000823e3c20
[ 314.904850] x29: ffff8000823e3c20 x28: 0000000000000100 x27: ffff8000823e3cd0
[ 314.911989] x26: ffff8000820d69c0 x25: 0000000000000d38 x24: ffff8000820d6000
[ 314.919128] x23: 0000000000000000 x22: ffff0000055503dc x21: ffff000005550000
[ 314.926266] x20: ffff00000589c400 x19: ffff000005550488 x18: 0000000000000006
[ 314.933404] x17: ffff8000821b62e8 x16: 0000000073e4d9ea x15: ffff8000823e3640
[ 314.940541] x14: 0000000000000000 x13: ffff8000820f0da0 x12: 0000000000000672
[ 314.947679] x11: 0000000000000226 x10: ffff800082148da0 x9 : ffff8000820f0da0
[ 314.954818] x8 : 00000000ffffefff x7 : ffff800082148da0 x6 : 80000000fffff000
[ 314.961956] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
[ 314.969093] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff00000412d880

5 Replies

1,387 Views
yrj
Contributor III

The kernel version is 6.6.3.


1,375 Views
Zhiming_Liu
NXP TechSupport

Hi @yrj 

This issue has more to do with the switch. I recommend resolving the loop with the switch's loop detection feature.

Best Regards,
Zhiming


1,372 Views
yrj
Contributor III

Hi @Zhiming_Liu 

If a loop is created by accident, can the i.MX8MP guard against it?


1,362 Views
Zhiming_Liu
NXP TechSupport

Hi,


The best approach is to set up protection on the switch side. Most switches offer such a feature; see this for reference:

https://www.cisco.com/c/zh_cn/support/docs/smb/switches/cisco-250-series-smart-switches/smb5794-enab...

 

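It is also possible to add broadcast detection statistics at the driver level, but the fec driver does not currently support this.

You can monitor broadcast traffic from user space with a script and bring the NIC down if a large volume of broadcasts appears.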
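
A minimal sketch of such a user-space monitor follows. It is an illustration under stated assumptions, not an NXP-provided tool. The standard sysfs statistics directory has no dedicated broadcast counter, so the sketch uses the total receive packet rate from `/sys/class/net/<iface>/statistics/rx_packets` as a proxy for a storm. The interface name `ens3` comes from the log above; the threshold and polling interval are hypothetical values that would need tuning for a real network.

```python
#!/usr/bin/env python3
"""Bring an interface down when its receive rate looks like a broadcast storm."""
import subprocess
import sys
import time
from pathlib import Path

IFACE = "ens3"           # interface name taken from the kernel log
THRESHOLD_PPS = 20000    # hypothetical limit (packets per second); tune it
INTERVAL_S = 1.0         # hypothetical polling period in seconds


def read_rx_packets(iface, sysfs_root="/sys/class/net"):
    """Return the kernel's cumulative RX packet counter for iface."""
    path = Path(sysfs_root) / iface / "statistics" / "rx_packets"
    return int(path.read_text().strip())


def packet_rate(prev, cur, interval_s):
    """Packets per second between two samples (0 if the counter went back)."""
    return max(cur - prev, 0) / interval_s


def is_storm(prev, cur, interval_s, threshold_pps):
    """True when the receive rate between two samples exceeds the threshold."""
    return packet_rate(prev, cur, interval_s) > threshold_pps


def monitor(iface=IFACE):
    """Poll the RX counter; on a suspected storm, set the link down and stop."""
    prev = read_rx_packets(iface)
    while True:
        time.sleep(INTERVAL_S)
        cur = read_rx_packets(iface)
        if is_storm(prev, cur, INTERVAL_S, THRESHOLD_PPS):
            # Requires root (or CAP_NET_ADMIN).
            subprocess.run(["ip", "link", "set", "dev", iface, "down"], check=True)
            print(f"{iface}: storm suspected, interface set down", file=sys.stderr)
            return
        prev = cur
```

Calling `monitor("ens3")` as root starts the loop. This is only a mitigation: if the storm keeps the CPU busy in softirq context, the script may not get scheduled in time, so loop protection on the switch remains the more reliable fix.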


Best Regards,
Zhiming


1,351 Views
yrj
Contributor III

Hi @Zhiming_Liu

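When the network storm occurs, the system will sometimes reboot outright, or the debug console hangs, or the system runs out of control (as in b.log). A reboot is needed to recover.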

 
