Ethernet driver crashed LS1021a


1,712 Views
yasushi_yamawak
Contributor I

Hello,

I am using an LS1021A CPU with Linux kernel 4.1.18, and I observed a sudden Ethernet driver crash during EtherNet/IP communication.

Here are the crash log and the U-Boot output.

------------[ cut here ]------------
WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:318 dev_watchdog+0x161/0x1dc()
NETDEV WATCHDOG: eth0 (fsl-gianfar): transmit queue 0 timed out
Modules linked in: rtpmac(O) atemsys(O) libppmac(O) libmath(O) ppmachw(O) r8169 mii firmware_class
CPU: 1 PID: 0 Comm: swapper/1 Tainted: G O 4.1.18-ipipe #102
Hardware name: Freescale LS1021A
[<80011dfd>] (unwind_backtrace) from [<8000f9bf>] (show_stack+0xb/0xc)
[<8000f9bf>] (show_stack) from [<803b14ff>] (dump_stack+0x63/0x80)
[<803b14ff>] (dump_stack) from [<8001c385>] (warn_slowpath_common+0x55/0x7c)
[<8001c385>] (warn_slowpath_common) from [<8001c3c7>] (warn_slowpath_fmt+0x1b/0x24)
[<8001c3c7>] (warn_slowpath_fmt) from [<80349725>] (dev_watchdog+0x161/0x1dc)
[<80349725>] (dev_watchdog) from [<80048fb9>] (call_timer_fn+0x15/0x5c)
[<80048fb9>] (call_timer_fn) from [<80049697>] (run_timer_softirq+0x141/0x17e)
[<80049697>] (run_timer_softirq) from [<8001e4e7>] (__do_softirq+0xa7/0x17c)
[<8001e4e7>] (__do_softirq) from [<8001e74f>] (irq_exit+0x4b/0x9c)
[<8001e74f>] (irq_exit) from [<8003ff7f>] (__handle_domain_irq+0x77/0xac)
[<8003ff7f>] (__handle_domain_irq) from [<8005c319>] (__ipipe_do_sync_stage+0x195/0x1f0)
[<8005c319>] (__ipipe_do_sync_stage) from [<800092c5>] (__ipipe_grab_irq+0x4d/0x64)
[<800092c5>] (__ipipe_grab_irq) from [<80009493>] (gic_handle_irq+0x2b/0x40)
Exception stack(0xbf075f90 to 0xbf075fd8)
5f80: 8005c438 800f0033 ffffffff 803b499b
5fa0: bffdb7a8 00000000 3f9ff000 00000000 bf074000 805e4508 805db380 80615eec
5fc0: 80003010 410fc075 00000000 00000000 8066f180 bf075fe8
[<80009493>] (gic_handle_irq) from [<803b499b>] (__irq_svc+0x3b/0x42)
Exception stack(0xbf075fa0 to 0xbf075fe8)
5fa0: bffdb7a8 00000000 3f9ff000 00000000 bf074000 805e4508 805db380 80615eec
5fc0: 80003010 410fc075 00000000 00000000 8066f180 bf075fe8 8003af1d 8005c438
5fe0: 800f0033 ffffffff
[<803b499b>] (__irq_svc) from [<8005c438>] (ipipe_unstall_root+0x2c/0x3c)
[<8005c438>] (ipipe_unstall_root) from [<8003af1d>] (cpu_startup_entry+0xf1/0x128)
[<8003af1d>] (cpu_startup_entry) from [<80009551>] (__enable_mmu+0x1/0x10)
---[ end trace cde1cacd8601d35c ]---

U-Boot 2015.01+SDKv1.9+geb3d4fc (Feb 23 2018 - 15:49:52)

CPU: Freescale LayerScape LS1021, Version: 2.0, (0x87001120)
Clock Configuration:
CPU0(ARMV7):1000 MHz,
Bus:300 MHz, DDR:800 MHz (1600 MT/s data rate),
Reset Configuration Word (RCW):
00000000: 0608000a 00000000 00000000 00000000
00000010: 20000000 00403900 60025a00 21046000
00000020: 00000000 00000000 00000000 18000000
00000030: 00000000 4b1b7340 00000000 00000000
Board: LS1021UMAC
CPLD: V1.0
PCBA: V2
CPLD8: RC68 WC63
I2C: ready
DRAM: 1 GiB
Using SERDES1 Protocol: 32 (0x20)
The regulator (MC34VR500) does not exist. The device does not support deep sleep.
Flash: 0 Bytes
MMC: FSL_SDHC: 0
EEPROM: Invalid ID (ea a0 83 76)
PCIe1: Root Complex x1 gen1, regs @ 0x3400000
01:00.0 - 10ec:8168 - Network controller
PCIe1: Bus 00 - 01
PCIe2: Root Complex no link, regs @ 0x3500000
In: serial
Out: serial
Err: serial
SATA link 0 timeout.
AHCI 0001.0300 1 slots 1 ports ? Gbps 0x1 impl SATA mode
flags: 64bit ncq pm clo only pmp fbss pio slum part ccc
scanning bus for devices...
Found 0 device(s).
SCSI: Net: eTSEC1 is in sgmii mode.
Phy 2 not found
PHY reset timed out
eTSEC1, eTSEC3 [PRIME]
Error: eTSEC3 address not set.

Could you suggest possible causes and solutions?
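
For anyone collecting these events over a long run: the log above is a WARNING raised by dev_watchdog, and the line that identifies the stalled queue is the "NETDEV WATCHDOG" one. The following is a minimal sketch, assuming the kernel log has been saved as text; the function name find_tx_timeouts is hypothetical and the pattern is written only against the message format shown in this post.

```python
import re

# Pattern written against the message format seen in this thread's log.
WATCHDOG_RE = re.compile(
    r"NETDEV WATCHDOG: (?P<iface>\S+) \((?P<driver>[^)]+)\): "
    r"transmit queue (?P<queue>\d+) timed out"
)


def find_tx_timeouts(log_text):
    """Return (interface, driver, queue) for every Tx watchdog timeout in log_text."""
    return [
        (m["iface"], m["driver"], int(m["queue"]))
        for m in WATCHDOG_RE.finditer(log_text)
    ]


sample = "NETDEV WATCHDOG: eth0 (fsl-gianfar): transmit queue 0 timed out"
print(find_tx_timeouts(sample))
```

Running this over a saved dmesg capture gives a quick count of how often, and on which interface and queue, the timeout fired.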

Thanks

Yasushi Yamawaki

0 Kudos
10 Replies

1,384 Views
yasushi_yamawak
Contributor I

Hi Mr. Yiping

I am struggling to find the cause of the issue, and I would like to see the errata information related to /drivers/net/ethernet/freescale.

Could you provide it for kernel versions from 4.1.18 to the latest?

Thank you for your cooperation.

Yasushi

0 Kudos

1,384 Views
yipingwang
NXP TechSupport

Hello Yasushi.

Please refer to the known issues section in the QorIQ SDK v2.0-1703 documentation.

Fixed, Open and Closed Issues 

Thanks,

Yiping

0 Kudos

1,384 Views
yasushi_yamawak
Contributor I

Hello Mr. Yiping

Thank you for the issue documentation. I will check it.

Yesterday, the issue was reproduced after 10 days of running, and I found that tx_fifo_errors had counted up to 2.

I have some questions:

1. Have you previously fixed anything like the Tx buffering issue mentioned above?

2. Have you received reports from other customers about all Tx buffers being consumed while Ethernet is running?

I would appreciate it if you could share a workaround if you know of one.
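
Since the failure takes days to appear, it may help to log when tx_fifo_errors actually increments. Below is a minimal sketch, assuming the counters are exposed under /sys/class/net/<interface>/statistics/ as on standard Linux systems; the function names, the counter selection, and the polling interval are illustrative choices, not something taken from the gianfar driver.

```python
import time
from pathlib import Path

# Counters to sample; tx_fifo_errors is the one reported in this thread.
COUNTERS = ("tx_packets", "tx_errors", "tx_fifo_errors", "tx_dropped")


def read_counters(iface, root="/sys/class/net"):
    """Return {counter: value} for iface; counters missing on this system are skipped."""
    stats_dir = Path(root) / iface / "statistics"
    values = {}
    for name in COUNTERS:
        path = stats_dir / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values


def changed_counters(before, after):
    """Return {counter: delta} for every counter that increased between two samples."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


def watch(iface, interval_s=10.0, rounds=6, root="/sys/class/net"):
    """Poll `rounds` times; return (timestamp, deltas) for each round where a counter grew."""
    events = []
    previous = read_counters(iface, root)
    for _ in range(rounds):
        time.sleep(interval_s)  # interval is an arbitrary example value
        current = read_counters(iface, root)
        delta = changed_counters(previous, current)
        if delta:
            events.append((time.strftime("%Y-%m-%d %H:%M:%S"), delta))
        previous = current
    return events
```

Calling watch("eth0") on the target and filtering the result for "tx_fifo_errors" would give timestamps that can be lined up with the watchdog message in the kernel log.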

Thanks, Yasushi

0 Kudos

1,384 Views
yasushi_yamawak
Contributor I

Hello Yiping

As additional information: I was able to run EtherNet/IP communication for 3 days, after which the Ethernet driver appeared to crash. I then tried your suggested commands again. It seems that ping works and the link status is up.

U-Boot 2015.01+SDKv1.9+geb3d4fc (Feb 23 2018 - 15:49:52)

CPU: Freescale LayerScape LS1021, Version: 2.0, (0x87001120)
Clock Configuration:
CPU0(ARMV7):1000 MHz,
Bus:300 MHz, DDR:800 MHz (1600 MT/s data rate),
Reset Configuration Word (RCW):
00000000: 0608000a 00000000 00000000 00000000
00000010: 20000000 00403900 60025a00 21046000
00000020: 00000000 00000000 00000000 18000000
00000030: 00000000 4b1b7340 00000000 00000000
Board: LS1021UMAC
CPLD: V1.0
PCBA: V2
CPLD8: RC68 WC63
I2C: ready
DRAM: 1 GiB
Using SERDES1 Protocol: 32 (0x20)
The regulator (MC34VR500) does not exist. The device does not support deep sleep.
Flash: 0 Bytes
MMC: FSL_SDHC: 0
EEPROM: Invalid ID (b9 f8 01 cf)
PCIe1: Root Complex no link, regs @ 0x3400000
PCIe2: Root Complex no link, regs @ 0x3500000
In: serial
Out: serial
Err: serial
SATA link 0 timeout.
AHCI 0001.0300 1 slots 1 ports ? Gbps 0x1 impl SATA mode
flags: 64bit ncq pm clo only pmp fbss pio slum part ccc
scanning bus for devices...
Found 0 device(s).
SCSI: Net: eTSEC1 is in sgmii mode.
Phy 2 not found
PHY reset timed out
eTSEC1, eTSEC3 [PRIME]
Error: eTSEC3 address not set.

Hit any key to stop autoboot: 0
=> mdio list
FSL_MDIO:
0 - Generic PHY <--> eTSEC3
2 - Generic PHY <--> eTSEC1
=> mdio read eTSEC3 1
Reading from bus FSL_MDIO
PHY at address 0:
1 - 0x786d
=> mdio read eTSEC1 1
Reading from bus FSL_MDIO
PHY at address 2:
1 - 0xffff
=> setenv ipaddr 192.168.0.200
=> setenv netmask 255.255.255.0
=> ping 192.168.0.171
Speed: 100, full duplex
*** ERROR: `eth1addr' not set
Speed: 1000, full duplex
Using eTSEC1 device
ping failed; host 192.168.0.171 is not alive
=> setenv eth1addr 00:00:0A:12:34:56
=> ping 192.168.0.171
Speed: 1000, full duplex
Using eTSEC1 device
ping failed; host 192.168.0.171 is not alive
=> setenv ethact eTSEC3
=> ping 192.168.0.171
Speed: 100, full duplex
Using eTSEC3 device
host 192.168.0.171 is alive
=> ping 192.168.0.230
Speed: 100, full duplex
Using eTSEC3 device
host 192.168.0.230 is alive
=>
Speed: 100, full duplex
Using eTSEC3 device
host 192.168.0.230 is alive
=> mii info
PHY 0x00: OUI = 0x0885, Model = 0x16, Rev = 0x01, 100baseT, FDX
PHY 0x1F: OUI = 0x0000, Model = 0x00, Rev = 0x00, 10baseT, HDX
=> mii dump 0 0
0. (3100) -- PHY control register --
(8000:0000) 0.15 = 0 reset
(4000:0000) 0.14 = 0 loopback
(2040:2000) 0. 6,13 = b01 speed selection = 100 Mbps
(1000:1000) 0.12 = 1 A/N enable
(0800:0000) 0.11 = 0 power-down
(0400:0000) 0.10 = 0 isolate
(0200:0000) 0. 9 = 0 restart A/N
(0100:0100) 0. 8 = 1 duplex = full
(0080:0000) 0. 7 = 0 collision test enable
(003f:0000) 0. 5- 0 = 0 (reserved)


=> mii dump 0 1
1. (786d) -- PHY status register --
(8000:0000) 1.15 = 0 100BASE-T4 able
(4000:4000) 1.14 = 1 100BASE-X full duplex able
(2000:2000) 1.13 = 1 100BASE-X half duplex able
(1000:1000) 1.12 = 1 10 Mbps full duplex able
(0800:0800) 1.11 = 1 10 Mbps half duplex able
(0400:0000) 1.10 = 0 100BASE-T2 full duplex able
(0200:0000) 1. 9 = 0 100BASE-T2 half duplex able
(0100:0000) 1. 8 = 0 extended status
(0080:0000) 1. 7 = 0 (reserved)
(0040:0040) 1. 6 = 1 MF preamble suppression
(0020:0020) 1. 5 = 1 A/N complete
(0010:0000) 1. 4 = 0 remote fault
(0008:0008) 1. 3 = 1 A/N able
(0004:0004) 1. 2 = 1 link status
(0002:0000) 1. 1 = 0 jabber detect
(0001:0001) 1. 0 = 1 extended capabilities


=> mii dump 0 2
2. (0022) -- PHY ID 1 register --
(ffff:0022) 2.15- 0 = 34 OUI portion


=> mii dump 0 3
3. (1561) -- PHY ID 2 register --
(fc00:1400) 3.15-10 = 5 OUI portion
(03f0:0160) 3. 9- 4 = 22 manufacturer part number
(000f:0001) 3. 3- 0 = 1 manufacturer rev. number


=> mii dump 0 4
4. (81e1) -- Autonegotiation advertisement register --
(8000:8000) 4.15 = 1 next page able
(4000:0000) 4.14 = 0 (reserved)
(2000:0000) 4.13 = 0 remote fault
(1000:0000) 4.12 = 0 (reserved)
(0800:0000) 4.11 = 0 asymmetric pause
(0400:0000) 4.10 = 0 pause enable
(0200:0000) 4. 9 = 0 100BASE-T4 able
(0100:0100) 4. 8 = 1 100BASE-TX full duplex able
(0080:0080) 4. 7 = 1 100BASE-TX able
(0040:0040) 4. 6 = 1 10BASE-T full duplex able
(0020:0020) 4. 5 = 1 10BASE-T able
(001f:0001) 4. 4- 0 = 1 selector = IEEE 802.3


=> mii dump 0 5
5. (cde1) -- Autonegotiation partner abilities register --
(8000:8000) 5.15 = 1 next page able
(4000:4000) 5.14 = 1 acknowledge
(2000:0000) 5.13 = 0 remote fault
(1000:0000) 5.12 = 0 (reserved)
(0800:0800) 5.11 = 1 asymmetric pause able
(0400:0400) 5.10 = 1 pause able
(0200:0000) 5. 9 = 0 100BASE-T4 able
(0100:0100) 5. 8 = 1 100BASE-X full duplex able
(0080:0080) 5. 7 = 1 100BASE-TX able
(0040:0040) 5. 6 = 1 10BASE-T full duplex able
(0020:0020) 5. 5 = 1 10BASE-T able
(001f:0001) 5. 4- 0 = 1 selector = IEEE 802.3


=> mii dump 0 6
The MII dump command only formats the standard MII registers, 0-5.
=>
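
As a cross-check of the dump above, the "mii info" line can be reproduced from registers 2 and 3. This is a small sketch of that bit arithmetic, assuming the field layout printed by "mii dump" in this log (the upper OUI bits in register 2, then 6 OUI bits, a 6-bit part number and a 4-bit revision in register 3); decode_phy_id is a hypothetical helper name.

```python
def decode_phy_id(id1, id2):
    """Split MII registers 2 and 3 into OUI, model and revision fields."""
    oui = (id1 << 6) | (id2 >> 10)  # register 2 plus the top 6 bits of register 3
    model = (id2 >> 4) & 0x3F       # bits 9..4: manufacturer part number
    rev = id2 & 0x0F                # bits 3..0: manufacturer revision number
    return oui, model, rev


# Values taken from "mii dump 0 2" and "mii dump 0 3" in this post.
oui, model, rev = decode_phy_id(0x0022, 0x1561)
print("OUI = 0x%04X, Model = 0x%02X, Rev = 0x%02X" % (oui, model, rev))
# prints: OUI = 0x0885, Model = 0x16, Rev = 0x01
```

The result agrees with the "PHY 0x00" line reported by "mii info", so the ID registers of the PHY at address 0 are being read consistently.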

Thanks,

Yasushi

0 Kudos

1,384 Views
yipingwang
NXP TechSupport

Hello Yasushi,

According to your U-Boot log, eTSEC3 appears to work normally, while eTSEC1 does not work because of a hardware problem with its PHY.
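
The register 1 values returned by "mdio read" in this thread can be decoded as follows. This is a minimal sketch, assuming the standard MII status register bit positions that "mii dump 0 1" prints in the log; decode_bmsr is a hypothetical helper, and treating 0xffff as "no PHY answered" is the usual interpretation of an all-ones MDIO read rather than something the log states.

```python
# Bit masks of the standard MII status register (register 1), named after
# the fields U-Boot's "mii dump" prints in this thread.
BMSR_BITS = {
    "link_status": 0x0004,
    "autoneg_able": 0x0008,
    "remote_fault": 0x0010,
    "autoneg_complete": 0x0020,
}


def decode_bmsr(value):
    """Decode a 16-bit status register value read over MDIO."""
    if value == 0xFFFF:
        # All ones: assumed here to mean that no PHY responded at this address.
        return {"phy_present": False}
    flags = {name: bool(value & mask) for name, mask in BMSR_BITS.items()}
    flags["phy_present"] = True
    return flags


# Readings that appear in this thread.
for reading in (0x786D, 0xFFFF, 0x796D, 0x7949):
    print(hex(reading), decode_bmsr(reading))
```

With these masks, 0x786d (eTSEC3) decodes to link up with auto-negotiation complete, while 0xffff (eTSEC1) decodes to no PHY present, which matches the "Phy 2 not found" message at boot. The link status bit is latched low, so a single read can still reflect an earlier link drop.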

Would you please check whether eth2 works normally in Linux? Does the Linux kernel crash occur only when using eth0?

Thanks,

Yiping

0 Kudos

1,384 Views
yasushi_yamawak
Contributor I

Hello Mr. Yiping

The issue actually occurs only on eth0. We do not use any other Ethernet ports.

Best regards,

Yasushi

0 Kudos

1,384 Views
yipingwang
NXP TechSupport

Hello Yasushi Yamawaki,

Tx snooping freezes transmission during traffic, so it should be disabled. This may be a hardware issue or an L2 initialization issue.

Would you check whether the "ping" command works under U-Boot?

Would you please send me your kernel uImage so that I can do more verification?

In addition, please check the link status of the external PHY under U-Boot.

=> mdio list
FSL_MDIO:
1 - AR8031/AR8033 <--> eTSEC3
2 - AR8031/AR8033 <--> eTSEC2
=> mdio read eTSEC3 1
Reading from bus FSL_MDIO
PHY at address 1:
1 - 0x796d
=> mdio read eTSEC2 1
Reading from bus FSL_MDIO
PHY at address 2:
1 - 0x7949

Thanks,

Yiping

0 Kudos

1,384 Views
yasushi_yamawak
Contributor I

How can I attach the uImage?

0 Kudos

1,384 Views
yipingwang
NXP TechSupport

Hello Yasushi,

Under U-Boot, please set the environment variable "ipaddr" first, then run the ping command.

=>setenv ipaddr 10.150.168.155

=>ping 10.150.168.205

Please click "Use advanced editor" at the top right of the comment panel, then attach your uImage.

Thanks,

Yiping

0 Kudos

1,384 Views
yasushi_yamawak
Contributor I

Hello Yiping

I tried the ping command and checked the link status; the results are shown below.

=> mdio list
FSL_MDIO:
0 - Generic PHY <--> eTSEC3
2 - Generic PHY <--> eTSEC1
=> mdio read eTSEC3 1
Reading from bus FSL_MDIO
PHY at address 0:
1 - 0x786d
=> mdio read eTSEC2 1
eTSEC2 is not a known ethernet
mdio - MDIO utility commands

Usage:
mdio list - List MDIO buses
mdio read <phydev> [<devad>.]<reg> - read PHY's register at <devad>.<reg>
mdio write <phydev> [<devad>.]<reg> <data> - write PHY's register at <devad>.<reg>
mdio rx <phydev> [<devad>.]<reg> - read PHY's extended register at <devad>.<reg>
mdio wx <phydev> [<devad>.]<reg> <data> - write PHY's extended register at <devad>.<reg>
<phydev> may be:
<busname> <addr>
<addr>
<eth name>
<addr> <devad>, and <reg> may be ranges, e.g. 1-5.4-0x1f.

=> ping 10.150.168.155
Speed: 100, full duplex
*** ERROR: `ipaddr' not set
ping failed; host 10.150.168.155 is not alive
=> ping 10.150.168.205
Speed: 100, full duplex
*** ERROR: `ipaddr' not set
ping failed; host 10.150.168.205 is not alive
=> ping 10.150.168.157
Speed: 100, full duplex
*** ERROR: `ipaddr' not set
ping failed; host 10.150.168.157 is not alive

10.150.168.155 is myself, 10.150.168.205 is the interface of my laptop, and 10.150.168.157 is another controller.

Yasushi

0 Kudos