One of Ethernet interfaces can't find PHY until reboot

a_r_f_ · ‎08-20-2020

Hi,

I have a design using the i.MX6ULL. I need both of its Ethernet interfaces (FEC1 + FEC2). On my board both FECs are connected to LAN8720A PHYs. FEC1 has a PHY with MDIO address 0, and FEC2 has a PHY with MDIO address 1. The device is running Linux.

The problem I am now facing is that after power-on only one Ethernet interface is running. After a reboot (using a command or the reset button, not a power cycle) both interfaces can be used without any problems. Details are below.

Did somebody have a similar issue? Please, help.

Best regards,
Adam

1) Config in device tree:

&fec1 {
        pinctrl-names = "default";
        pinctrl-0 = <&pinctrl_enet1>;
        phy-mode = "rmii";
        phy-handle = <&ethphy0>;
        status = "okay";
};

&fec2 {
        pinctrl-names = "default";
        pinctrl-0 = <&pinctrl_enet2>, <&pinctrl_enet2_mdio>;
        phy-mode = "rmii";
        phy-handle = <&ethphy1>;
        status = "okay";

        mdio {
                #address-cells = <1>;
                #size-cells = <0>;

                ethphy0: ethernet-phy@0 {
                        reg = <0>;
                };

                ethphy1: ethernet-phy@1 {
                        reg = <1>;
                };
        };
};

2) The state after power on - only eth0 is working. The console says:

Starting network: fec 2188000.ethernet eth1: Unable to connect to phy
ip: SIOCSIFFLAGS: No such device
SMSC LAN8710/LAN8720 20b4000.ethernet-1:01: attached PHY driver [SMSC LAN8710/LAN8720] (mii_bus:phy_addr=20b4000.ethernet-1:01, irq=POLL)
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
(...)
fec 20b4000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

But, both PHYs are alive, mii-diag can see them (both are connected and have the autonegotiation complete):

# mii-diag -p0
Using the default interface 'eth0'.
Using the specified MII PHY index 0.
Basic registers of MII PHY #0:  3100 7829 0007 c0f1 01e1 41e1 0003 ffff.
 The autonegotiated capability is 01e0.
The autonegotiated media type is 100baseTx-FD.
 Basic mode control register 0x3100: Auto-negotiation enabled.
 Basic mode status register 0x7829 ... 782d.
   Link status: previously broken, but now reestablished.
 Your link partner advertised 41e1: 100baseTx-FD 100baseTx 10baseT-FD 10baseT.
   End of basic transceiver information.

# mii-diag -p1
Using the default interface 'eth0'.
Using the specified MII PHY index 1.
Basic registers of MII PHY #1:  3100 782d 0007 c0f1 05e1 cde1 0009 ffff.
 The autonegotiated capability is 01e0.
The autonegotiated media type is 100baseTx-FD.
 Basic mode control register 0x3100: Auto-negotiation enabled.
 You have link beat, and everything is working OK.
 Your link partner advertised cde1: Flow-control 100baseTx-FD 100baseTx 10baseT-FD 10baseT, w/ 802.3X flow control.
   End of basic transceiver information.
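For readers who want to check dumps like the two above by hand, here is a minimal Python sketch that decodes the first words of a "Basic registers" line. It assumes the standard Clause 22 layout (register 1 is the status register, registers 2 and 3 hold the PHY identifier) and the ID/mask pair 0x0007c0f0 / 0xfffffff0 that the mainline Linux smsc driver is understood to use for LAN8710/LAN8720. The function and constant names are illustrative only.

```python
import re

BMSR_LINK_UP = 0x0004        # register 1, bit 2: link status (latches low)
BMSR_ANEG_COMPLETE = 0x0020  # register 1, bit 5: auto-negotiation complete

# Assumed ID/mask for LAN8710/LAN8720 (not stated in this thread).
LAN87XX_ID = 0x0007C0F0
LAN87XX_MASK = 0xFFFFFFF0


def decode_dump(line):
    """Decode a mii-diag 'Basic registers of MII PHY #N: ...' line."""
    m = re.search(r"PHY #(\d+):\s*([0-9a-fA-F ]+)", line)
    if m is None:
        raise ValueError("not a mii-diag register dump line")
    regs = [int(word, 16) for word in m.group(2).split()]
    bmsr = regs[1]
    phy_id = (regs[2] << 16) | regs[3]  # registers 2 and 3 form the 32-bit ID
    return {
        "addr": int(m.group(1)),
        "phy_id": phy_id,
        "is_lan87xx": (phy_id & LAN87XX_MASK) == LAN87XX_ID,
        "link_up": bool(bmsr & BMSR_LINK_UP),
        "aneg_complete": bool(bmsr & BMSR_ANEG_COMPLETE),
    }


if __name__ == "__main__":
    for text in (
        "Basic registers of MII PHY #0:  3100 7829 0007 c0f1 01e1 41e1 0003 ffff.",
        "Basic registers of MII PHY #1:  3100 782d 0007 c0f1 05e1 cde1 0009 ffff.",
    ):
        print(decode_dump(text))
```

Both dumps decode to the same identifier, so both PHYs answer on the bus at this point. PHY #0 shows the link bit cleared in its first status read (0x7829) because that bit latches low; mii-diag's second read (0x782d) reports the link as up, which matches its "previously broken, but now reestablished" message.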

3) The state after reboot (command from console or reset button pressed, not power-cycled) - both eth0 and eth1 are working. The console says:

Starting network: SMSC LAN8710/LAN8720 20b4000.ethernet-1:00: attached PHY driver [SMSC LAN8710/LAN8720] (mii_bus:phy_addr=20b4000.ethernet-1:00, irq=POLL)
IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
SMSC LAN8710/LAN8720 20b4000.ethernet-1:01: attached PHY driver [SMSC LAN8710/LAN8720] (mii_bus:phy_addr=20b4000.ethernet-1:01, irq=POLL)
IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
(...)
fec 2188000.ethernet eth1: Link is Up - 100Mbps/Full - flow control off
IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
fec 20b4000.ethernet eth0: Link is Up - 100Mbps/Full - flow control rx/tx
IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

Gandalf-kern · ‎07-27-2021

Did you resolve this issue and what was your solution?

We had a similar issue on an i.MX8X using a second PHY. The problem was that fec2 got initialized before fec1, which left us with no MDIO: in the shared MDIO bus case, the driver expects the FEC that owns the MDIO registers to be initialized first.

This was ultimately caused by using phy-resets-gpios= in fec1 (Toradex device tree config) and we picked the phy-resets= method inside the phy object for the 2nd ethernet interface in the device tree. When using the same method for both, the initialization order is correct, when using different methods, the first time fec1 is tried, it gets deferred with -EPROBE_DEFER and then fec2 gets first to go through initialization "successfully".

Also see https://community.toradex.com/t/apalis-imx8qm-external-rgmii-interface-issue/12100/7
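The ordering effect described in this reply can be illustrated with a toy model in Python. This is not kernel code: it only mimics the idea that a device whose probe returns -EPROBE_DEFER is put aside and retried after the other devices have been probed. The function name and the `defers` parameter are made up for the illustration.

```python
EPROBE_DEFER = 517  # errno value used by the Linux driver core for deferral


def probe_order(devices, defers=None):
    """Return the order in which devices finish probing.

    devices: device names in registration order.
    defers:  hypothetical map of name -> number of probe attempts that
             end in -EPROBE_DEFER before the probe finally succeeds.
    """
    defers = defers or {}
    attempts = {name: 0 for name in devices}
    finished = []
    pending = list(devices)
    while pending:
        deferred = []
        for name in pending:
            attempts[name] += 1
            if attempts[name] <= defers.get(name, 0):
                deferred.append(name)   # probe returned -EPROBE_DEFER
            else:
                finished.append(name)   # probe succeeded
        pending = deferred              # retry deferred devices later
    return finished


if __name__ == "__main__":
    print(probe_order(["fec1", "fec2"]))
    print(probe_order(["fec1", "fec2"], {"fec1": 1}))
```

With no deferral the devices finish in registration order; a single deferral of fec1 is enough to let fec2 complete first, which is the situation described above.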

igorpadykov · ‎08-20-2020

Hi Adam

For dual Ethernet, one can look at "imx6ul eth work abnormal".

The detection issue after reset may be related to the power-on sequence; one can recheck it in "i.MX 6ULL Applications Processors for Consumer Products", or in the LAN8720 PHY address strapping (input on the RXER/PHYAD0 pin), which is set by hardware reset.

Best regards
igor

a_r_f_ · ‎08-20-2020

Hi Igor,

thanks for a quick reply.

However, perhaps I should clarify that in my case the problem is that eth0 can't connect to its PHY only after power-on. This happens even though both PHYs get their addresses correctly, and mii-diag can talk to both even after power-on, although a moment earlier eth0 was not able to.

Resetting (without power-cycling) fixes the operation of the system (but of course I can't leave it like this).

I've made a mistake in my original post, the ethernet peripherals in iMX6ULL are FEC1 and FEC2 (not 0 and 1). FEC2 becomes eth0 and has PHY with address 1, while FEC1 becomes eth1 and has PHY with address 0. (I will edit the post to correct it.)
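To keep the naming straight when reading the boot logs, a small Python sketch can split PHY device names such as `20b4000.ethernet-1:01`. It assumes the name has the form `<MDIO bus id>:<address as two hex digits>`; the FEC labels in the table come from the logs and the correction in this post, and the helper name is illustrative.

```python
# Mapping taken from the boot logs and the correction in this post.
FEC_BY_NODE = {
    "2188000.ethernet": "FEC1 (eth1)",
    "20b4000.ethernet": "FEC2 (eth0)",
}


def parse_phy_name(name):
    """Split '<bus id>:<addr>' and report which FEC owns that MDIO bus."""
    bus_id, addr = name.rsplit(":", 1)
    node = bus_id.rsplit("-", 1)[0]       # drop the trailing '-<bus index>'
    return {
        "bus_id": bus_id,
        "mdio_owner": FEC_BY_NODE.get(node, node),
        "phy_addr": int(addr, 16),        # assumed to be printed in hex
    }


if __name__ == "__main__":
    for phy in ("20b4000.ethernet-1:00", "20b4000.ethernet-1:01"):
        print(parse_phy_name(phy))
```

Both names resolve to the MDIO bus owned by FEC2, including PHY 0, which serves FEC1. That is consistent with the device tree in the original post, where both PHY nodes sit under the `mdio` node of `&fec2`.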

This whole thing looks a bit as if Linux did not want to talk to PHY 0 (connected to FEC1, so eth1) on a fresh boot, as if the MDIO bus were not yet configured. If nothing modifies the contents of the config registers during reset (my u-boot is stripped down to the bare minimum and doesn't touch any network-related stuff), then on a repeated boot the MDIO bus is already available. That's my best guess so far.

Initially I had only fec1 enabled in the device tree, with the MDIO bus defined under &fec1 and only one PHY there (address 0). In this configuration FEC1 with PHY0 was working fine. This is still the same hardware, no changes! The only difference now is that the second Ethernet interface is configured in the device tree.

I would appreciate any hint on how to debug this, like, how to ask Linux to tell in details what exactly is happening. I am a hardware guy and I don't know that much about Linux yet.

Best regards,
Adam

igorpadykov · ‎08-20-2020

Hi Adam

one can try to debug it using AN4553, "Using Open Source Debugging Tools for Linux on i.MX Processors":
https://www.nxp.com/docs/en/application-note/AN4553.pdf

Best regards
igor

a_r_f_ · ‎08-21-2020

Hi Igor,

for some reason that didn't work for me. I believe I've added the debug symbols and KGDB to the kernel, the whole kernel got rebuilt, and then KGDB halted the boot, but GDB on my computer was not able to find the debug symbols in vmlinux. Since I generally don't like to use debuggers, after two or three attempts I gave up and just told the kernel to printk some details.

And I think I found the reason. First I found this message, which I've missed previously:

libphy: fec_enet_mii_bus: probed
mdio_bus 20b4000.ethernet-1: MDIO device at address 0 is missing.

Based on printk'ed information I can tell, that MDIO tries to read ID of the PHY with address 0, but reads only 0xFFFF. Then it tries to do the same for PHY with address 1, and succeeds. After reset it can correctly read IDs of both PHYs.

PHY with address 0 is connected to FEC1.
PHY with address 1 is connected to FEC2.
So my understanding of the problem is that at the time when MDIO bus is looking for the PHYs, FEC1 is still not yet started, and the PHY with address 0 connected thereto is getting no clock. After reset both FECs are running (nothing touches their configuration during reboot), so both PHYs respond and get identified.
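The symptom described here can be modelled with a short Python sketch of a Clause 22 bus scan. It assumes that a PHY which does not answer leaves the data line pulled high, so every read returns 0xFFFF, and that an identifier which is "mostly Fs" is treated as "no device", similar to the check Linux's phylib is understood to apply. The fake bus and all names are hypothetical; the ID words are the ones mii-diag printed in the original post.

```python
MII_PHYSID1, MII_PHYSID2 = 2, 3  # Clause 22 PHY identifier registers


def read_phy_id(mdio_read, addr):
    """Combine ID registers 2 and 3 into one 32-bit identifier."""
    return (mdio_read(addr, MII_PHYSID1) << 16) | mdio_read(addr, MII_PHYSID2)


def scan(mdio_read, addrs):
    """Return {address: phy_id} for every address that answers."""
    found = {}
    for addr in addrs:
        phy_id = read_phy_id(mdio_read, addr)
        if (phy_id & 0x1FFFFFFF) == 0x1FFFFFFF:
            continue  # all ones: nothing responded at this address
        found[addr] = phy_id
    return found


def make_bus(responding):
    """Fake MDIO bus: only addresses in `responding` drive the data line."""
    regs = {MII_PHYSID1: 0x0007, MII_PHYSID2: 0xC0F1}

    def mdio_read(addr, reg):
        return regs[reg] if addr in responding else 0xFFFF

    return mdio_read


if __name__ == "__main__":
    print("power-on:", scan(make_bus({1}), [0, 1]))
    print("reboot:  ", scan(make_bus({0, 1}), [0, 1]))
```

In the power-on case only address 1 is found, which corresponds to the "MDIO device at address 0 is missing" message; in the reboot case both addresses are found. Why PHY 0 does not answer at power-on (the missing clock from FEC1) remains the hypothesis stated above; the sketch only shows what the scan concludes from the values it reads.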

The solution would be to have the kernel first enable both FECs, before MDIO probes the PHYs.

Is there a known solution for that?

Best regards,
Adam

a_r_f_ · ‎08-28-2020

The code of the driver (fec_main.c) suggests that this problem might have been solved for i.MX28. However, with respect to i.MX6 the driver is messy, because it refers to FEC0, while i.MX6 has FEC1 and FEC2. Which one corresponds to that FEC0 from i.MX28? What is the order of probing FECs in i.MX28?

I may have been able to "move the problem elsewhere" by guessing these contents of the device tree:

&fec1 {
        pinctrl-names = "default";
        pinctrl-0 = <&pinctrl_enet1>;
        phy-mode = "rmii";
        phy-handle = <&ethphy0>;
        status = "okay";
        fsl,mii-exclusive;

        mdio {
                #address-cells = <1>;
                #size-cells = <0>;

                ethphy0: ethernet-phy@0 {
//                      compatible = "smsc,lan8720";
//                      device_type = "ethernet-phy";
                        reg = <0>;
                };

//              ethphy1: ethernet-phy@1 {
//                      compatible = "smsc,lan8720";
//                      device_type = "ethernet-phy";
//                      reg = <1>;
//              };
        };
};

&fec2 {
        pinctrl-names = "default";
        pinctrl-0 = <&pinctrl_enet2>, <&pinctrl_enet2_mdio>;
        phy-mode = "rmii";
        phy-handle = <&ethphy1>;
        status = "okay";

        mdio {
                #address-cells = <1>;
                #size-cells = <0>;

//              ethphy0: ethernet-phy@0 {
//                      compatible = "smsc,lan8720";
//                      device_type = "ethernet-phy";
//                      reg = <0>;
//              };

                ethphy1: ethernet-phy@1 {
//                      compatible = "smsc,lan8720";
//                      device_type = "ethernet-phy";
                        reg = <1>;
                };
        };
};

Then, the startup sequence looks like what I would expect:

1) FEC2 gets probed, and its clocks initialized.
2) MDIO bus gets initialized and probed, but looks only for PHY1.
3) FEC1 gets probed, and its clocks initialized.
4) MDIO bus looks for PHY0.

The new problem is that FEC1 is unable to use the MDIO bus. Previously the read ID of (inactive) PHY0 was 0xffffffff (default state of MDIO), now it is 0x00000000 (no access to MDIO).

If I try to define MDIO pins in pinctrl-0 of FEC1, kernel detects a conflict with the already assigned MDIO pins of FEC2.

Best regards,
Adam

igorpadykov · ‎08-28-2020

Hi Adam

Which BSP is used in this case? Could you try the latest from the NXP linux-imx repository (linux-imx - i.MX Linux kernel):

source.codeaurora.org/external/imx/linux-imx

Best regards
igor

a_r_f_ · ‎08-28-2020

Hi Igor,

my starting point was
GitHub - SoMLabs/somlabs-linux-imx at imx_4.19.35_1.1.0

because I've used a module from SomLabs.

I've made a comparison against
linux-imx - i.MX Linux kernel
and it seems that at least the key files (like fec_main.c) are identical.

At this point I don't want to use a newer kernel.

The current problem is "how to make FEC1 reuse MDIO bus from FEC2".

Best regards,
Adam

igorpadykov · ‎08-28-2020

Hi Adam

Formally, this board and software are not supported by NXP. It may be recommended to proceed with the help of NXP Professional Services:

Professional Services Software Technologies Form | NXP Semiconductors

Best regards
igor

igorpadykov · ‎08-21-2020

Hi Adam

>Is there a known solution for that?

No, sorry.

Just for test one can try to increase POR up to 1-2 sec.

Best regards
igor

a_r_f_ · ‎08-21-2020

Hi Igor,

timing is not the problem here, the order of actions is. I guess that u-boot could activate both FECs before Linux tries to use them. But this is so wack, having to rely on which bootloader somebody prepares and how it gets configured. There has to be an elegant way.

Best regards,
Adam

2548903578 · ‎07-08-2021

Hi, a_r_f_

I ran into the same problem as you, and I don’t know if you solved it.

igorpadykov · ‎08-21-2020

Hi Adam

The POR suggestion was not about "timing"; it could help to understand whether the issue is related to instabilities of power supplies and clocks when the first (faulty) access after power-up happens. Another test could be to heat or cool the problematic chip.

Best regards
igor

a_r_f_ · ‎08-21-2020

Hi Igor,

ok, I see your point. But if I configure only FEC1 with its PHY and move MDIO under FEC1, then it runs perfectly (that was my first configuration after bringup, so it's "well tested"). I think that the problem is not related to power or hardware as such, and fixing the order of initializations in Linux would solve it.

Best regards,
Adam

igorpadykov · ‎08-21-2020

Hi Adam

If a reboot without removing power helps, this definitely indicates that the problem is related to power or hardware.

Best regards
igor

a_r_f_ · ‎08-28-2020

Hi Igor,

no, this is a software problem. A hardware problem would be less predictable, or independent of the software. Here the hardware works surprisingly well, and the behaviour depends on the system configuration, always in the same way for a given configuration.

Best regards,
Adam

Hienn · ‎03-29-2023

Hi a_r_f,

Did you resolve your problem?
