Hardware / Software Summary:
Our application required more USB Host ports than were implemented by the i.MX7 SABRE board, so the Microchip USB3503 Hub was included to add the two additional ports that we needed. The guidelines for hardware design for both the i.MX7D and the USB3503 were taken into account and the critical routing of the DATA and STROBE signals was implemented to be less than 1" with their length matched to within just a little more than 5 mils. Most boards work correctly 99% of the time, but once in a while I have seen a system fail to enumerate the USB Hub. In these cases, a reboot (equivalent to a cold boot because the PMIC is forced off then turned back on again) will result in the USB Hub enumerating correctly.
This was true until I discovered two boards that fail 99.9% of the time on a cold boot / reboot. These boards finally give me a reproducible case for finding the root cause. First, the high-level symptom in the serial console output is:
ci_hdrc ci_hdrc.2: EHCI Host Controller
ci_hdrc ci_hdrc.2: new USB bus registered, assigned bus number 2
ci_hdrc ci_hdrc.2: USB 2.0 started, EHCI 1.00
usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: EHCI Host Controller
usb usb2: Manufacturer: Linux 4.1.15+ ehci_hcd
usb usb2: SerialNumber: ci_hdrc.2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 1 port detected
usb 2-1: new high-speed USB device number 2 using ci_hdrc
usb 2-1: device no response, device descriptor read/64, error -71
usb 2-1: device no response, device descriptor read/64, error -71
usb 2-1: new high-speed USB device number 3 using ci_hdrc
usb 2-1: device no response, device descriptor read/64, error -71.......
A normal system with a USB Flash drive on a USB Hub downstream port looks like this:
ci_hdrc ci_hdrc.2: EHCI Host Controller
ci_hdrc ci_hdrc.2: new USB bus registered, assigned bus number 2
ci_hdrc ci_hdrc.2: USB 2.0 started, EHCI 1.00
usb usb2: New USB device found, idVendor=1d6b, idProduct=0002
usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: Product: EHCI Host Controller
usb usb2: Manufacturer: Linux 4.1.15+ ehci_hcd
usb usb2: SerialNumber: ci_hdrc.2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 1 port detected
usb 2-1: new high-speed USB device number 2 using ci_hdrc
usb 2-1: New USB device found, idVendor=0424, idProduct=3503
usb 2-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
hub 2-1:1.0: USB hub found
hub 2-1:1.0: 2 ports detected
usb 2-1.2: new high-speed USB device number 3 using ci_hdrc
usb 2-1.2: New USB device found, idVendor=0781, idProduct=5595
usb 2-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 2-1.2: Product: Ultra USB 3.0
usb 2-1.2: Manufacturer: SanDisk
usb 2-1.2: SerialNumber: 4C531001641115101474
usb-storage 2-1.2:1.0: USB Mass Storage device detected
.......
Over the last 2 to 3 weeks I have taken the following steps to isolate the cause of the problem:
ci_hdrc ci_hdrc.2: detected XactErr len 0/8 retry 1
Looking deeper into ehci_hcd.c and adding some visibility into the EHCI register read / write operations, I found that on a "bad board" the UEI bit of the USB2_USBSTS register is set, indicating that a USB error interrupt has been detected. The following debug output (non-standard code added by me) shows the first few register read / write operations when the driver tries to enumerate the USB3503 Hub.
On a "bad board":
usb 2-1: new high-speed USB device number 2 using ci_hdrc
ehci_readl: 0xf5b30144 = 0x00000080
ehci_writel: 0xf5b30140 = 0x00010b25
ehci_readl: 0xf5b30140 = 0x00010b25
ehci_readl: 0xf5b30144 = 0x00008082 <<-- read of USB2_USBSTS with UEI bit set
ehci_writel: 0xf5b30144 = 0x00000002 <<-- write of USB2_USBSTS to clear UEI bit
ehci_readl: 0xf5b30140 = 0x00010b25
ci_hdrc ci_hdrc.2: detected XactErr len 0/8 retry 1
ehci_readl: 0xf5b30144 = 0x00008082
ehci_writel: 0xf5b30144 = 0x00000002
ehci_readl: 0xf5b30140 = 0x00010b25
On a normally good board or during a resume from suspend:
usb 2-1: new high-speed USB device number 6 using ci_hdrc
ehci_readl: 0xf5b30144 = 0x00000080
ehci_writel: 0xf5b30140 = 0x00010b25
ehci_readl: 0xf5b30140 = 0x00010b25
ehci_readl: 0xf5b30144 = 0x00048081
ehci_writel: 0xf5b30144 = 0x00000001
ehci_readl: 0xf5b30140 = 0x00010b25
usb 2-1: usb_start_wait_urb length=18, retval=0
ehci_readl: 0xf5b30184 = 0x0a001205
ehci_writel: 0xf5b30184 = 0x0a001301
ehci_readl: 0xf5b30140 = 0x00010b25
usb usb2: usb_start_wait_urb length=0, retval=0
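The two USB2_USBSTS values in the traces above can be compared bit by bit. The following is a minimal sketch of a decoder, assuming the standard EHCI 1.0 layout for bits 0-5 and 12-15 of USBSTS. The helper names (usbsts_has_error, usbsts_decode) are hypothetical and are not part of the kernel. Bits outside the standard set, such as bit 7 in these traces, are controller-specific and are returned undecoded rather than guessed at.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* USBSTS bit positions as defined by the EHCI 1.0 specification. */
#define USBSTS_USBINT    (1u << 0)   /* transaction completed (UI) */
#define USBSTS_USBERRINT (1u << 1)   /* transaction error (UEI) */
#define USBSTS_PCD       (1u << 2)   /* port change detect */
#define USBSTS_FLR       (1u << 3)   /* frame list rollover */
#define USBSTS_HSE       (1u << 4)   /* host system error */
#define USBSTS_IAA       (1u << 5)   /* interrupt on async advance */
#define USBSTS_HCHALTED  (1u << 12)  /* host controller halted */
#define USBSTS_RECL      (1u << 13)  /* reclamation */
#define USBSTS_PSS       (1u << 14)  /* periodic schedule status */
#define USBSTS_ASS       (1u << 15)  /* asynchronous schedule status */

struct bitname {
    uint32_t mask;
    const char *name;
};

static const struct bitname usbsts_bits[] = {
    { USBSTS_USBINT,    "USBINT"    },
    { USBSTS_USBERRINT, "USBERRINT" },
    { USBSTS_PCD,       "PCD"       },
    { USBSTS_FLR,       "FLR"       },
    { USBSTS_HSE,       "HSE"       },
    { USBSTS_IAA,       "IAA"       },
    { USBSTS_HCHALTED,  "HCHALTED"  },
    { USBSTS_RECL,      "RECL"      },
    { USBSTS_PSS,       "PSS"       },
    { USBSTS_ASS,       "ASS"       },
};

/* Returns nonzero if the error interrupt bit (UEI) is set. */
static int usbsts_has_error(uint32_t value)
{
    return (value & USBSTS_USBERRINT) != 0;
}

/*
 * Writes the space-separated names of the standard bits set in value
 * into buf. Returns the bits that were NOT decoded, i.e. reserved or
 * controller-specific bits outside the EHCI 1.0 definition.
 */
static uint32_t usbsts_decode(uint32_t value, char *buf, size_t len)
{
    uint32_t rest = value;

    assert(buf != NULL && len > 0);
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(usbsts_bits) / sizeof(usbsts_bits[0]); i++) {
        if (value & usbsts_bits[i].mask) {
            if (buf[0] != '\0')
                strncat(buf, " ", len - strlen(buf) - 1);
            strncat(buf, usbsts_bits[i].name, len - strlen(buf) - 1);
            rest &= ~usbsts_bits[i].mask;
        }
    }
    return rest;
}
```

With the values from the traces, 0x00008082 decodes to "USBERRINT ASS" with 0x80 left over, and 0x00048081 decodes to "USBINT ASS" with 0x40080 left over. In the standard bits, the only difference between the bad and good boards is therefore an error completion versus a normal completion of the first transfer. Since these status bits are write-one-to-clear, the subsequent writes of 0x00000002 and 0x00000001 acknowledge exactly the bit that was reported.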
At this point, I've run out of places to look for new information, so I'm stuck and need someone's deep insight into what can trigger this error, and particularly why it only happens on a cold boot / reboot. What is different for a suspend / resume than what is done on a cold boot / reboot? Why does the interface to the USB3503 work flawlessly if you can get past the initial enumeration, but that part is variable from board to board and between boot cycles?
Thanks,
Bill Gessaman
Hello!
I am currently designing an MX7D based system as well, needing 3 USB ports +1 for config and one for an RF module. It seemed to be a good solution to use a 3 port HSIC HUB and the two OTG ports.
When looking for a 3 Port+HSIC HUB I ran into the USB3503 as well and thought it'd be the perfect device for my needs.
However, having found this thread, I see that I should expect some issues that were not resolved despite the effort you put in. So there are three options for me:
The USB3503 is exactly what I was looking for and using HSIC is definitely my preferred approach here, but I don't want to hunt down issues for multiple days or weeks, of course.
2) Yes, we do have customers using this interface.
I wonder: What do these customers use this interface for? Any Hub? What device would that be?
Does the HSIC interface enumerate during power-up only? Or may a reset of the attached device trigger a new enumeration as well?
Hi Bill Gessaman,
Thank you very much for your prompt and detailed response.
Regards,
Gopinath S
Setting of the USB_OTG2_USBSTS[UEI] bit indicates that the last USB transaction resulted in an error for some reason.
So that I can study the case carefully, please provide the complete schematic of your board.
Also, please specify some more details on the issue.
1) You mentioned that you tried both L4.1.15-1.0.0-GA and L4.1.15-2.0.0-GA BSPs. Is there any difference in the behaviour between these BSPs?
2) When the issue occurs (as far as I understand, it occurs at power-on boot only), is any USB Device connected to the Hub's downstream port(s)?
3) If the answer to 2) is "yes", does any issue occur if there is no device connected at the boot time?
4) If the answer to 2) is "yes", does the occurrence of the issue depend on which USB device(s) are connected?
Have a great day,
Artur
Hi Artur,
I appreciate your interest in this issue. I'm reluctant to share the "complete" schematic on a public forum and I think it would be necessary for me to leave out a couple of details that might compromise our product security. I will work on this and follow up. I think the board layout is as critical as the schematic, so an image (or images) of the i.MX7 to USB3503 layout may also be of interest.
Answers to your questions:
1) The observed behavior was the same with the two BSPs that I listed above. I didn't see any Git repo commits that made me think that the 2.0.0 release had significant changes in this area, but I thought it was worth trying anyway.
2) You are correct that it happens only at power-on boot. Technically it also happens when a Linux "reboot" command is executed, because our system turns a reboot into a power off / power on sequence by turning off the PMIC for approximately 200ms and then turning it back on. The enumeration error with the external USB3503 Hub does not seem to be affected by whether a USB device is connected to a downstream port on the USB3503 Hub. I have tested with and without a USB flash drive on one of the downstream ports.
3) Yes - the error happens when no device is connected at boot time.
4) I think you are asking whether the error is dependent on the type of USB device connected to the hub. I have tested with several flash drives as well as our custom measurement modules which are effectively USB devices and use a downstream port on the Hub to connect into the system.
Thanks,
Bill
Hello,
I know this thread is very old, but I finally got USB running today with kernel 5.10.
Basically, our issue was the following:
In the device tree, we specified the reference clock. The code actually seems to stop and restart the clock with clk_prepare_enable(). Therefore, we added a 4 ms sleep afterwards.
Another thing is that we needed to initialize the reset GPIO as low to bring the chip through a proper reset after restarting the clock.
I believe that people who have a crystal or another "stable" clock input do not face any issue. The problem appears only if a clock output of the CPU is used.
This is the patch I needed:
diff --git a/drivers/usb/misc/usb3503.c b/drivers/usb/misc/usb3503.c
index 48099c6bf04c..a003a71c8e11 100644
--- a/drivers/usb/misc/usb3503.c
+++ b/drivers/usb/misc/usb3503.c
@@ -217,6 +217,9 @@ static int usb3503_probe(struct usb3503 *hub)
return err;
}
+ /* Wait T_HUBINIT == 4ms for hub logic to stabilize */
+ usleep_range(4000, 10000);
+
property = of_get_property(np, "disabled-ports", &len);
if (property && (len / sizeof(u32)) > 0) {
int i;
@@ -247,7 +250,7 @@ static int usb3503_probe(struct usb3503 *hub)
if (hub->connect)
gpiod_set_consumer_name(hub->connect, "usb3503 connect");
- hub->reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_HIGH);
+ hub->reset = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
if (IS_ERR(hub->reset))
return PTR_ERR(hub->reset);
if (hub->reset) {