Hi,
I have a data corruption (or packet loss) problem with the MPC8347 USB Device Controller.
I am using our 8347 custom board as USB Device and a laptop as USB Host.
My test application sends data from USB Device to USB Host.
MPC8347 (Device) -----> Laptop (Host)
It sends 1448 bytes of data to the host in the following structure.
unsigned int buf[362];
buf[0] = 1; // sequence number
buf[1] = 1440; // data length in bytes
buf[2] = 0xDEADBEEF;
buf[3] = 0xDEADBEEF;
…
buf[361] = 0xDEADBEEF;
There is a 2 ms delay after the write, and then it sends another 1448 bytes of data in the same structure with the sequence number incremented by 1.
The device keeps sending data and the host receives and verifies the data.
It works quite well for about 10 to 20 minutes, and then the problem occurs.
The problem is that the host expects to receive 0xDEADBEEF, but it receives wrong data. In fact, it receives the sequence number of the next data packet.
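For readers who want to reproduce the check, here is a minimal sketch of how the packet layout above could be built and verified. The function names (build_packet, verify_packet) and constants are hypothetical and not taken from the actual test application. It also ignores byte order, which a real host-side checker would need to handle because the MPC8347 is big-endian and a typical laptop is little-endian.

```c
#include <stddef.h>
#include <stdint.h>

#define PKT_WORDS   362u         /* 1448 bytes / 4 bytes per word */
#define PAYLOAD_LEN 1440u        /* payload length in bytes, stored in buf[1] */
#define FILL_WORD   0xDEADBEEFu  /* test pattern in buf[2]..buf[361] */

/* Hypothetical helper: fill one packet as described in the post. */
static void build_packet(uint32_t *buf, uint32_t seq)
{
    buf[0] = seq;
    buf[1] = PAYLOAD_LEN;
    for (size_t i = 2; i < PKT_WORDS; i++)
        buf[i] = FILL_WORD;
}

/*
 * Hypothetical helper: check one received packet.
 * Returns -1 if the packet is intact, otherwise the index of the
 * first word that does not match what was expected.
 */
static int verify_packet(const uint32_t *buf, uint32_t expected_seq)
{
    if (buf[0] != expected_seq)
        return 0;
    if (buf[1] != PAYLOAD_LEN)
        return 1;
    for (size_t i = 2; i < PKT_WORDS; i++)
        if (buf[i] != FILL_WORD)
            return (int)i;
    return -1;
}
```

With a checker like this, the symptom described above would show up as a non-negative return value, with the word at that index holding the next packet's sequence number instead of 0xDEADBEEF.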
It looks like there is packet loss in DR (Dual-Role) UDC.
Would this be a known issue with DR UDC? Could you confirm that it is caused by USB-A001 in MPC8349E Chip Errata?
The MPC8347 board is running Linux 3.2. My laptop is running Ubuntu 12.04.
Here is the configuration of the DR UDC in my kernel.
usb@23000 {
compatible = "fsl-usb2-dr";
reg = <0x23000 0x1000>; // OK, USB DR starts at 0x23000
#address-cells = <1>;
#size-cells = <0>;
interrupt-parent = <&ipic>;
interrupts = <38 0x8>; // OK, IRQ ID no = 38 for USB-DR
dr_mode = "peripheral";
phy_type = "ulpi";
};
Here is the output of lsusb -v on my laptop. It shows the information reported by the MPC8347 gadget driver.
Bus 001 Device 110: ID xxxx:yyyy ABCD
Device Descriptor:
bLength 18
bDescriptorType 1
bcdUSB 2.00
bDeviceClass 0 (Defined at Interface level)
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 64
idVendor 0xxxxx ABCD
idProduct 0xyyyy
bcdDevice 2.19
iManufacturer 1 ABCD
iProduct 2 ABCD
iSerial 3 ABCDABCD
bNumConfigurations 1
Configuration Descriptor:
bLength 9
bDescriptorType 2
wTotalLength 32
bNumInterfaces 1
bConfigurationValue 1
iConfiguration 0
bmAttributes 0xc0
Self Powered
MaxPower 2mA
Interface Descriptor:
bLength 9
bDescriptorType 4
bInterfaceNumber 0
bAlternateSetting 0
bNumEndpoints 2
bInterfaceClass 7 Printer
bInterfaceSubClass 1 Printer
bInterfaceProtocol 2 Bidirectional
iInterface 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x81 EP 1 IN
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Endpoint Descriptor:
bLength 7
bDescriptorType 5
bEndpointAddress 0x01 EP 1 OUT
bmAttributes 2
Transfer Type Bulk
Synch Type None
Usage Type Data
wMaxPacketSize 0x0200 1x 512 bytes
bInterval 0
Device Qualifier (for other device speed):
bLength 10
bDescriptorType 6
bcdUSB 2.00
bDeviceClass 7 Printer
bDeviceSubClass 0
bDeviceProtocol 0
bMaxPacketSize0 64
bNumConfigurations 1
Device Status: 0x0001
Self Powered
Best regards,
Chan
Hello Park,
Please check whether the following errata would be helpful for you.
USB31: Transmit data loss based on bus latency
Description: Devices: MPC8349E, MPC8347E, MPC8343E
When acting as a Device, after receiving a Token IN, the USB controller will reply with a data
packet. If the bus memory access is not fast enough to backfill the TX fifo, it will cause an
under-run. In this situation a CRC error will be introduced in the packet and the Host will ignore
it. However, when an underrun happens, the TX fifo will get a flush command. This situation
may cause an inconsistence in the TX fifo controls, leading to a possible data loss (a complete
packet or sections of a packet can be never transmitted). This situation may also happen if the
software issues a TX flush command.
Impact: When the USB controller is configured as a device, it can not be used in the stream mode due
to this erratum. Therefore, the USB external bus utilization is decreased.
Workaround: A valid software workaround is to disable the stream mode by setting USBMODE[SDIS] bit.
This can avoid the issue at the expense of decreased USB external bus utilization.
Fix plan: No plans to fix
Have a great day,
Yiping
Hi Yiping,
Thanks for sharing the information.
Actually, I have already tested STREAM_DISABLE and found that the problem still occurs even with Stream Mode disabled.
Here are the code changes I applied to the FSL UDC core driver (Linux 3.2) to disable Stream Mode.
diff --git a/drivers/usb/gadget/fsl_udc_core.c b/drivers/usb/gadget/fsl_udc_core.c
index 01de16e..1a3e3d3 100644
--- a/drivers/usb/gadget/fsl_udc_core.c
+++ b/drivers/usb/gadget/fsl_udc_core.c
@@ -295,6 +295,7 @@ static int dr_controller_setup(struct fsl_udc *udc)
tmp = fsl_readl(&dr_regs->usbmode);
tmp &= ~USB_MODE_CTRL_MODE_MASK; /* clear mode bits */
tmp |= USB_MODE_CTRL_MODE_DEVICE;
+ tmp |= USB_MODE_STREAM_DISABLE;
/* Disable Setup Lockout */
tmp |= USB_MODE_SETUP_LOCK_OFF;
if (udc->pdata->es)
@@ -363,6 +364,7 @@ static void dr_controller_run(struct fsl_udc *udc)
/* Set the controller as device mode */
temp = fsl_readl(&dr_regs->usbmode);
temp |= USB_MODE_CTRL_MODE_DEVICE;
+ temp |= USB_MODE_STREAM_DISABLE;
fsl_writel(temp, &dr_regs->usbmode);
/* Set controller to Run */
This problem appears to be caused by USB-A001. I think so because if I remove the 2 ms delay between the 1448-byte transfers, the problem does not occur for more than 15 hours.
With the delay between transfers, the problem appears after anywhere from 10 minutes to 5 hours of testing (that is, it happens randomly).
I would really appreciate it if you could shed some light on this urgent issue, as the problem is quite critical to our product.
USB-A001: Last read of the current dTD done after USB interrupt
Description: Devices: MPC8349E, MPC8347E, MPC8343E
After executing a dTD, the device controller executes a final read of the dTD terminate bit. This
is done in order to verify if another dTD has been added to the linked list by software right at
the last moment.
It was found that the last read of the current dTD is being performed after the interrupt was
issued. This causes a potential race condition between this final dTD read and the interrupt
handling routine servicing the interrupt on complete which may result in the software freeing
the data structure memory location, prior to the last dTD read being completed. This issue is
only applied to a USB device controller.
Best regards,
Chan
How do you build your system? Is your USB device a storage device or printer?
Hi lunminliang,
It is a printer device.
drivers/usb/gadget/printer.c
Thanks,
Chan
1. Can you use a USB protocol analyzer to check whether the data has been sent out from the USB PHY of the 834x to the USB host?
2. If step 1 confirms that data is missing, I would like to check the data packets sent out from the USB controller on the ULPI interface to isolate the problem.
Hi lunminliang,
Based on the following description, I have added a 10 us delay in the interrupt handling routine (tx_complete()), and the problem appears to go away. Do you think this can be a valid workaround? If yes, would the 10 us delay be enough?
USB-A001: Last read of the current dTD done after USB interrupt
Description: Devices: MPC8349E, MPC8347E, MPC8343E
After executing a dTD, the device controller executes a final read of the dTD terminate bit. This
is done in order to verify if another dTD has been added to the linked list by software right at
the last moment.
It was found that the last read of the current dTD is being performed after the interrupt was
issued. This causes a potential race condition between this final dTD read and the interrupt
handling routine servicing the interrupt on complete which may result in the software freeing
the data structure memory location, prior to the last dTD read being completed. This issue is
only applied to a USB device controller.
Thanks
Chan Park
See apps comment:
If it works, it could be a solution for your system. If it really is the USB-A001 issue, a 10 us delay should be enough. I assume that the platform clock is > 100 MHz.
The following change solved the issue with USB-A001.
As far as I know, this is a common problem with Chipidea IP. Newer Chipidea IP should not have this issue, though.
I am glad there is a workaround for this problem.
I managed to reproduce the problem again and recorded the traffic using an Ellisys USB Analyzer.
Here is the screenshot showing the packet loss problem.