Coldfire DemoJM USB invalid endpoint

nmay · 08-05-2013

I've got a modified DEMOJM Coldfire board with a 51JM128 processor using the Freescale USB Stack 4.0.3. In host mode, the code intermittently fails when writing to an attached USB flash drive, sometimes taking more than a couple of hours to fail. When the failure occurs, the host sends an invalid endpoint in a DATA OUT packet header, then follows with another DATA OUT packet header with the correct endpoint. The CRC on the header with the invalid endpoint is valid, so it looks like the endpoint value was intentional. But there isn't any data transferred, just the setup packet. Afterwards the USB flash drive (mostly) starts returning a NAK for every data output packet, and after the timeout period the f_write operation fails.

I will point out that on some flash drives, I successfully ran for 16+ hours. But I need this to work on a variety of flash drives.

How do I see how the invalid endpoint is being generated in the firmware? It looks like the Coldfire USB hardware uses the BDT and DMA to send the USB values, so how do I trigger on the actual error to see why the value is invalid?

I have attached the logic analyzer waveform and an Excel spreadsheet with the transactions. Row 22355 has the invalid endpoint. Again, this shows only the DATA OUT header with the address and the invalid endpoint, followed by another DATA OUT with the correct endpoint.

Thanks,

--Norm

nmay · 08-06-2013

I've made some progress, in that I did find that writing to the TOKEN register is what starts the USB transaction. The register contains the PID and endpoint values, then the DMA engine appears to read the address and complete the transaction. And this register is set in khci.c (_usb_khci_atom_tr). It appears that the switch statement for type should take the TR_OUT case handler. So I added an assert call here that will trigger when the endpoint is > 2. However when it failed next, the assert did not trigger! So either another place is writing this register on purpose, or there might be a memory corruption going on.
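A minimal sketch of that kind of check, for illustration only: every TOKEN write is routed through one helper that rejects and records unexpected endpoints. The names token_write_checked, fake_token_reg and MAX_EXPECTED_EP are hypothetical and not part of the Freescale stack. The bit layout (PID in the upper nibble, endpoint in the lower nibble) is an assumption to verify against the reference manual, and the register is replaced by a plain variable so the sketch builds on a host.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumption: TOKEN holds the PID in bits 7..4 and the endpoint in bits 3..0. */
#define MAX_EXPECTED_EP 2u   /* hypothetical: highest endpoint this design uses */

/* Stand-in for the memory-mapped TOKEN register (hypothetical, host build only). */
static volatile uint8_t fake_token_reg;

/* Evidence kept for inspection in the debugger after a rejected write. */
static unsigned bad_token_count;
static uint8_t  last_bad_token;

/* Write the TOKEN value only if the endpoint is in the expected range.
 * Returns true when the write was performed, false when it was rejected. */
bool token_write_checked(uint8_t pid, uint8_t ep)
{
    uint8_t value = (uint8_t)(((pid & 0x0Fu) << 4) | (ep & 0x0Fu));

    if (ep > MAX_EXPECTED_EP) {
        bad_token_count++;        /* count the event */
        last_bad_token = value;   /* remember what would have been sent */
        return false;             /* place a breakpoint or assert here */
    }

    fake_token_reg = value;       /* on target: the real TOKEN register */
    return true;
}
```

If a check like this never fires while the bus still shows a bad endpoint, that would point to a write that bypasses the helper, which is the situation described above.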

Can I set a hardware breakpoint on writing to the TOKEN control register? Or writing to any specific memory location?

And if I can set this breakpoint, will the target run at full speed? I tried some tracing for debugging and the target runs very slowly. Totally unusable for debugging.

Thanks,

--Norm

TomE · 08-06-2013

> Can I set a hardware breakpoint on writing to the TOKEN control register? Or writing to any specific memory location?

That chip has a very capable internal debugging unit. Read through the Debug chapter. It has four hardware program-counter breakpoints and two additional address comparison registers.

Now you just have to find how to convince your IDE to use those resources to let you set a "data breakpoint" or "watchpoint". "Tracing" is a lowest common denominator debugging method and not what you want.

It would be very disappointing if the debugger doesn't support the chip features, but it is possible.

If it won't let you set hardware breakpoints, then you should be able to run without the debugger connected and program the internal hardware with WDEBUG instructions from your code to create a debug interrupt when a memory location is accessed/written. Getting that working might take you a while.

From your description (taking hours to trigger) I'd suspect a simple (and stupid) "interrupt versus mainline" hazard in the code somewhere. Some piece of mainline USB code is probably messing with a data structure or some USB registers without disabling CPU interrupts around a critical section, and if the interrupt comes in at exactly the wrong time ...

I'd suggest reading through all the USB code and looking for anything like this. You could start with disabling interrupts around large slabs of the USB code and see if the problem goes away.
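As a rough sketch of what "disabling interrupts around a critical section" looks like, assuming hypothetical hooks usb_enter_critical and usb_exit_critical: on the target these would save the status register and raise the interrupt mask, then restore it. Here they only track nesting depth so the example builds on a host. The pending_tr_t structure is invented for illustration and is not a structure from the USB stack.

```c
#include <stdint.h>

/* Hypothetical port hooks. On the target these would mask and restore
 * CPU interrupts; in this host sketch they only count nesting depth. */
static int irq_disabled_depth;

static inline void usb_enter_critical(void) { irq_disabled_depth++; }
static inline void usb_exit_critical(void)  { irq_disabled_depth--; }

/* Invented example of state shared between mainline code and the USB ISR. */
typedef struct {
    uint8_t  pid;
    uint8_t  endpoint;
    uint16_t length;
} pending_tr_t;

static volatile pending_tr_t pending;

/* Update all fields as one unit so the ISR never sees a half-written
 * transfer (for example a new PID paired with a stale endpoint). */
void queue_transfer(uint8_t pid, uint8_t ep, uint16_t len)
{
    usb_enter_critical();
    pending.pid      = pid;
    pending.endpoint = ep;
    pending.length   = len;
    usb_exit_critical();
}
```

Wrapping large slabs first and then narrowing the protected region is a quick way to find which section matters, at the cost of longer interrupt latency while testing.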

I'd also suggest trying to get it to fail more often to make your testing easier. I'd program a spare timer to interrupt at a very high and somewhat "random" rate (at least 50kHz) to try and make USB interrupts happen at different times during the mainline.
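One way to get a "somewhat random" rate is to reload the timer with a new pseudo-random period on every interrupt. The sketch below uses a 16-bit Galois LFSR for that. The 24 MHz timer clock and the resulting tick counts are assumptions, and next_period_ticks is a hypothetical helper; the timer ISR would write its return value to the timer's modulo or compare register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: timer clocked at 24 MHz, so 480 ticks = 50 kHz, 240 ticks = 100 kHz. */
#define PERIOD_MIN_TICKS 240u
#define PERIOD_MAX_TICKS 480u

/* Advance a 16-bit Galois LFSR (taps 16, 14, 13, 11) and map the new state
 * to a timer period in [PERIOD_MIN_TICKS, PERIOD_MAX_TICKS].
 * The state must be seeded with a non-zero value and never becomes zero. */
uint16_t next_period_ticks(uint16_t *state)
{
    uint16_t s = *state;

    assert(s != 0);               /* an all-zero LFSR would stay stuck at zero */

    uint16_t lsb = s & 1u;
    s >>= 1;
    if (lsb) {
        s ^= 0xB400u;             /* feedback polynomial */
    }
    *state = s;

    return (uint16_t)(PERIOD_MIN_TICKS +
                      (s % (PERIOD_MAX_TICKS - PERIOD_MIN_TICKS + 1u)));
}
```

The interrupt rate then wanders between roughly 50 kHz and 100 kHz under the assumed clock, so the extra interrupts do not stay phase-locked to the USB traffic.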

Tom
