Hi,
We have a custom i.MX6solo-based board running Linux 3.10.17. One of our serial ports is connected to an external device and runs at 921600 baud with hardware (RTS/CTS) flow control. Most of the time it works fine, and I can see (by using ioctl to read the modem control bits) that CTS is occasionally deasserted to throttle the imx6's transmission, then reasserted, and the flow quickly resumes as the other end keeps going.
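For anyone who wants to watch the same bits, here is a minimal sketch of that kind of check. It is not our application code, and the function names are made up for illustration; it only assumes the standard TIOCMGET ioctl on an already-open serial fd.

```c
#include <stdio.h>
#include <sys/ioctl.h>

/* Read the modem-control bits of an open serial fd.
 * Returns 0 on success, -1 on error (errno is set by ioctl). */
int read_modem_bits(int fd, int *bits)
{
    return ioctl(fd, TIOCMGET, bits) == 0 ? 0 : -1;
}

/* Format the two flow-control bits into buf, e.g. "CTS=1 RTS=0".
 * (Hypothetical helper, for logging only.) */
int format_flow_bits(int bits, char *buf, size_t len)
{
    return snprintf(buf, len, "CTS=%d RTS=%d",
                    (bits & TIOCM_CTS) ? 1 : 0,
                    (bits & TIOCM_RTS) ? 1 : 0);
}
```

Calling read_modem_bits() periodically and logging the formatted result is enough to see CTS toggle while the link is healthy.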
However, after a while it eventually gets stuck. From the application's point of view, the tx buffer fills up and never drains, even though CTS is asserted (the other end is saying it can receive). We continue to receive data: the other end is sending, and a different thread can read the same file descriptor and get the data. Only the sending is stopped.
I have reduced it to a fairly simple program that reproduces it within 10-20 seconds and shows the same behavior as the application. The test program just (1) starts a thread that reads and discards incoming serial data, then (2) in a different thread, sends messages as fast as possible. It runs for 10-20 seconds, then the writes start blocking and stay blocked. The reads continue to work. If I kill and restart the program, writes work as before for another 10-20 seconds, then it gets stuck again. I assume closing and reopening the fd is what fixes it, but I have not tried doing just that.
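A rough sketch of the pieces of such a test program is below. It is an approximation, not the exact program: the function names are hypothetical, and unlike the blocking writes described above it assumes the fd was opened with O_NONBLOCK and uses poll() with a timeout so that a stall is reported instead of hanging forever.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdint.h>
#include <termios.h>
#include <unistd.h>

/* Put an open tty into raw 921600 8N1 mode with RTS/CTS flow control.
 * Returns 0 on success, -1 on error. */
int setup_port(int fd)
{
    struct termios tio;
    if (tcgetattr(fd, &tio) != 0)
        return -1;
    cfmakeraw(&tio);
    tio.c_cflag |= CRTSCTS | CLOCAL | CREAD;
    cfsetispeed(&tio, B921600);
    cfsetospeed(&tio, B921600);
    return tcsetattr(fd, TCSANOW, &tio);
}

/* Thread body: read and discard everything that arrives.
 * arg carries the fd, passed as (void *)(intptr_t)fd. */
void *reader_thread(void *arg)
{
    int fd = (int)(intptr_t)arg;
    char buf[4096];
    struct pollfd p = { .fd = fd, .events = POLLIN };
    for (;;) {
        if (poll(&p, 1, -1) < 0 && errno != EINTR)
            break;
        ssize_t n = read(fd, buf, sizeof buf);
        if (n < 0 && errno != EAGAIN && errno != EINTR)
            break;
    }
    return NULL;
}

/* Wait up to timeout_ms for the fd to become writable, then write.
 * Returns bytes written (may be short), 0 if the fd stayed
 * unwritable for the whole timeout (a stall), or -1 on error. */
ssize_t write_or_stall(int fd, const void *buf, size_t len, int timeout_ms)
{
    struct pollfd p = { .fd = fd, .events = POLLOUT };
    int r;
    do {
        r = poll(&p, 1, timeout_ms);
    } while (r < 0 && errno == EINTR);
    if (r <= 0)
        return r;
    return write(fd, buf, len);
}
```

To use it, open the port with O_RDWR | O_NOCTTY | O_NONBLOCK, call setup_port(), start reader_thread() with pthread_create(), and then call write_or_stall() in a tight loop with a timeout of a second or two. A return value of 0 marks the moment the transmit side has stopped draining.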
We recently switched from 115k to 921k, and it seems to happen more often. I can't say for sure whether it happened at 115k, but I suspect it could have, given time or a suitable temporary backup in the other device.
I think the problem is on the imx6 side rather than the other end, since when it is stuck, I can look at the modem bits and see that the other end is ready for data.
I looked at the commit history from 3.10.17 to present (fido?) and did not see anything obvious in the serial driver relevant to this, but I'm not sure about that.
Has anyone seen anything like this or, better, does anyone know of a patch to fix hardware handshaking in the serial driver?
Thanks!
Hi John,
Have you verified the impedance of UART traces? Are they suitable for the communication speeds?
Additionally, you could add an external pull-up on the CTS pin and verify if the issue still happens.
Hope this will be useful for you.
Best regards!
/Carlos
Thanks for the ideas Carlos.
We temporarily backed off to 115k to keep moving forward, but need to figure out what's going on soon.
I have not scoped the UART traces, nor checked to see that they are matched in length. There is no room for them to differ much in length, especially relative to 921k. I have long been intending to put a scope on them (gotta open it up / take it apart), but the fact that I don't see any data errors at 921k (everything is CRCed), and just see the hang eventually, indicates that the bits are getting through okay. The pull-up is also a good idea, but I'm pretty sure both ends are driving all their lines (will double-check that). The final state - imx6 not transmitting, fixed by port close/reopen - just really looks like a lost handshaking interrupt.
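One way the lost-interrupt idea might be checked from user space is sketched below. This is only a sketch under assumptions: it uses the standard TIOCGICOUNT ioctl, and it assumes the driver counts CTS changes in its handshake interrupt while TIOCMGET reflects the current pin level, which I have not verified for the imx driver in 3.10.17. The function names are hypothetical.

```c
#include <sys/ioctl.h>
#include <linux/serial.h>

/* Fetch the driver's cumulative count of CTS changes.
 * Returns 0 on success, -1 on error. */
int read_cts_count(int fd, int *count)
{
    struct serial_icounter_struct ic;
    if (ioctl(fd, TIOCGICOUNT, &ic) != 0)
        return -1;
    *count = ic.cts;
    return 0;
}

/* Given two samples of the CTS level (from TIOCMGET) and of the
 * CTS change counter, return 1 if the level changed between the
 * samples but the counter did not move, i.e. the driver apparently
 * never saw the edge. Returns 0 otherwise. */
int cts_edge_unseen(int was_asserted, int is_asserted,
                    int count_before, int count_after)
{
    return (!was_asserted != !is_asserted) && (count_before == count_after);
}
```

If the level seen through TIOCMGET flips while the counter stays put around the time of the hang, that would point at the driver missing the CTS edge rather than at the signal itself.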
I had a request from out-of-band for additional information:
* It is UART2, with txd, rxd, cts, and rts connected
* The other device is an STM32-based microcontroller running an RTOS
* There isn't really any relevant console output or error log, since the port just stops
Has anyone heard of other applications running at high speed with hw flow control enabled (and actively throttling the data)?