We are facing a performance problem with the i.MX6SX (ARM core only):
While developing our custom board based on the i.MX6SX under Linux, we ran into a "bug" that worries us.
First, let me explain our general conditions:
We use up to four UARTs as asynchronous RS-485 full-duplex interfaces without RTS/CTS under Linux (linux-yocto-3.19-3.19 with some backports from 3.19.5).
Based on the datasheet, we planned to stream our data into the ARM core of the i.MX6SX at 500 kbaud; the UARTs should be capable of handling speeds of up to a few megabaud.
Depending on the bandwidth used, we do not come anywhere near this speed under Linux.
Now let me explain the problem:
We use a terminal on one UART to communicate with the Linux system without any errors at speeds above 460800 baud.
So sending only a few characters at very high speed over a UART does not seem to be a problem.
On the other hand, if we try to stream a lot of characters at high speed, the FIFO overruns and data is lost.
First, we noticed that the UART driver did not support SDMA on the MX6 family.
So we started without DMA and got very poor performance.
After fixing this, we can now use DMA in the usual way, and performance has improved:
We can now stream at up to 115200 baud without any errors, tested with 3 UARTs simultaneously at maximum bandwidth. So far, so good.
But our goal is still out of reach:
If we switch to 230400 baud, we get significant error rates, even if only a few percent of the bandwidth is used and only one UART is connected.
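For anyone who wants to reproduce the error rate per port, here is a minimal Python sketch that extracts the counters the serial core prints under /proc/tty/driver/ (tx, rx, fe, pe, brk, oe) and computes overruns per received character between two snapshots. The file name (probably /proc/tty/driver/IMX-uart for this driver) is an assumption, and the sample line and its numbers are made up for illustration, not measured values.

```python
import re

# Illustrative line in the format the serial core uses under /proc/tty/driver/.
# The address, IRQ and counter values are invented, not taken from our board.
SAMPLE = "1: uart:IMX mmio:0x02020000 irq:58 tx:1024 rx:900000 fe:2 oe:37 RTS|DTR"


def parse_counters(line):
    """Return the numeric counters (tx, rx, fe, pe, brk, oe) found in one port line."""
    counters = {}
    for key, value in re.findall(r"\b(tx|rx|fe|pe|brk|oe):(\d+)", line):
        counters[key] = int(value)
    return counters


def overrun_rate(before, after):
    """Overruns per received character between two snapshots of the same port."""
    received = after.get("rx", 0) - before.get("rx", 0)
    overruns = after.get("oe", 0) - before.get("oe", 0)
    return overruns / received if received > 0 else 0.0


if __name__ == "__main__":
    print(parse_counters(SAMPLE))
```

Taking one snapshot before and one after a test transfer and passing both to overrun_rate() gives a figure that can be compared across baud rates and driver settings.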
So we had to investigate what happens:
First, we found that all errors are FIFO overruns. This means that the CPU is not able to drain the FIFO before it overflows.
Yet the system is almost idle, no interrupts are lost, and the schedulers run without any sign of load.
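To put a number on "before it overflows", here is a rough Python sketch of the time budget. It assumes a 32-character RX FIFO and 8N1 framing (10 bits per character); both are assumptions to check against the reference manual and the actual port settings, and the watermark of 16 is only an example value.

```python
FIFO_DEPTH = 32      # assumed RX FIFO depth in characters
BITS_PER_CHAR = 10   # assumed 8N1 framing: 1 start + 8 data + 1 stop bit


def char_time_us(baud):
    """Time on the wire for one character, in microseconds."""
    return BITS_PER_CHAR / baud * 1e6


def drain_deadline_us(baud, watermark):
    """Time from reaching the watermark until the FIFO is full,
    assuming characters arrive back to back."""
    return (FIFO_DEPTH - watermark) * char_time_us(baud)


if __name__ == "__main__":
    for baud in (115200, 230400, 460800, 500000):
        print(baud, round(char_time_us(baud), 1), round(drain_deadline_us(baud, 16), 1))
```

Under these assumptions, a watermark of 16 leaves roughly 0.7 ms to start draining the FIFO at 230400 baud and about 0.32 ms at 500000 baud.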
Polling without DMA makes it worse. Tuning the watermark and burst level has an effect, but not a sufficient one.
So we tried a lot to fix this:
We optimized the FIFO threshold and the size of the DMA blocks, tuned IOSCHEDULE, turned off everything but the UART handling, and cut the driver down to its most rudimentary task.
We found that even the one necessary step, copying the FIFO into DDR RAM via DMA, is too slow to keep UART speeds of 230400 baud and above error-free at a higher bandwidth than a console UART needs.
No matter what we tried, the bug remains.
So: are we really hitting a bus bottleneck when using the UART without CTS/RTS functionality?
Or has anyone out there solved such a problem on their board?
Thanks a lot!