imx8mm RS485: UART port sometimes gets stuck in an RX loop

DynamicDyno
Contributor I

Hi everyone,

I'm currently facing an issue on an RS485 bus where occasionally, usually during high-traffic periods, the RX path gets stuck in a continuous read loop and reads a long stream of 0x01 bytes. Even if we disconnect the bus connector, the serial port driver keeps behaving as if it were receiving data.

Dmesg shows the following error logs:

[61493.145629][    C0] imx-sdma 30bd0000.dma-controller: All bds consumed,restart now.
[61493.439425][    C1] imx-sdma 30bd0000.dma-controller: Timeout waiting for CH0 ready
[61493.447106][    C1] imx-uart 30860000.serial: We cannot prepare for the TX slave dma!
[61494.074353][    C2] imx-sdma 30bd0000.dma-controller: Timeout waiting for CH0 ready

We are using the imx8mm, with Linux kernel version 5.15-2.2.x-imx. We are able to reproduce the error by connecting 2 serial ports (ttymxc2 and ttymxc3) together and sending a stream of data from both ports and then reading that data on both ports. In about 1 in 3 runs, the program hangs because of this error. Here is the log trace:

[817021.794338] imx-sdma 30bd0000.dma-controller: Timeout waiting for CH0 ready
[817021.802108] ------------[ cut here ]------------
[817021.802111] WARNING: CPU: 2 PID: 2074 at kernel/dma/mapping.c:528 dma_free_attrs+0xb0/0xe0
[817021.811169] Modules linked in: bd718x7_regulator rohm_regulator igb i2c_algo_bit ti_tla2024 at24 rtc_ds1307 lm75 rohm_bd718x7
[817021.811196] CPU: 2 PID: 2074 Comm: .NET Long Runni Tainted: G        W         5.15.157-5.15.157-2.2.0-5.15.157-2.2.0+g5134e031114e+p11 #1
[817021.811203] Hardware name: fsl-imx8mm-sclv2-3717-000 (DT)
[817021.811207] pstate: 600000c5 (nZCv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[817021.811213] pc : dma_free_attrs+0xb0/0xe0
[817021.811221] lr : dma_free_attrs+0x50/0xe0
[817021.811227] sp : ffff80000bd6b9e0
[817021.811230] x29: ffff80000bd6b9e0 x28: 0000000000000000 x27: ffff000006b28ce8
[817021.811241] x26: ffff000006032900 x25: 00000000000002ed x24: 0000000000000000
[817021.811252] x23: 0000000046505000 x22: 0000000000000000 x21: ffff800009247000
[817021.811264] x20: 000000000000000c x19: ffff00000660b410 x18: ffffffffffffffff
[817021.811274] x17: 0000000000000000 x16: 0000000000000000 x15: ffff80000bd6b5b8
[817021.811285] x14: 0000000000000000 x13: ffff800008e9b4b8 x12: 0000000000000aa4
[817021.811298] x11: 000000000000038c x10: ffff800008e9b4b8 x9 : ffff800008e9b4b8
[817021.811306] x8 : 00000000ffffefff x7 : ffff800008ef34b8 x6 : ffff800008ef34b8
[817021.811317] x5 : 0000000000000040 x4 : 0000000000000000 x3 : 0000000046505000
[817021.811327] x2 : ffff800009247000 x1 : 00000000000000c0 x0 : 0000000000000080
[817021.811338] Call trace:
[817021.811341]  dma_free_attrs+0xb0/0xe0
[817021.811347]  sdma_free_bd+0x60/0x6c
[817021.811355]  sdma_transfer_init+0x1f0/0x25c
[817021.811362]  sdma_prep_slave_sg+0x7c/0x2c0
[817021.811369]  imx_uart_dma_tx+0xdc/0x250
[817021.811378]  imx_uart_start_tx+0x10c/0x200
[817021.811384]  __uart_start.isra.0+0x3c/0x4c
[817021.811390]  uart_write+0x154/0x2c0
[817021.811395]  n_tty_write+0x2c0/0x48c
[817021.811404]  file_tty_write.constprop.0+0x130/0x294
[817021.811410]  tty_write+0x14/0x20
[817021.811415]  new_sync_write+0xec/0x18c
[817021.811422]  vfs_write+0x22c/0x290
[817021.811429]  ksys_write+0x6c/0x100
[817021.811437]  __arm64_sys_write+0x1c/0x30
[817021.811445]  invoke_syscall+0x48/0x120
[817021.811453]  el0_svc_common.constprop.0+0xd4/0xf4
[817021.811459]  do_el0_svc+0x28/0xa0
[817021.811466]  el0_svc+0x28/0x80
[817021.811474]  el0t_64_sync_handler+0xa4/0x130
[817021.811481]  el0t_64_sync+0x1a0/0x1a4
[817021.811488] ---[ end trace be2e3ed6cac0a4c6 ]---
[817021.816895] imx-uart 30a60000.serial: We cannot prepare for the TX slave dma!
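
For anyone who wants to try the same reproduction, the cross-traffic test described above could be sketched roughly as follows. This is only an illustration, not the program used for the trace: the helper names `open_raw` and `cross_exchange`, the 115200 baud default, the chunk size and the timeout are all assumptions, and it presumes the two ports are wired so that each one receives what the other sends.

```python
import os
import select
import termios
import time
import tty


def open_raw(path, baud=termios.B115200):
    """Open a serial device in raw mode (the baud rate is an assumed value)."""
    fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    tty.setraw(fd)                      # no line editing, no character translation
    attrs = termios.tcgetattr(fd)
    attrs[4] = attrs[5] = baud          # input speed and output speed
    termios.tcsetattr(fd, termios.TCSANOW, attrs)
    return fd


def cross_exchange(fd_a, fd_b, payload, timeout_s=5.0):
    """Send payload from both ends at once and collect what each end receives."""
    pending = {fd_a: memoryview(payload), fd_b: memoryview(payload)}
    received = {fd_a: bytearray(), fd_b: bytearray()}
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if all(len(buf) >= len(payload) for buf in received.values()):
            break
        writers = [fd for fd, buf in pending.items() if len(buf) > 0]
        readable, writable, _ = select.select([fd_a, fd_b], writers, [], 0.1)
        for fd in writable:
            sent = os.write(fd, pending[fd][:256])   # write in small chunks
            pending[fd] = pending[fd][sent:]
        for fd in readable:
            received[fd] += os.read(fd, 4096)
    return bytes(received[fd_a]), bytes(received[fd_b])


# Possible use on the target (device names taken from the post above):
#   a = open_raw("/dev/ttymxc2")
#   b = open_raw("/dev/ttymxc3")
#   payload = bytes(range(256)) * 64
#   rx_a, rx_b = cross_exchange(a, b, payload)
#   print(rx_a == payload, rx_b == payload)
```

A mismatch, or a received buffer that consists mostly of one repeated byte such as 0x01, would suggest that the run hit the stuck-RX state rather than an ordinary timeout.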

Thanks in advance for any insight on this issue.

6 Replies

Parmiss
Contributor I

Hi,

@DynamicDyno, could you get the patch to work? We face the same issue: we get "RX flood detected: soft reset" over and over.
@ceggers, did you try increasing idle_counter? Did it fix the issue?

regards,

Parmiss

DynamicDyno
Contributor I

Hi Parmiss,

As mentioned below, my first patch did not fix the issue.
I missed Christian's reply a few days later and have had to move to a more pressing project since then. I will get back to this issue in the next few days. I'll try increasing the idle_counter to see if that helps.

When I find what works, I'll post it here. Hopefully it'll help you too.

Regards,

DynamicDyno

ceggers
Contributor V

Hi Parmiss,

> we face the same issue, getting "RX flood detected: soft reset" over and over.
> @ceggers , did you try increasing idle_counter? did it fix the issue?

The situation on my system may differ from your setup:

  1. I have problems with "real" RX flooding (verified with a JTAG debugger). This can be caused by plugging in the external RS-485 connection while the system is up. Due to the missing idle-state biasing in our setup, we could also trigger the problem by switching the power of other devices on the bus.
  2. The warnings / data loss on your setup could also be caused by "false positives" (caused by the situation I already described).
  3. I use a custom SDMA script for reassembling the incoming UART data (together with a modified UART driver). In my case I had to implement the "RX flood detection" in the SDMA script. I use an idle_counter value of 35.

Did you already try a higher value for idle_counter? Do you use (S)DMA for RX?

regards,
Christian

ceggers
Contributor V

Hi,

Maybe you are affected by the "RX flooding" problem described here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/tty/serial/imx.c?h...

I had the same with RS-485 when using a device discovery algorithm where multiple slave devices respond at the same time. The (simultaneous) responses produce broken UART frames. The discovery algorithm is aware of this, but these frames were able to cause the "RX flooding" problem on my system.

regards,
Christian

DynamicDyno
Contributor I

Hi Christian,

I tried the suggested patch (with some tweaking as it was initially targeted for a more recent version of the kernel) but it does not seem to have worked when I run my tests. I do see that Rx loops are being detected successfully. However, it appears the proposed software reset function either is not resetting the UART correctly or it is, and the problem is elsewhere.

When running my test, in addition to seeing the "RX flood detected: soft reset." log entry repeated in dmesg, I'm also seeing "dma-controller: All bds consumed,restart now." at very regular intervals.
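
To put a number on "very regular intervals", the gaps between matching dmesg entries can be computed from the timestamps. The following is a minimal sketch; the function name `message_intervals` and the regular expression are illustrative assumptions, written against the two timestamp formats that appear in the logs in this thread.

```python
import re

# Matches "[61493.145629][    C0] message" as well as "[817021.794338] message".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\](?:\[\s*\w+\])?\s*(.*)$")


def message_intervals(dmesg_text, needle):
    """Return the gaps in seconds between consecutive log lines containing needle."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and needle in match.group(2):
            stamps.append(float(match.group(1)))
    # Pair each timestamp with its successor and take the difference.
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]
```

Feeding it the captured `dmesg` output once with "All bds consumed" and once with "RX flood detected" as the needle would show whether the two messages share the same period, which might hint at whether each soft reset is followed by a DMA restart.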

ceggers
Contributor V

Hi,

I just tested the following with an i.MX6ULL:

1. I can make the UART receive an endless stream of 0xFF after it receives a single 0xF0 from another UART (the sender uses a baud rate 12 times higher than the receiver, as mentioned in the patch description). I need to send the 0xF0 about 1–3 times in order to trigger the error.

2. The UART immediately stops receiving 0xFF after asserting UCR2.SRST (I did this with a JTAG debugger).
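
As a rough illustration of what the receiver sees in this test, the timing of the 0xF0 frame can be worked out in receiver bit times. The sketch below assumes 8N1 framing with data bits sent least-significant bit first; the function names are made up for this example, and the reading that such a short pulse is what confuses the receiver is an assumption, not something stated in the patch description.

```python
def low_pulse_width(byte, baud_ratio):
    """Length of the leading low pulse of an 8N1 frame, in receiver bit times.

    The pulse consists of the start bit plus any 0 data bits directly after it.
    baud_ratio is sender baud rate divided by receiver baud rate.
    """
    low_bits = 1                     # the start bit is always low
    for position in range(8):        # data bits go out LSB first
        if (byte >> position) & 1:
            break
        low_bits += 1
    return low_bits / baud_ratio


def frame_width(baud_ratio, bits_per_frame=10):
    """Length of a whole frame (start + 8 data + stop) in receiver bit times."""
    return bits_per_frame / baud_ratio


print(low_pulse_width(0xF0, 12))     # 5 sender bits low, i.e. 5/12 of a receiver bit
print(frame_width(12))               # the whole frame is shorter than one receiver bit
```

With a ratio of 12, the line is low for 5/12 of one receiver bit time and the complete frame lasts 10/12 of a receiver bit time, so the receiver only ever sees a single short low glitch instead of a valid character.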

Do you have a working JTAG debugger on your setup, so that you can examine the UART registers while the CPU is stopped? It would be good to know whether we are talking about the same problem. As you use RS-485 communication, you may need an idle-state-biasing resistor network.

In an internal discussion we concluded that the "idle_counter" threshold of 3 may not be enough to ensure that the erroneous state is actually active. Assume that the UART FIFO is filled with some bytes and no further data is received. If the USR2_WAKE bit is cleared while processing the first byte, it will not become active again while the remaining bytes in the FIFO are processed (as no further data is actually received).

The same may be true for DMA buffers. If the USR2_WAKE bit is cleared while processing the first buffer, it will stay cleared for the following buffers as long as no "fresh" bytes are received by the UART.

If the two paragraphs above are correct, it would be better to implement the "idle_counter" also in the SDMA script. Additionally the threshold should be increased to about the RX FIFO size (plus some margin).
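
The reasoning above can be played through with a small model. This is a simplified sketch of the heuristic as described in this thread, not the driver code: `count_resets` is a hypothetical name, each list entry stands for one processed RX event (byte or buffer) with the USR2_WAKE state seen at that moment, and the FIFO depth of 32 is an assumed value.

```python
def count_resets(wake_flags, threshold):
    """Count the soft resets the idle-counter heuristic would trigger.

    wake_flags: one boolean per processed RX event, True if line activity
    (USR2_WAKE) was seen since the previous event.
    """
    idle_counter = 0
    resets = 0
    for wake in wake_flags:
        if wake:
            idle_counter = 0             # real activity on RxD: not a flood
        else:
            idle_counter += 1
            if idle_counter > threshold:
                resets += 1              # "RX flood detected: soft reset"
                idle_counter = 0         # the reset also clears the counter
    return resets


FIFO_DEPTH = 32                          # assumed RX FIFO size
# A full FIFO is drained after the line went quiet: only the first event
# still sees WAKE set, the remaining ones do not.
drain = [True] + [False] * (FIFO_DEPTH - 1)
flood = [False] * 1000                   # genuine flood: never any line activity

print(count_resets(drain, 3), count_resets(drain, 35))
print(count_resets(flood, 35) > 0)
```

In this model a threshold of 3 produces false positives while legitimate data is drained from a full FIFO, whereas a threshold slightly above the FIFO depth stays quiet for the drain case and still catches the genuine flood.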

regards,
Christian
