We are having an issue with a USB-to-Serial converter device using the cdc-acm driver. More specifically, the port can be opened and communicated with correctly, but closing the port results in a crash.
The problem occurs with a USB 2.0 device connected to the USB1 interface of our i.MX8M-Mini (USB1 configured as host). The board is our own design based on the X-8MMINILPD4-EV evaluation board. The device is a USB-to-Serial converter which is also our own design. We are using linux-imx kernel version imx_4.14.98_2.2.0.
We have enabled the cdc-acm driver in our DEFCONFIG file (CONFIG_USB_ACM=y) and the device appears as expected under /dev/ttyACM0. I can open the port and communicate normally with the device. However, a crash occurs as soon as the port gets closed. The error message refers to a NULL pointer dereference. I have tried different ways of talking to the device (with minicom, with a simple echo command, with some C code) and the crash occurs in every scenario. The full crash log is below.
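The C code used for that test is not shown in this post. A minimal sketch of such a reproduction, assuming no termios setup is needed to trigger the problem, could look like the following. The helper name `open_write_close` is made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open a tty, write a short message, close it.
 * Returns 0 on success or a negative errno value on failure. */
int open_write_close(const char *path, const char *msg)
{
    int fd = open(path, O_RDWR | O_NOCTTY);
    if (fd < 0)
        return -errno;

    int ret = 0;
    if (write(fd, msg, strlen(msg)) < 0)
        ret = -errno;

    /* On the affected setup, the kernel oops is expected at this call. */
    if (close(fd) < 0 && ret == 0)
        ret = -errno;

    return ret;
}
```

Calling `open_write_close("/dev/ttyACM0", "hello\r\n")` from a small `main()` should be enough to exercise the open/close path seen in the crash log.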
The same USB device can be opened and closed normally when it is connected to an Ubuntu computer, so the problem appears to be on the iMX8MM side and not with the USB device.
I've searched the internet and found some references to known issues involving the cdc-acm driver and NULL pointer dereferences, but the fixes appear to have been released years ago, so I wouldn't expect to encounter them today.
I wonder if a newer kernel might solve the problem and I've been trying to upgrade, but I'm facing lots of issues with the migration. I don't know if a newer kernel would be more stable but in any case this is not the preferred solution.
Can anyone advise on how to solve this with kernel 4.14.98_2.2.0?
Thanks.
[ 52.482669] Unable to handle kernel NULL pointer dereference at virtual address 00000010
[ 52.490833] Mem abort info:
[ 52.493627] Exception class = DABT (current EL), IL = 32 bits
[ 52.499545] SET = 0, FnV = 0
[ 52.502598] EA = 0, S1PTW = 0
[ 52.505737] Data abort info:
[ 52.508617] ISV = 0, ISS = 0x00000004
[ 52.512451] CM = 0, WnR = 0
[ 52.515420] user pgtable: 4k pages, 48-bit VAs, pgd = ffff8000e4c32000
[ 52.521945] [0000000000000010] *pgd=0000000000000000
[ 52.526912] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 52.532484] Modules linked in:
[ 52.535543] CPU: 0 PID: 4046 Comm: minicom Tainted: G W 4.14.98-06466-g5910884f0fa2-dirty #1
[ 52.545107] Hardware name: FSL i.MX8MM EVK board (DT)
[ 52.550157] task: ffff8000e5249b00 task.stack: ffff00001a288000
[ 52.556084] PC is at start_unlink_async.part.36+0x34/0x88
[ 52.561483] LR is at ehci_urb_dequeue+0xcc/0xe0
[ 52.566012] pc : [<ffff0000088d31a4>] lr : [<ffff0000088d739c>] pstate: a00001c5
[ 52.573405] sp : ffff00001a28bb70
[ 52.576718] x29: ffff00001a28bb70 x28: ffff8000e5249b00
[ 52.582031] x27: ffff000008d61000 x26: 0000000000000039
[ 52.587344] x25: 0000000000000124 x24: ffff8000f2ca5928
[ 52.592658] x23: 0000000000000140 x22: ffff8000f392d100
[ 52.597971] x21: ffff8000f93cb30c x20: 0000000000000000
[ 52.603284] x19: ffff8000f93cb000 x18: 0000000000000000
[ 52.608597] x17: 0000ffff8907d6b8 x16: ffff000008211298
[ 52.613910] x15: 16170f12001a1311 x14: 071c71c71c71c71c
[ 52.619223] x13: 0000000000000000 x12: 0000000c1b110c39
[ 52.624536] x11: 00000000000004a2 x10: 0000000000000980
[ 52.629849] x9 : ffff00001a28bba0 x8 : ffff8000e524a4e0
[ 52.635162] x7 : 00000000000005ec x6 : 0000000002abbc65
[ 52.640475] x5 : 0000000000000000 x4 : ffff8000f93cb330
[ 52.645788] x3 : 0000000000000000 x2 : 0000000000000000
[ 52.651101] x1 : ffff8000f2d42600 x0 : ffff8000f93cb238
[ 52.656415] Process minicom (pid: 4046, stack limit = 0xffff00001a288000)
[ 52.663201] Call trace:
[ 52.665647] Exception stack(0xffff00001a28ba30 to 0xffff00001a28bb70)
[ 52.672088] ba20: ffff8000f93cb238 ffff8000f2d42600
[ 52.679917] ba40: 0000000000000000 0000000000000000 ffff8000f93cb330 0000000000000000
[ 52.687747] ba60: 0000000002abbc65 00000000000005ec ffff8000e524a4e0 ffff00001a28bba0
[ 52.695576] ba80: 0000000000000980 00000000000004a2 0000000c1b110c39 0000000000000000
[ 52.703406] baa0: 071c71c71c71c71c 16170f12001a1311 ffff000008211298 0000ffff8907d6b8
[ 52.711236] bac0: 0000000000000000 ffff8000f93cb000 0000000000000000 ffff8000f93cb30c
[ 52.719066] bae0: ffff8000f392d100 0000000000000140 ffff8000f2ca5928 0000000000000124
[ 52.726895] bb00: 0000000000000039 ffff000008d61000 ffff8000e5249b00 ffff00001a28bb70
[ 52.734725] bb20: ffff0000088d739c ffff00001a28bb70 ffff0000088d31a4 00000000a00001c5
[ 52.742555] bb40: ffff8000fff6b000 ffff000009488000 0000ffffffffffff ffff8000e5249b00
[ 52.750383] bb60: ffff00001a28bb70 ffff0000088d31a4
[ 52.755263] [<ffff0000088d31a4>] start_unlink_async.part.36+0x34/0x88
[ 52.761703] [<ffff0000088d739c>] ehci_urb_dequeue+0xcc/0xe0
[ 52.767277] [<ffff00000889f870>] unlink1+0x28/0x130
[ 52.772156] [<ffff0000088a1d28>] usb_hcd_unlink_urb+0x98/0xc8
[ 52.777903] [<ffff0000088a3010>] usb_kill_urb.part.4+0x30/0xb0
[ 52.783736] [<ffff0000088a30b0>] usb_kill_urb+0x20/0x30
[ 52.788963] [<ffff0000088e02c4>] acm_kill_urbs+0x54/0x70
[ 52.794276] [<ffff0000088e075c>] acm_port_shutdown+0x7c/0x90
[ 52.799937] [<ffff0000085f590c>] tty_port_shutdown+0x94/0xb0
[ 52.805597] [<ffff0000085f60c4>] tty_port_close+0x4c/0x88
[ 52.810996] [<ffff0000088e0f94>] acm_tty_close+0x1c/0x28
[ 52.816310] [<ffff0000085ebcf4>] tty_release+0x10c/0x490
[ 52.821625] [<ffff000008216aa0>] __fput+0x88/0x1d0
[ 52.826417] [<ffff000008216c44>] ____fput+0xc/0x18
[ 52.831210] [<ffff0000080ebe64>] task_work_run+0x9c/0xc0
[ 52.836525] [<ffff00000808990c>] do_notify_resume+0xfc/0x108
[ 52.842183] Exception stack(0xffff00001a28bec0 to 0xffff00001a28c000)
[ 52.848624] bec0: 0000000000000000 0000ffff891a8f10 0000000000000000 0000000000003c71
[ 52.856454] bee0: 0000aaaaddce0100 0000000000000004 0000aaaad1c764b8 0000ffff891af6d0
[ 52.864283] bf00: 0000000000000039 0000000000000000 0000000000000016 0000000000000000
[ 52.872113] bf20: 00000a3b00001cb2 0000000000000005 00010004157f1c03 16170f12001a1311
[ 52.879943] bf40: 0000aaaad1c70b58 0000ffff8907d6b8 0000000000000000 0000000000000003
[ 52.887773] bf60: 0000ffff891a9710 0000ffffc1e7d828 0000aaaad1c7f2e8 0000aaaad1c70000
[ 52.895602] bf80: 0000aaaad1c57000 0000aaaad1c70000 0000aaaad1c77a40 0000aaaad1c76e08
[ 52.903433] bfa0: 0000aaaad1c71000 0000ffffc1e7d230 0000aaaad1c3bd58 0000ffffc1e7d230
[ 52.911262] bfc0: 0000ffff8907d6e4 0000000000000000 0000000000000003 0000000000000039
[ 52.919092] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 52.926922] [<ffff000008083964>] work_pending+0x8/0x10
[ 52.932062] Code: f9000043 f9407403 14000002 aa0203e3 (f9400862)
[ 52.938156] ---[ end trace 12c9a1d9f603ec7e ]---
[ 63.059546] mmc1: Timeout waiting for hardware interrupt.
[ 63.064955] mmc1: sdhci: ============ SDHCI REGISTER DUMP ===========
[ 63.071398] mmc1: sdhci: Sys addr: 0x00000008 | Version: 0x00000002
[ 63.077837] mmc1: sdhci: Blk size: 0x00000200 | Blk cnt: 0x00000000
[ 63.084277] mmc1: sdhci: Argument: 0x00068788 | Trn mode: 0x0000002b
[ 63.090717] mmc1: sdhci: Present: 0x01f88008 | Host ctl: 0x00000013
[ 63.097157] mmc1: sdhci: Power: 0x00000002 | Blk gap: 0x00000080
[ 63.103596] mmc1: sdhci: Wake-up: 0x00000008 | Clock: 0x0000000f
[ 63.110036] mmc1: sdhci: Timeout: 0x0000008f | Int stat: 0x00000003
[ 63.116475] mmc1: sdhci: Int enab: 0x107f100b | Sig enab: 0x107f100b
[ 63.122915] mmc1: sdhci: AC12 err: 0x00000000 | Slot int: 0x00000502
[ 63.129354] mmc1: sdhci: Caps: 0x07eb0000 | Caps_1: 0x8000b407
[ 63.135794] mmc1: sdhci: Cmd: 0x0000193a | Max curr: 0x00ffffff
[ 63.142233] mmc1: sdhci: Resp[0]: 0x00000900 | Resp[1]: 0x00eddf7f
[ 63.148673] mmc1: sdhci: Resp[2]: 0x325b5900 | Resp[3]: 0x00000900
[ 63.155112] mmc1: sdhci: Host ctl2: 0x00000088
[ 63.159555] mmc1: sdhci: ADMA Err: 0x00000000 | ADMA Ptr: 0x7804f208
[ 63.165994] mmc1: sdhci: ============================================
We have identified the root cause. Our USB device had an error in one of its endpoint descriptors: the direction of the serial device's interrupt endpoint was wrong. That endpoint is not used, so we don't understand why this caused a problem with the cdc-acm driver in the NXP environment. The crash no longer occurs after correcting the direction of the interrupt endpoint.
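For anyone checking their own firmware: in a standard USB 2.0 endpoint descriptor, the direction is bit 7 of `bEndpointAddress` and the transfer type is in bits 1..0 of `bmAttributes`. The CDC ACM notification endpoint is expected to be an interrupt IN endpoint. A minimal sketch of that check follows. The helper and macro names are made up for illustration, and the example values in the usage note are not taken from our device.

```c
#include <stdint.h>

#define EP_DIR_IN        0x80u  /* bEndpointAddress bit 7: 1 = IN (device to host) */
#define EP_XFERTYPE_MASK 0x03u  /* bmAttributes bits 1..0: transfer type */
#define EP_XFER_INT      0x03u  /* transfer type 3 = interrupt */

/* Hypothetical helper: returns 1 if the endpoint is interrupt IN, 0 otherwise. */
int ep_is_int_in(uint8_t bEndpointAddress, uint8_t bmAttributes)
{
    int is_in  = (bEndpointAddress & EP_DIR_IN) != 0;
    int is_int = (bmAttributes & EP_XFERTYPE_MASK) == EP_XFER_INT;
    return is_in && is_int;
}
```

For example, `ep_is_int_in(0x83, 0x03)` returns 1 (endpoint 3, IN, interrupt), while `ep_is_int_in(0x03, 0x03)` returns 0 because the direction bit marks the endpoint as OUT.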
Hi Aldo, thank you for your answer. We have set CONFIG_USB_ACM=y because we need the driver to be active on boot and detect/map our USB device right away. We had first tried with the default configuration (CONFIG_USB_ACM=m) but then the device doesn't appear as /dev/ttyACM0 when we plug it in.
Besides, we have been able to confirm that the cdc-acm driver works correctly on our iMX8M-Mini platform with another USB device from a different manufacturer. It seems that some difference between our device and that other device causes a crash in the linux-imx environment, even though both work seamlessly on an Ubuntu workstation. We are currently trying to figure out the differences in the USB descriptors between the two devices.
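One way to compare the two devices is to walk the raw descriptor bytes of each and list the endpoint descriptors side by side. The sketch below assumes the standard layout where every descriptor starts with `bLength` and `bDescriptorType`, and that the raw bytes have already been read into a buffer (for example from the device's `descriptors` file in sysfs, or transcribed from `lsusb -v` output). The function name `dump_endpoints` is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define USB_DT_ENDPOINT 0x05u /* bDescriptorType of an endpoint descriptor */

/* Hypothetical helper: walk a raw descriptor blob and print every endpoint
 * descriptor. Returns the number of endpoints found, or -1 if the blob is
 * malformed (bad bLength or truncated data). */
int dump_endpoints(const uint8_t *buf, size_t len)
{
    size_t off = 0;
    int count = 0;

    while (off + 2 <= len) {
        uint8_t blen = buf[off];      /* bLength */
        uint8_t type = buf[off + 1];  /* bDescriptorType */

        if (blen < 2 || off + blen > len)
            return -1;

        if (type == USB_DT_ENDPOINT && blen >= 7) {
            uint8_t addr = buf[off + 2];  /* bEndpointAddress */
            uint8_t attr = buf[off + 3];  /* bmAttributes */
            /* wMaxPacketSize is stored little-endian in bytes 4 and 5. */
            unsigned maxpkt = buf[off + 4] | ((unsigned)buf[off + 5] << 8);

            printf("ep 0x%02x dir=%s type=%u maxpkt=%u interval=%u\n",
                   (unsigned)addr, (addr & 0x80u) ? "IN" : "OUT",
                   (unsigned)(attr & 0x03u), maxpkt,
                   (unsigned)buf[off + 6]);
            count++;
        }
        off += blen;
    }

    return off == len ? count : -1;
}
```

Running this on the descriptor bytes of both devices and comparing the printed lines makes a mismatch in endpoint direction or transfer type easy to spot.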
Thanks.
Yannick.
Update:
I have replicated the problem on the i.MX8MM EVK (LPDDR4) with linux-imx kernel 4.14.98_2.0.0ga, and also with the latest version from the repo, which is lf-6.6.3-1.0.0. The cdc-acm driver has evolved between the two versions, so I assume any relevant bug fixes are included in the newer one. Yet I see the same crash when closing the ttyACM0 port with the latest version of the kernel.
The USB-serial device works correctly on my DELL Optiplex computer with Ubuntu 20.04 and kernel 5.15.0-105. So it seems there is something specific to the i.MX8MM environment that causes problems.
Has anyone had success using a USB-serial converter with cdc-acm driver on iMX8M-Mini?
Thanks.
Hello,
I'm looking into the issue. I cannot find any reports of this, so I will need to check what is actually causing it.
As a side note, I see that in our config this driver is enabled as a module, while you have set it as built-in. Could you try setting it as a module?
Best regards/Saludos,
Aldo.