Bugs in Linux driver for DPAA2 qDMA engine

cvachoucek
Contributor I

Hi all,

We're trying to use the qDMA engine in the LX2160A to transfer data from our PCIe devices. While working on the kernel driver for our device, we've identified two issues with the dpaa2-qdma driver.

1) Function dpaa2_qdma_setup() fails if the DPDMAI object is created statically from the DPL.

A DPDMAI object specified in the DPL always seems to be created with a single queue, no matter how many priorities are specified in the DPL file. The driver, however, takes the number of queues to use from the number of priorities. If the number of priorities is 2, the dpdmai_get_rx_queue() call fails when it tries to read the attributes of the second (non-existent) queue.
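One way to guard against this mismatch can be sketched in plain user-space C. This is only an illustration of the idea, not the attached patch and not the driver's actual code: the struct `dpdmai_attr_model`, its field names, and the helper `qdma_usable_pairs()` are hypothetical, and treating a queue count of 0 as "not reported by the firmware" is an assumption.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the DPDMAI attributes reported by the MC
 * (compare "number of priorities" and "number of queues" in restool).
 */
struct dpdmai_attr_model {
	uint8_t num_of_priorities;
	uint8_t num_of_queues;	/* assumption: 0 means "not reported" */
};

/*
 * Number of Rx/Tx queue pairs that can be queried safely: never more
 * than the number of queues the object actually has.
 */
static inline uint8_t qdma_usable_pairs(const struct dpdmai_attr_model *attr)
{
	uint8_t pairs = attr->num_of_priorities;

	if (attr->num_of_queues != 0 && attr->num_of_queues < pairs)
		pairs = attr->num_of_queues;

	return pairs;
}
```

With 2 priorities and 1 queue, the helper yields 1, so a loop over the queue pairs would never ask for the second queue.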

2) Deadlock in dpaa2_qdma_fqdan_cb().

The driver registers for notifications from the DPIO object. In the callback function, the driver tries to dequeue frames directly from the DPDMAI Rx queue. This is done in the following loop:

	do {
		err = dpaa2_io_service_pull_fq(NULL, ppriv->rsp_fqid,
					       ppriv->store);
	} while (err);

 

The callback function runs in interrupt context (hardirq). If this IRQ fires on a CPU where the same DPIO object (QBMan software portal) is currently locked by another driver (e.g. the Ethernet driver pulling frames from the DPNI in softirq context), then dpaa2_io_service_pull_fq() returns -EBUSY and the loop never terminates, because the softirq holding the portal cannot resume while the hardirq is running.

This is easy to trigger: start periodic DMA transfers while there is heavy network traffic (I'm using an iperf test). After a while, both the network traffic and the DMA transfers stop, and after about 20 seconds you get RCU stall warnings in the kernel log.

The solution seems to be to avoid pulling frames from the Rx queue in hardirq context, where dpaa2_qdma_fqdan_cb() runs. Instead, the callback should schedule a tasklet (deferred work), which runs later in softirq context.
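The deferral pattern described above can be modelled in user-space C as follows. This is a simplified sketch and not the attached patch: all `*_model` names are hypothetical, the portal lock is reduced to a flag, and the "tasklet" is an ordinary function. In the real driver the same split would presumably use the kernel's tasklet_setup() and tasklet_schedule().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of a QBMan software portal that may be held by another user. */
struct portal_model {
	bool locked;
};

/* Hypothetical per-queue state of the qDMA driver. */
struct qdma_model {
	struct portal_model *portal;
	bool work_pending;	/* set in "hardirq", cleared in "softirq" */
	int frames_pulled;
};

/* Stand-in for the pull call: fails with -EBUSY while the portal is held. */
static int pull_fq_model(const struct portal_model *portal)
{
	return portal->locked ? -EBUSY : 0;
}

/* "hardirq" part: never touches the portal, only records that work exists. */
static void fqdan_cb_model(struct qdma_model *q)
{
	q->work_pending = true;
}

/*
 * "softirq" part: tries the pull once. On -EBUSY it returns and leaves
 * the work pending so it can be rescheduled, instead of spinning.
 * Returns true if the frame was pulled.
 */
static bool rx_tasklet_model(struct qdma_model *q)
{
	if (!q->work_pending)
		return false;

	if (pull_fq_model(q->portal) == -EBUSY)
		return false;

	q->work_pending = false;
	q->frames_pulled++;
	return true;
}
```

The point of the split is that the notification handler always returns immediately, so whoever holds the portal gets a chance to release it before the pull is retried.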

I'm attaching a patch with the changes I made to the driver to make it work for me. It would be nice if NXP would clarify these issues and either accept the patch as is or address the issues in the next release.

Thank you and have a nice day!

Petr Cvachoucek

 

3 Replies

Irene
NXP Pro Support

Which version of the MC firmware are you using?

cvachoucek
Contributor I

The MC firmware is version 10.38.0, taken from here:
https://github.com/nxp-qoriq/qoriq-mc-binary/releases/tag/mc_release_10.38.0

The Linux kernel is version 6.1.22, taken from here:
https://github.com/nxp-qoriq/linux/releases/tag/lf-6.1.22-2.0.0

I'm testing this on the LX2160ARDB development kit. The DPL file is attached for reference. Here is the output of some commands, run with the unpatched driver.

root@lx2160a-rdb:~# uname -a
Linux lx2160a-rdb 6.1.22 #1 SMP PREEMPT Wed Sep 20 06:44:51 UTC 2023 aarch64 GNU/Linux

root@lx2160a-rdb:~# restool --mc-version
MC firmware version: 10.38.0

root@lx2160a-rdb:~# restool dpdmai info dpdmai.0
dpdmai version: 3.4
dpdmai id: 0
plugged state: plugged
number of priorities: 2
number of queues: 1
dpdmai.options value is: 0

root@lx2160a-rdb:~# dmesg | grep dpaa2-qdma
[    2.976302] dpaa2-qdma dpdmai.0: Adding to iommu group 5
[    2.979455] dpaa2-qdma dpdmai.0: dpdmai_get_rx_queue() failed
[    2.985719] dpaa2-qdma dpdmai.0: dpaa2_dpdmai_setup() failed
[    2.991923] dpaa2-qdma dpdmai.0: fsl_mc_driver_probe failed: -6

 

Irene
NXP Pro Support

I will take a look at this.
