Performance issue for PCIe through DMA on i.MX8QM custom board

530 Views
Ethan42
Contributor I

Hello,

We are developing an embedded system based on the i.MX8QM. Our custom board has a co-processor connected over PCIe, and we use the DMA subsystem to write data to and read data from the PCIe bus. The block diagram below shows the data flow.

Screenshot from 2022-01-24 11-33-24.png

The issue is that dma_map_sg()/dma_unmap_sg() take considerably longer than the actual DMA operation (dma_async_issue_pending()), so the overall performance (bandwidth) is lower than we expected.

Screenshot from 2022-01-24 11-47-08.png
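
To make the bandwidth impact concrete, here is a minimal sketch in plain user-space C (not driver code) of how the mapping overhead enters the effective throughput, assuming each transfer is mapped, transferred, and unmapped one after another with no overlap. The struct, the function names, and any figures fed into it are hypothetical and only for illustration; they are not measurements from the board.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-transfer timings, in microseconds. */
struct xfer_timing {
    double map_us;    /* time spent in dma_map_sg() */
    double dma_us;    /* time for the DMA transfer itself */
    double unmap_us;  /* time spent in dma_unmap_sg() */
};

/* Total wall time of one map -> transfer -> unmap sequence (no overlap assumed). */
static double total_us(const struct xfer_timing *t)
{
    return t->map_us + t->dma_us + t->unmap_us;
}

/* Effective bandwidth in MB/s (1 MB = 10^6 bytes).
 * Bytes per microsecond is numerically equal to MB/s. */
double effective_mbps(uint64_t bytes, const struct xfer_timing *t)
{
    double total = total_us(t);
    assert(total > 0.0);
    return (double)bytes / total;
}

/* Fraction of the total time spent in mapping and unmapping. */
double mapping_overhead_fraction(const struct xfer_timing *t)
{
    double total = total_us(t);
    assert(total > 0.0);
    return (t->map_us + t->unmap_us) / total;
}
```

As an illustration with made-up numbers: if mapping takes 600 us, the transfer 1000 us, and unmapping 400 us, a 1,000,000-byte transfer comes out at 500 MB/s effective instead of the 1000 MB/s that the transfer time alone would suggest, with half of the total time spent in map/unmap.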

We have been struggling with this issue with no luck so far. Do you have any suggestions, guidance, or things to check? Please don't hesitate to ask if you need more information, such as source code or details of the HW blocks.

Thank you in advance!

1 Reply

475 Views
richardkim
NXP Employee

Hello @Ethan42,

The following is our system engineer's suggestion.

I think dma_map_sg() performance may be affected by CMA. On a CMA-based kernel, DMA memory that is not in use can be used by other applications in the system. When PCIe starts to use the DMA memory, the kernel has to move those pages out of the CMA region, which costs time.

We have a reference patch that reserves DMA memory from CMA, which the customer can try. It is based on the 4.14.98 kernel, and the customer can port it to a 5.x kernel.

https://community.nxp.com/t5/i-MX-Processors-Knowledge-Base/How-to-get-rid-of-CMA/ta-p/1123287

Could you check this and try the suggestion?
