Thank you!
I have one related question.
My setup:
An LS1046 acting as root complex reads data from a PCIe peripheral device that acts as an endpoint. The region I'm reading from is memory-mapped through a BAR of the endpoint. The endpoint supports PCIe Gen1 (2.5 GT/s).
When I issue a PCIe read request of 8 bytes (without a DMA controller), the peripheral device seems to return more than one completion packet.
I assume this because my benchmarks show a difference of at least 300 nanoseconds between requesting 4 bytes and requesting 8 bytes.
If only a single completion packet were returned, the difference should be just a few ns, since only 4 more bytes of payload have to be transferred.
But if the response to the read request were split into more than one completion packet, the additional per-packet overhead would explain a difference of 300 ns or more.
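In case it matters, this is roughly how such a measurement could look; a minimal sketch, assuming the endpoint's BAR is exposed through sysfs as resource0 (the device address 0000:01:00.0, the offset, and the iteration count are placeholders, and whether the 64-bit load really leaves the core as a single 8-byte read request depends on the CPU and interconnect):

```c
/* Minimal sketch of a BAR read-latency comparison (illustration only).
 * Assumes the endpoint's BAR0 is at least one page and is exposed as
 * sysfs resource0; device address and offset are placeholders. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/sys/bus/pci/devices/0000:01:00.0/resource0",
                  O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    volatile uint8_t *bar = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                                 MAP_SHARED, fd, 0);
    if (bar == MAP_FAILED) { perror("mmap"); return 1; }

    struct timespec t0, t1;
    uint64_t sink = 0;
    enum { ITER = 100000 };

    /* 4-byte reads: one aligned 32-bit load per iteration. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITER; i++)
        sink += *(volatile uint32_t *)(bar + 0x0);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    long long ns4 = (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                    + (t1.tv_nsec - t0.tv_nsec);

    /* 8-byte reads: one aligned 64-bit load per iteration. Whether this
     * actually becomes a single 8-byte read TLP on the link is exactly
     * what I'm not sure about. */
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < ITER; i++)
        sink += *(volatile uint64_t *)(bar + 0x0);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    long long ns8 = (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL
                    + (t1.tv_nsec - t0.tv_nsec);

    printf("4B: %lld ns/read, 8B: %lld ns/read (sink=%llu)\n",
           ns4 / ITER, ns8 / ITER, (unsigned long long)sink);
    munmap((void *)bar, 4096);
    close(fd);
    return 0;
}
```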
The max payload size of both the root port and the endpoint is 256 bytes; the max read request size of both is 512 bytes. The read completion boundary (RCB) of the root port is 128 bytes and of the endpoint 64 bytes (I make sure that the read request doesn't cross this alignment boundary).
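To document my assumption about the alignment: my reading of the RCB rule is that completions other than the first must start on an RCB-aligned address and all but the last must end on one, so a request that stays inside a single RCB block must be answered with one completion. A tiny sketch of that accounting (my interpretation, names are mine):

```c
/* Sketch: upper bound on how many completion TLPs a completer may
 * legally return for a memory read, under my reading of the RCB rule.
 * Illustration only, not taken from the spec verbatim. */
#include <stdio.h>

static unsigned max_completions(unsigned long addr, unsigned len,
                                unsigned rcb)
{
    unsigned long first_block = addr / rcb;              /* RCB block of first byte */
    unsigned long last_block  = (addr + len - 1) / rcb;  /* RCB block of last byte  */
    return (unsigned)(last_block - first_block + 1);
}

int main(void)
{
    /* 8-byte read inside one 64-byte RCB block: one completion. */
    printf("%u\n", max_completions(0x1000, 8, 64)); /* -> 1 */

    /* 8-byte read crossing a 64-byte boundary: up to two. */
    printf("%u\n", max_completions(0x103C, 8, 64)); /* -> 2 */
    return 0;
}
```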
For comparison:
There's no time difference between requesting 1 byte and requesting 4 bytes, which makes sense, since the minimum payload a completion packet carries is one DW (4 bytes).
Do you have an idea what else could influence this behaviour?
Is there anything else I have to consider when creating that read request?
I tried two different endpoint devices and observed the same behaviour.