PCIe does not work


2,447 Views
volki
Contributor III

We are currently trying to set up PCIe communication between an Artix7 and the i.MX6Q.

The CPU is placed on a Q7 module from MSC that is connected to an eval board with a PCIe 1.0 switch. The FPGA is placed on the PCIe slot of the eval board.

Linux kernel is 3.0.35.Q7_IMX6-13.12.01.

Due to the BAR length limitation of the i.MX6, we decided to add the LogiCORE AXI CDMA to the FPGA and let it write to a preallocated memory region in the i.MX6 DDR.

We tested the communication and software with an x86 system (Ubuntu 12.04 with kernel 3.2.0-60) first and then tried to port it 1:1 to the i.MX6 system.

However, the communication does not behave the same way on the i.MX6. Writing to the FPGA configuration registers mapped through BAR0 works, but when the CDMA writes to the i.MX6 DDR, the data seems to get lost somewhere.

The FPGA initiates the transfer, as we can see in ChipScope, but the buffer on the i.MX6 side is unchanged. Additionally, the MSI capability seems to be set up differently on the i.MX6, and we need MSIs in our final setup.

PC log (lspci hex dump and config dump):

02:00.0 Memory controller: Xilinx Corporation Device 7042

00: ee 10 42 70 07 04 10 00 00 00 80 05 10 00 00 00

10: 00 00 cf df 00 00 00 00 00 00 00 00 00 00 00 00

20: 00 00 00 00 00 00 00 00 00 00 00 00 ee 10 07 00

30: 00 00 00 00 40 00 00 00 00 00 00 00 ff 00 00 00

40: 01 48 23 00 08 00 00 00 05 60 85 00 0c 30 e0 fe

50: 00 00 00 00 b9 41 00 00 00 00 00 00 00 00 00 00

60: 10 00 02 00 29 80 28 00 16 29 00 00 12 f4 03 00

70: 40 00 11 10 00 00 00 00 00 00 00 00 00 00 00 00

80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

90: 02 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00

a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

02:00.0 Memory controller: Xilinx Corporation Device 7042

  Subsystem: Xilinx Corporation Device 0007

  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+

  Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

  Latency: 0, Cache Line Size: 64 bytes

  Interrupt: pin ? routed to IRQ 47

  Region 0: Memory at dfcf0000 (32-bit, non-prefetchable) [size=64K]

  Capabilities: [40] Power Management version 3

  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

  Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

  Capabilities: [48] MSI: Enable+ Count=1/4 Maskable- 64bit+

  Address: 00000000fee0300c  Data: 41b9

  Capabilities: [60] Express (v2) Endpoint, MSI 00

  DevCap: MaxPayload 256 bytes, PhantFunc 1, Latency L0s <64ns, L1 <1us

  ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-

  DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported-

  RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+

  MaxPayload 128 bytes, MaxReadReq 512 bytes

  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-

  LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s, Latency L0 unlimited, L1 unlimited

  ClockPM- Surprise- LLActRep- BwNot-

  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+

  ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-

  LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

  DevCap2: Completion Timeout: Not Supported, TimeoutDis-

  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-

  LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB

  Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-

  Compliance De-emphasis: -6dB

  LnkSta2: Current De-emphasis Level: -3.5dB

  Capabilities: [100 v1] Device Serial Number 00-00-00-00-00-00-00-00

  Kernel driver in use: pciDriver

i.MX6 dump:

03:00.0 Memory controller: Xilinx Corporation Device 7042

00: ee 10 42 70 46 05 10 00 00 00 80 05 08 00 00 00

10: 00 00 10 01 00 00 00 00 00 00 00 00 00 00 00 00

20: 00 00 00 00 00 00 00 00 00 00 00 00 ee 10 07 00

30: 00 00 00 00 40 00 00 00 00 00 00 00 00 00 00 00

40: 01 48 23 00 08 00 00 00 05 60 85 00 00 00 00 00

50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

60: 10 00 02 00 29 80 64 00 10 28 00 00 12 f4 03 00

70: 00 00 11 10 00 00 00 00 00 00 00 00 00 00 00 00

80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

90: 02 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00

a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

03:00.0 Memory controller: Xilinx Corporation Device 7042

  Subsystem: Xilinx Corporation Device 0007

  Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+

  Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-

  Latency: 0, Cache Line Size: 32 bytes

  Interrupt: pin ? routed to IRQ 502

  Region 0: Memory at 01100000 (32-bit, non-prefetchable) [size=64K]

  Capabilities: [40] Power Management version 3

  Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)

  Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-

  Capabilities: [48] MSI: Enable+ Count=1/4 Maskable- 64bit+

  Address: 0000000000000000  Data: 0000

  Capabilities: [60] Express (v2) Endpoint, MSI 00

  DevCap: MaxPayload 256 bytes, PhantFunc 1, Latency L0s <64ns, L1 <1us

  ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-

  DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-

  RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+

  MaxPayload 128 bytes, MaxReadReq 512 bytes

  DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-

  LnkCap: Port #0, Speed 5GT/s, Width x1, ASPM L0s, Latency L0 unlimited, L1 unlimited

  ClockPM- Surprise- LLActRep- BwNot-

  LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-

  ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-

  LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-

  DevCap2: Completion Timeout: Not Supported, TimeoutDis-, LTR-, OBFF Not Supported

  DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled

  LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-

  Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-

  Compliance De-emphasis: -6dB

  LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete-, EqualizationPhase1-

  EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-

  Capabilities: [100 v1] Device Serial Number 00-00-00-00-00-00-00-00

  Kernel driver in use: pciDriver

  Kernel modules: pciDriver
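
For reference, the MSI difference between the two systems can be read straight from the hex dumps above. The following Python sketch decodes the MSI capability at offset 0x48 from rows 40 and 50 of each dump. The helper names (parse_dump, decode_msi) are made up for illustration, and the layout assumed is the standard 64-bit MSI capability without per-vector masking, matching the "Maskable- 64bit+" lines.

```python
import struct

# Rows 40 and 50 copied from the two hex dumps above.
PC_ROWS = """
40: 01 48 23 00 08 00 00 00 05 60 85 00 0c 30 e0 fe
50: 00 00 00 00 b9 41 00 00 00 00 00 00 00 00 00 00
"""
IMX6_ROWS = """
40: 01 48 23 00 08 00 00 00 05 60 85 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
"""


def parse_dump(text):
    """Place 'offset: byte byte ...' rows into a 256-byte config space image."""
    cfg = bytearray(256)
    for line in text.strip().splitlines():
        offset, _, data = line.partition(":")
        row = bytes.fromhex(data)
        base = int(offset, 16)
        cfg[base:base + len(row)] = row
    return cfg


def decode_msi(cfg, cap=0x48):
    """Decode an MSI capability (assumed layout: no per-vector masking)."""
    cap_id, _next_ptr, ctrl = struct.unpack_from("<BBH", cfg, cap)
    if cap_id != 0x05:
        raise ValueError("no MSI capability at offset 0x%02x" % cap)
    is64 = bool(ctrl & 0x80)
    address = struct.unpack_from("<I", cfg, cap + 4)[0]
    if is64:
        address |= struct.unpack_from("<I", cfg, cap + 8)[0] << 32
    data = struct.unpack_from("<H", cfg, cap + (12 if is64 else 8))[0]
    return {
        "enabled": bool(ctrl & 0x01),
        "vectors_capable": 1 << ((ctrl >> 1) & 0x7),
        "vectors_enabled": 1 << ((ctrl >> 4) & 0x7),
        "is64": is64,
        "address": address,
        "data": data,
    }


pc_msi = decode_msi(parse_dump(PC_ROWS))
imx6_msi = decode_msi(parse_dump(IMX6_ROWS))

for name, msi in (("PC", pc_msi), ("i.MX6", imx6_msi)):
    print("%-6s enabled=%s address=0x%016x data=0x%04x"
          % (name, msi["enabled"], msi["address"], msi["data"]))
```

On the PC the capability holds address 0xfee0300c and data 0x41b9, while on the i.MX6 the enable bit is set but address and data are both zero. That would be consistent with a message target never having been programmed on the i.MX6 side, though that is only an interpretation of the dump.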

Is there anything obvious we have to care about to get PCIe busmaster communication running?

Maybe something we have to check regarding the iATU settings because of the PCIe switch on our evaluation board?

0 Kudos
6 Replies

1,157 Views
YixingKong
Senior Contributor IV

Volker

We have not received a response from you yet and will close the discussion in 3 days. If you still need help, please feel free to reply with an update to this discussion.

Thanks,

Yixing

0 Kudos

1,157 Views
YixingKong
Senior Contributor IV

Volker

Has your issue been resolved? If yes, we are going to close the discussion in 3 days. If you still need help, please feel free to reply with an update to this discussion.

Thanks,

Yixing

0 Kudos

1,157 Views
b47504
NXP Employee

It is better to avoid having the DMA address remapped by the ATU. Do you have a TLP capture from a PCIe protocol analyzer to help narrow down the issue?

0 Kudos

1,157 Views
volki
Contributor III

Sorry for the late reply, I wasn't notified about responses to my post.

We finally got the test running. It turned out to be a problem with mapping the buffer between kernel space and user space (ARM seems to behave a bit differently from x86 here: dma_mmap_coherent vs remap_pfn_range).

Unfortunately, the data transfer bandwidth on the i.MX6 was not as good as we had expected. We generated a data stream within the FPGA using a clock counter. By subtracting the first transferred data word from the last, we calculated the data transfer speed, so the calculation should not depend on any CPU latencies. On the i.MX6 we got 270 MB/s, while we got ~400 MB/s on an x86 system with one PCIe 2.0 lane.

Is there any obvious reason (apart from the maximum payload size of 128 bytes on the i.MX6) why the transfer rate is that low? As the maximum payload size on the FPGA is currently limited to 256 bytes anyway, it shouldn't have a very big effect in comparison to the x86 system.
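
The counter-based measurement described above can be sketched as follows. The FPGA clock frequency, the counter width, and the function name are assumptions for illustration (none of them are stated in this thread); the only idea taken from the post is that the elapsed time comes from the difference between the counter values in the first and last transferred words.

```python
def stream_rate_mb_s(first_word, last_word, n_bytes, clk_hz, counter_bits=32):
    """Transfer rate in MB/s from free-running counter values in the data.

    first_word / last_word: counter values carried by the first and last
    transferred data words. The modulo handles a single counter wraparound.
    clk_hz and counter_bits are assumed values, not taken from the thread.
    """
    cycles = (last_word - first_word) % (1 << counter_bits)
    if cycles == 0:
        raise ValueError("no elapsed cycles between first and last word")
    seconds = cycles / clk_hz
    return n_bytes / seconds / 1e6


# Hypothetical numbers: 270,000,000 bytes moved while a 125 MHz counter
# advanced by 125,000,000 ticks (one second).
rate = stream_rate_mb_s(1000, 1000 + 125_000_000, 270_000_000, 125e6)
print("%.1f MB/s" % rate)
```

Because both timestamps are generated inside the FPGA, the result does not include driver or interrupt latency on the host.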
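
As a rough check of how much the payload size alone could matter, here is a small Python estimate of the upper bound for memory write TLPs. It is only a sketch under stated assumptions: a 5 GT/s x1 link with 8b/10b encoding (note that the lspci dumps earlier in this thread report 2.5GT/s), a 3DW header (32-bit addressing), no ECRC, and no allowance for ACK or flow-control DLLPs. The function name is made up for illustration.

```python
def tlp_upper_bound_mb_s(payload_bytes, gt_per_s=5.0, lanes=1, header_bytes=12):
    """Upper bound on memory write throughput in MB/s for a given payload size.

    Per-TLP overhead assumed: 1 byte start framing, 2 bytes sequence number,
    the TLP header, 4 bytes LCRC, 1 byte end framing. DLLP traffic is ignored.
    """
    # 8b/10b encoding: 10 bits on the wire carry one byte of data.
    raw_bytes_per_s = gt_per_s * 1e9 * lanes * 8 / 10 / 8
    overhead = 1 + 2 + header_bytes + 4 + 1
    efficiency = payload_bytes / (payload_bytes + overhead)
    return raw_bytes_per_s * efficiency / 1e6


bound_128 = tlp_upper_bound_mb_s(128)
bound_256 = tlp_upper_bound_mb_s(256)
print("128-byte payload: %.0f MB/s" % bound_128)
print("256-byte payload: %.0f MB/s" % bound_256)
print("ratio: %.3f" % (bound_128 / bound_256))
```

Under these assumptions the 128-byte bound is only about 7% below the 256-byte bound, which supports the expectation that payload size by itself does not account for a gap as large as 270 MB/s versus ~400 MB/s.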

0 Kudos

1,157 Views
snowman
Contributor II

We are also looking to connect the i.MX6 to an Artix7 through PCIe. In the posts above it looks like you got this to work with ASPM disabled:

     "LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+"


Is this the case? Have you been able to get PCIe working between the i.MX6 and the Artix7 with ASPM enabled? We would like to use the ASPM power management modes, but I have not seen conclusive evidence that PCIe ASPM works reliably on the i.MX6 operating as an RC.


Thanks,

Tom

0 Kudos

1,157 Views
volki
Contributor III

As we currently don't care about power consumption, we haven't looked at this yet. Our biggest concern is data transfer bandwidth.

0 Kudos