i.MX8M PCIe link lost after a few minutes

fredericduchass
Contributor III

Hello All,

I'm working with a Variscite i.MX8M product running Yocto: https://www.variscite.com/product/system-on-module-som/cortex-a53-krait/dart-mx8m-nxp-imx-8m/

I am trying to connect this SOM to an FPGA using the Xillybus PCIe IP core: http://xillybus.com/

The goal is to capture a video stream with GStreamer. I am building a proof of concept with two evaluation boards: the Xilinx AC701 and the Variscite evaluation board.

1) With the Variscite Morty BSP, everything works perfectly. The PCIe link is established and data can be sent and received. I never lose the connection. This setup validates my FPGA IP, my hardware, and my GStreamer pipeline. The Morty BSP is based on kernel 4.9.51.

2) With the Variscite Sumo BSP, everything also works perfectly, but only for a few minutes. After that, the PCIe link appears to be lost. The Sumo BSP is based on kernel 4.14.78.

Attached are the boot logs for the two BSPs.

In brief: same hardware, same user application. In one case (Morty BSP) everything works very well. In the other case (Sumo BSP) the PCIe link goes down after a few minutes.

This is the dmesg output from boot up to the failure:

root@imx8m-var-dart:/opt/7880UHD/bin# dmesg | grep -i pci

[ 0.000000] Kernel command line: console=ttymxc0,115200 earlycon=ec_imx6q,0x30860000,115200 root=/dev/mmcblk1p1 rootwait rw video=HDMI-A-1:1920x1080-32@60 pcie_aspm=off

[ 0.000000] PCIe ASPM is disabled

[ 0.000000] PCI I/O : 0xffff7dfffee00000 - 0xffff7dffffe00000 ( 16 MB)

[ 0.526763] PCI: CLS 0 bytes, default 128

[ 2.179918] imx6q-pcie 33800000.pcie: 33800000.pcie supply epdev_on not found, using dummy regulator

[ 2.188095] OF: PCI: host bridge /pcie@0x33800000 ranges:

[ 2.192252] OF: PCI: No bus range found for /pcie@0x33800000, using [bus 00-ff]

[ 2.198494] OF: PCI: IO 0x1ff80000..0x1ff8ffff -> 0x00000000

[ 2.203118] OF: PCI: MEM 0x18000000..0x1fefffff -> 0x18000000

[ 2.208595] imx6q-pcie 33800000.pcie: pcie phy pll is locked.

[ 2.288520] imx6q-pcie 33800000.pcie: Speed change timeout

[ 2.292706] imx6q-pcie 33800000.pcie: Roll back to GEN1 link!

[ 2.297154] imx6q-pcie 33800000.pcie: Link up, Gen1

[ 2.302690] imx6q-pcie 33800000.pcie: PCI host bridge to bus 0000:00

[ 2.307755] pci_bus 0000:00: root bus resource [bus 00-ff]

[ 2.311996] pci_bus 0000:00: root bus resource [io 0x0000-0xffff]

[ 2.316878] pci_bus 0000:00: root bus resource [mem 0x18000000-0x1fefffff]

[ 2.322454] pci_bus 0000:00: scanning bus

[ 2.322474] pci 0000:00:00.0: [16c3:abcd] type 01 class 0x060400

[ 2.322493] pci 0000:00:00.0: reg 0x10: [mem 0x00000000-0x000fffff 64bit]

[ 2.322500] pci 0000:00:00.0: reg 0x38: [mem 0x00000000-0x0000ffff pref]

[ 2.322531] pci 0000:00:00.0: supports D1

[ 2.322535] pci 0000:00:00.0: PME# supported from D0 D1 D3hot D3cold

[ 2.322540] pci 0000:00:00.0: PME# disabled

[ 2.322637] pci_bus 0000:00: fixups for bus

[ 2.322643] pci 0000:00:00.0: scanning [bus 01-ff] behind bridge, pass 0

[ 2.322687] pci_bus 0000:01: scanning bus

[ 2.322752] pci 0000:01:00.0: [10ee:ebeb] type 00 class 0xff0000

[ 2.322859] pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x0000007f 64bit]

[ 2.323304] pci_bus 0000:01: fixups for bus

[ 2.323308] pci_bus 0000:01: bus scan returning with max=01

[ 2.323314] pci 0000:00:00.0: scanning [bus 01-ff] behind bridge, pass 1

[ 2.323319] pci_bus 0000:00: bus scan returning with max=ff

[ 2.323341] pci 0000:00:00.0: BAR 0: assigned [mem 0x18000000-0x180fffff 64bit]

[ 2.329358] pci 0000:00:00.0: BAR 8: assigned [mem 0x18100000-0x181fffff]

[ 2.334849] pci 0000:00:00.0: BAR 6: assigned [mem 0x18200000-0x1820ffff pref]

[ 2.340777] pci 0000:01:00.0: BAR 0: assigned [mem 0x18100000-0x1810007f 64bit]

[ 2.346827] pci 0000:00:00.0: PCI bridge to [bus 01-ff]

[ 2.350756] pci 0000:00:00.0: bridge window [mem 0x18100000-0x181fffff]

[ 2.356459] pcieport 0000:00:00.0: assign IRQ: got 263

[ 2.356558] pcieport 0000:00:00.0: Signaling PME with IRQ 231

[ 4.283788] xillybus_pcie 0000:01:00.0: assign IRQ: got 0

[ 4.283823] xillybus_pcie 0000:01:00.0: enabling device (0000 -> 0002)

[ 4.289521] xillybus_pcie 0000:01:00.0: enabling bus mastering

[ 4.332821] xillybus_pcie 0000:01:00.0: Created 6 device files.

[ 1548.087705] xillybus_pcie 0000:01:00.0: Hardware failed to respond to close command, therefore left in messy state.

As you can see, nothing in the log really explains the failure.
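To compare a log like this one between the two BSPs, it can help to reduce it to link-related events with their timestamps. The following is a minimal Python sketch, not part of the original post. The keyword list and the function name `pcie_events` are assumptions chosen for illustration.

```python
import re

# Matches a dmesg line such as
# "[ 2.288520] imx6q-pcie 33800000.pcie: Speed change timeout".
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")

# Illustrative keywords only (an assumption); extend the tuple as needed.
KEYWORDS = ("link", "timeout", "failed", "error")


def pcie_events(dmesg_text, keywords=KEYWORDS):
    """Return (timestamp_seconds, message) pairs whose message contains a keyword."""
    events = []
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # blank lines, shell prompts, anything without a timestamp
        timestamp = float(match.group(1))
        message = match.group(2)
        if any(keyword in message.lower() for keyword in keywords):
            events.append((timestamp, message))
    return events
```

Applied to the log above, this keeps the "Speed change timeout" and "Roll back to GEN1 link!" messages near 2.29 s and the Xillybus "Hardware failed to respond" message at 1548 s, which makes it easy to see that nothing PCIe-related is logged in between.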

I have tried several things:

- Changing PCIe kernel configuration options:

   CONFIG_PCIEASPM_PERFORMANCE=y

   CONFIG_PCI_DEBUG=y (this adds no log output at the time of the failure)

- Disabling ASPM with the following command:

   fw_setenv kernelargs pcie_aspm=off

None of these attempts helped.

What can I do to get more logging when the failure occurs?

Is there anything else I can try?
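One low-cost way to get more information at the moment of failure, offered here as a suggestion rather than something tried in this thread, is to watch the counters in /proc/interrupts. If the count for the Xillybus interrupt stops increasing while the stream should still be running, interrupts are being lost or are no longer generated. A minimal Python sketch follows. The function names are hypothetical, and the assumption that the interrupt line contains the string `xillybus` should be checked against the actual /proc/interrupts on the target.

```python
import time


def irq_totals(interrupts_text, name):
    """Map IRQ label -> total count over all CPUs for lines mentioning `name`."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        if name not in line:
            continue
        label, _, rest = line.partition(":")
        # The first n_cpus fields after the label are the per-CPU counters.
        fields = rest.split()[:n_cpus]
        totals[label.strip()] = sum(int(f) for f in fields if f.isdigit())
    return totals


def watch_irq(name="xillybus", interval=5.0, path="/proc/interrupts"):
    """Print a timestamped line whenever a matching counter stops increasing."""
    previous = {}
    while True:
        with open(path) as f:
            current = irq_totals(f.read(), name)
        for label, total in current.items():
            if previous.get(label) == total:
                print("{} IRQ {} ({}) stalled at {}".format(
                    time.strftime("%H:%M:%S"), label, name, total))
        previous = current
        time.sleep(interval)
```

Calling `watch_irq()` in a second terminal while the GStreamer pipeline runs gives a wall-clock time for the stall that can be compared with the dmesg timestamp. A counter that stops is only meaningful while data is flowing; an idle device will also show a constant count.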

BR

Frédéric

1 Solution
fredericduchass
Contributor III

The problem was solved by applying the following patch:

[3/3] PCI: designware: Move interrupt acking into the proper callback - Patchwork

0001-PCI-dwc-Move-interrupt-acking-into-the-proper-callba.patch

It may be useful to others.

BR

Frédéric

14 Replies
igorpadykov
NXP Employee

Hi Frederic

You can find a description of that error in the Xillybus documentation:

xillybus: Failed to close file. Hardware left in messy state. 

Best regards
igor

fredericduchass
Contributor III

Hi Igor,

I have already read this on the Xillybus website: it means that the link between the FPGA and the Linux driver is broken.

My FPGA boots from flash and is never power-cycled, so there is no reconfiguration during my test.

So the PCIe link seems to be broken, as I said, but only with the newer BSP version.

BR

Frédéric

igorpadykov
NXP Employee

Hi Frédéric

Based on the log, the error occurs on the Xillybus side.

You could try, for example, increasing the size of the FPGA DMA buffers.

You could also post the issue on the Xillybus forum:

The Xillybus Forum • Index page 

Best regards
igor

fredericduchass
Contributor III

Hi Igor,

I am already in contact with both Xillybus support and Variscite support.

It appears that the problem is a break in the PCIe link.

They advised me to post in this community because they do not have a solution at this time.

This problem does not occur with the older kernel 4.9.51.

Is there any way to add logging on the PCIe link?
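On the question of logging the link itself: the negotiated link state is held in the PCIe capability registers, which `lspci -vv` prints as a `LnkSta:` line. As a hedged sketch (not something proposed by anyone in this thread), the Python below polls that line and prints a timestamped message whenever it changes. The device address `00:00.0` is the root port seen in the dmesg log, the function names are hypothetical, `lspci` from pciutils is assumed to be installed, and root privileges are normally needed to read the capability registers.

```python
import re
import subprocess
import time

# Typical line: "LnkSta: Speed 2.5GT/s, Width x1, ..."
# Newer pciutils versions add annotations such as "(ok)" after each value.
LNKSTA_RE = re.compile(r"LnkSta:\s*Speed\s+([\d.]+GT/s)[^,]*,\s*Width\s+(x\d+)")


def parse_lnksta(lspci_output):
    """Return (speed, width) from `lspci -vv` text, or None if no LnkSta line is found."""
    match = LNKSTA_RE.search(lspci_output)
    return (match.group(1), match.group(2)) if match else None


def watch_link(device="00:00.0", interval=1.0):
    """Poll the link status of `device` and report every change with a wall-clock time."""
    last = object()  # sentinel so that the first reading is always printed
    while True:
        result = subprocess.run(["lspci", "-vv", "-s", device],
                                stdout=subprocess.PIPE, universal_newlines=True)
        state = parse_lnksta(result.stdout)
        if state != last:
            print("{} {} link state: {}".format(
                time.strftime("%H:%M:%S"), device, state))
            last = state
        time.sleep(interval)
```

Running `watch_link("00:00.0")` for the root port and `watch_link("01:00.0")` for the FPGA endpoint in parallel may show whether the endpoint stops responding (reported as `None` when no `LnkSta:` line can be read) at the same moment the Xillybus driver reports its error.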

BR

Frédéric

igorpadykov
NXP Employee

Hi Frédéric

Do you have issues with other PCIe cards (not Xillybus)?

If not, the problem is on the Xillybus side.

My assumption is that the link may break because the Xillybus buffers overflow, so as a test you could try lowering the frame rate of the video stream.

Best regards
igor

fredericduchass
Contributor III

After testing, I get the same error even at a low bitrate.

igorpadykov
NXP Employee

Are you sure that the Xillybus driver is fully functional and tested on that version of the Linux kernel?

fredericduchass
Contributor III

Here is the answer from Xillybus support on that point:

Xillybus is used by hundreds of users worldwide, in random configurations and different kernels. The driver hasn't changed since kernel 4.6 or so, and is in essence the same since 2011. Almost zero complaints in that direction, none has been proven justified. So the answer is: Yes, the driver has most likely been tested with that kernel, and versions before and after. If the PCIe interface works properly, so does Xillybus.

igorpadykov
NXP Employee

You could try the patches dated Aug 23, 2018 12:14 AM from the link below:

i.MX8M EVK MIPI CSI Camera Frame Rate 

Best regards
igor

fredericduchass
Contributor III

Hello Igor,

Any other ideas? I have not tried the patches: as I explained, I use neither V4L2 nor a camera.

I am completely sure that there is a problem with kernel 4.14.78 that is not present in 4.9.51. I don't really know what it is, but something is causing trouble (interrupts, perhaps?).

Br

Frédéric

igorpadykov
NXP Employee

Hi Frédéric

You could run that Linux version on a PC and check Xillybus with it. In general,

I would suggest seeking extended support from NXP Professional Services | NXP

Best regards
igor

fredericduchass
Contributor III

Hi Igor,

I don't use an external camera, and I don't use a V4L2 source.

fredericduchass
Contributor III

Hi Igor,

I don't have any other PCIe device.

I understand your point about the bitrate, but why don't I have the same problem with the old kernel?

BR

Frédéric
