LS1023A_PCIE Problem

1028605553
Contributor I

Hi

I have a problem with the PCIe controller of the LS1023A.

We connected the PCIe interface of an FPGA to the LS1023A (x1, Gen1).

After the PCIe bus has been running for a period of time, the AER driver reports "Completion Timeout" for any PCI memory read access to the endpoint device:

[ 1200.937459] pcieport 0001:00:00.0: AER: Uncorrected (Non-Fatal) error received: id=0000
[ 1200.945559] pcieport 0001:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
[ 1200.957413] pcieport 0001:00:00.0: device [1957:808a] error status/mask=00004000/00400000
[ 1200.965759] pcieport 0001:00:00.0: [14] Completion Timeout (First)
[ 1200.972619] pcieport 0001:00:00.0: AER: Device recovery failed
[ 1200.981337] pcieport 0001:00:00.0: AER: Uncorrected (Non-Fatal) error received: id=0000
[ 1200.989444] pcieport 0001:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
[ 1201.001271] pcieport 0001:00:00.0: device [1957:808a] error status/mask=00004000/00400000
[ 1201.009616] pcieport 0001:00:00.0: [14] Completion Timeout (First)
[ 1201.020172] pcieport 0001:00:00.0: AER: Device recovery failed
[ 1201.053647] pcieport 0001:00:00.0: AER: Uncorrected (Non-Fatal) error received: id=0000
[ 1201.061737] pcieport 0001:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
[ 1201.073592] pcieport 0001:00:00.0: device [1957:808a] error status/mask=00004000/00400000
[ 1201.081937] pcieport 0001:00:00.0: [14] Completion Timeout (First)
[ 1201.088819] pcieport 0001:00:00.0: AER: Device recovery failed
[ 1201.097411] pcieport 0001:00:00.0: AER: Uncorrected (Non-Fatal) error received: id=0000
[ 1201.105471] pcieport 0001:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
[ 1201.117328] pcieport 0001:00:00.0: device [1957:808a] error status/mask=00004000/00400000
[ 1201.125681] pcieport 0001:00:00.0: [14] Completion Timeout (First)
[ 1201.132564] pcieport 0001:00:00.0: AER: Device recovery failed
[ 1201.140808] pcieport 0001:00:00.0: AER: Uncorrected (Non-Fatal) error received: id=0000
[ 1201.170951] drv_fpga_read signal_pending error!
[ 1201.256622] pcieport 0001:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
[ 1201.268363] pcieport 0001:00:00.0: device [1957:808a] error status/mask=00004000/00400000
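
The `error status/mask=00004000/00400000` field in the log above can be decoded by hand. The following is a minimal sketch, assuming the two values are the AER Uncorrectable Error Status and Mask registers with the bit layout from the PCI Express specification; the function and table names are made up for this illustration.

```python
# Bit names for the AER Uncorrectable Error Status/Mask registers
# (bit positions as defined by the PCI Express specification).
UNCORRECTABLE_BITS = {
    4: "Data Link Protocol Error",
    5: "Surprise Down Error",
    12: "Poisoned TLP",
    13: "Flow Control Protocol Error",
    14: "Completion Timeout",
    15: "Completer Abort",
    16: "Unexpected Completion",
    17: "Receiver Overflow",
    18: "Malformed TLP",
    19: "ECRC Error",
    20: "Unsupported Request",
    21: "ACS Violation",
    22: "Uncorrectable Internal Error",
}


def decode_bits(value, names=UNCORRECTABLE_BITS):
    """Return the names of all set bits in a 32-bit register value."""
    return [
        names.get(bit, f"bit {bit} (not in table)")
        for bit in range(32)
        if value & (1 << bit)
    ]


def decode_status_mask(field):
    """Decode a 'status/mask' string such as '00004000/00400000'."""
    status_hex, mask_hex = field.split("/")
    return decode_bits(int(status_hex, 16)), decode_bits(int(mask_hex, 16))


# Value copied from the log lines above.
reported, masked = decode_status_mask("00004000/00400000")
print("reported:", reported)
print("masked:  ", masked)
```

For the values in this log, the only reported bit is bit 14 (Completion Timeout), which matches the `[14] Completion Timeout (First)` line, and the only masked bit is bit 22.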

thanks

yipingwang
NXP TechSupport

Hello zz zw,

Please refer to erratum A-008822.

Description: By default, when the PCI Express controller experiences an erroneous completion from an external completer for its outbound non-posted request, it always sends an OKAY response to the device’s internal AXI slave system interface. This is desirable for outbound configure transactions to prevent an unnecessary error response from propagating through higher-level system hierarchy, because erroneous completion is a commonly expected behavior during PCI Express bus scan.
However, such default system error response behavior cannot be used for other types of outbound non-posted requests. For example, the outbound memory read transaction requires an actual ERROR response when experiencing erroneous completion from an external completer, like UR completion or completion timeout.

Impact: The device's higher level system hierarchy cannot detect the error condition when the PCI Express controller experiences an erroneous completion from the external completer for its outbound non-posted request. This is not the case for configure transactions.

Workaround: Write to the PCI Express controller's configure space offset 8D0h with 0000_9401h during the pre-boot initialization (PBI) process.

Fix plan: No plans to fix
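
One way to check whether the workaround value is in place is to read the register back from a dump of the controller's configuration space. The sketch below only works on a byte buffer. How the dump is obtained is an assumption: on Linux it might come from the root port's sysfs `config` file (for example for device `0001:00:00.0` from the log, read as root), provided that file exposes offset 8D0h on this controller. The constant and function names are hypothetical.

```python
import struct

ERR_RESPONSE_OFFSET = 0x8D0        # offset quoted in the workaround text
ERR_RESPONSE_VALUE = 0x00009401    # value quoted in the workaround text


def read_reg32(config_space, offset):
    """Read a little-endian 32-bit register from a configuration-space dump."""
    if offset + 4 > len(config_space):
        raise ValueError(f"dump too short to contain offset {offset:#x}")
    return struct.unpack_from("<I", config_space, offset)[0]


def workaround_applied(config_space):
    """True if the register at 8D0h holds the value from the workaround."""
    return read_reg32(config_space, ERR_RESPONSE_OFFSET) == ERR_RESPONSE_VALUE


# Demonstration on a synthetic 4 KiB dump (not real hardware data).
dump = bytearray(4096)
print(workaround_applied(dump))    # register still zero
struct.pack_into("<I", dump, ERR_RESPONSE_OFFSET, ERR_RESPONSE_VALUE)
print(workaround_applied(dump))    # register now holds 0000_9401h
```

A readback like this only shows the current register contents. It does not tell whether the value was written by the PBI process or later by a driver.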

Please refer to the patch "[2/2] pci/layerscape: change the default error response behavior" on Patchwork.


Have a great day,
TIC

1028605553
Contributor I
The cause was that I forgot to connect two PCIe clock lines in my schematic.

luckyluo
Contributor II

Can this patch fix the issue?

Patch "[2/2] pci/layerscape: change the default error response behavior" on Patchwork.

My LS1088A card encounters a similar issue after it has been running for a while:

[ 2551.253634] pcieport 0000:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0000(Transmitter ID)
[ 2551.253638] pcieport 0000:00:00.0:   device [1957:80c0] error status/mask=00001081/00006000
[ 2551.253640] pcieport 0000:00:00.0:    [ 0] Receiver Error
[ 2551.253643] pcieport 0000:00:00.0:    [ 7] Bad DLLP
[ 2551.253646] pcieport 0000:00:00.0:    [12] Replay Timer Timeout
[ 2551.253650] pcieport 0000:00:00.0: AER: Multiple Corrected error received: id=0000
[ 2551.253659] pcieport 0000:00:00.0: can't find device of ID0000
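
When comparing logs like the two in this thread, it can help to tally the `PCIe Bus Error` lines by device, severity, and layer. The following is a rough sketch that assumes the line format shown in the logs above; the regular expression and function name are illustrative and may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Matches lines of the form seen in this thread, e.g.
# "pcieport 0000:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=..."
BUS_ERROR = re.compile(
    r"pcieport (?P<dev>[0-9a-f:.]+): PCIe Bus Error: "
    r"severity=(?P<severity>[^,]+), type=(?P<layer>[^,]+), "
)


def summarize(log_text):
    """Count 'PCIe Bus Error' lines per (device, severity, layer)."""
    counts = Counter()
    for match in BUS_ERROR.finditer(log_text):
        counts[(match["dev"], match["severity"], match["layer"])] += 1
    return counts


# One line from each log in this thread.
sample = """\
[ 1200.945559] pcieport 0001:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, id=0000(Requester ID)
[ 2551.253634] pcieport 0000:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, id=0000(Transmitter ID)
"""

for (dev, severity, layer), n in summarize(sample).items():
    print(f"{dev}: {n} x {severity} / {layer}")
```

On these two sample lines the tally shows an uncorrected Transaction Layer error for the LS1023A log and a corrected Physical Layer error for the LS1088A log, so the two reports are not the same kind of error.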

So I am interested in further test results on the LS1023A after Linux is patched.

I found that my LSDK already includes this patch but still shows the same issue, so I wonder whether this patch can fix this kind of issue at all.

Thanks
