PCIe link goes down randomly


1,611 Views
ravikumar1631
Contributor II

On multiple T2080 and P3041 boards, a PCIe link-down error is triggered at random times, and the PCIe error handler prints the traces below. As far as I know, the link goes down after the PCIe RC detects a hot reset event, and in RC mode a hot reset can be triggered by writing to the Bridge Control register in config space. Nowhere does our code trigger a hot reset. Initially I suspected a hardware issue, but I have observed this behaviour on multiple boards, and it would be strange if all of them were faulty. What other causes can lead to a PCIe hot reset or link-down event?

[ 4526.366107] (c0 -96 ksoftirqd <9>) PCIe error(s) detected
[ 4526.372035] (c0 -96 ksoftirqd <9>) PCIe ERR_DR register: 0x00000020
[ 4526.378830] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_STAT register: 0x00000107
[ 4526.386147] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_R0 register: 0x00000000
[ 4526.393289] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_R1 register: 0x00000001
[ 4526.400432] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_R2 register: 0x08443c01
[ 4526.407574] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_R3 register: 0xf000002d
[ 4526.414852] (c0 -31 reboot <1>) writing fields 'un-comm-peripherals'  

0 Kudos
1 Reply

1,592 Views
yipingwang
NXP TechSupport

The PCI Express error capture registers, PEX_ERR_CAP_R0 through PEX_ERR_CAP_R3, allow vital error information to be captured when an error occurs. Different error information is reported depending on whether the error comes from an outbound transaction from an internal source or from an inbound transaction from an external source; the source of the captured error is reflected in PEX_ERR_CAP_STAT[GSID]. Note that after the initial error is captured, no further capturing is performed until the PEX_ERR_CAP_STAT[ECV] bit is cleared.

Please refer to sections "20.6.1.12.4.1 Error capture registers (outbound error)" and "20.6.1.12.4.2 Error capture registers (inbound error)" in the T2080 Reference Manual for details.
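
To compare the dump from the question against those manual sections, it can help to pull the register values out of the kernel log first. The following is only a minimal sketch, not part of any NXP tooling: the helper names `parse_pcie_error_dump` and `capture_is_valid` are hypothetical, and the ECV mask is an assumption (least-significant bit of ERR_CAP_STAT) that should be checked against the reference manual. GSID and the other fields are deliberately not decoded here, since their bit positions must come from the manual.

```python
import re

# Matches lines such as:
# [ 4526.372035] (c0 -96 ksoftirqd <9>) PCIe ERR_DR register: 0x00000020
REG_LINE = re.compile(r"PCIe (ERR_\w+) register: 0x([0-9a-fA-F]+)")

# ASSUMPTION: ECV is taken to be the least-significant bit of ERR_CAP_STAT.
# Verify the bit position in the reference manual before relying on it.
ASSUMED_ECV_MASK = 0x1


def parse_pcie_error_dump(log_text):
    """Return {register_name: value} for every 'PCIe ERR_* register' line."""
    regs = {}
    for match in REG_LINE.finditer(log_text):
        regs[match.group(1)] = int(match.group(2), 16)
    return regs


def capture_is_valid(regs, ecv_mask=ASSUMED_ECV_MASK):
    """True if the (assumed) ECV bit is set in ERR_CAP_STAT."""
    return bool(regs.get("ERR_CAP_STAT", 0) & ecv_mask)


# Log lines copied from the question above.
SAMPLE_LOG = """\
[ 4526.366107] (c0 -96 ksoftirqd <9>) PCIe error(s) detected
[ 4526.372035] (c0 -96 ksoftirqd <9>) PCIe ERR_DR register: 0x00000020
[ 4526.378830] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_STAT register: 0x00000107
[ 4526.386147] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_R0 register: 0x00000000
[ 4526.393289] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_R1 register: 0x00000001
[ 4526.400432] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_R2 register: 0x08443c01
[ 4526.407574] (c0 -96 ksoftirqd <9>) PCIe ERR_CAP_R3 register: 0xf000002d
"""

regs = parse_pcie_error_dump(SAMPLE_LOG)
for name, value in regs.items():
    print(f"{name}: 0x{value:08x}")
print("capture valid (assumed ECV mask):", capture_is_valid(regs))
```

Because capturing stops until ECV is cleared, the R0 to R3 values in a dump describe only the first error since the last clear; later errors are not reflected in them.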

0 Kudos