PCIE AER IRQ 482 failure


jerryalexander
Contributor I

The dmesg output is below: AER IRQ 482 fails at boot. Is there a fix? Does AER work on the P4080?

Jerry

[    2.136605] irq 482: nobody cared (try booting with the "irqpoll" option)   
[    2.136611] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.13.0-24-powerpc-e5006
[    2.136614] Call Trace:                                                     
[    2.136624] [eed0bcd0] [c000806c] show_stack+0xfc/0x1c0 (unreliable)        
[    2.136632] [eed0bd20] [c08046d8] dump_stack+0x78/0xa0                      
[    2.136639] [eed0bd30] [c00ac960] __report_bad_irq+0x40/0x100               
[    2.136644] [eed0bd50] [c00acf68] note_interrupt+0x238/0x290                
[    2.136649] [eed0bd80] [c00aa104] handle_irq_event_percpu+0x154/0x270       
[    2.136654] [eed0bdd0] [c00aa26c] handle_irq_event+0x4c/0x80                
[    2.136659] [eed0bde0] [c00ad70c] handle_level_irq+0xcc/0x160               
[    2.136664] [eed0bdf0] [c00a9488] generic_handle_irq+0x48/0x70              
[    2.136671] [eed0be00] [c0024d5c] fsl_error_int_handler+0xec/0x100          
[    2.136675] [eed0be20] [c00aa028] handle_irq_event_percpu+0x78/0x270        
[    2.136680] [eed0be70] [c00aa26c] handle_irq_event+0x4c/0x80                
[    2.136685] [eed0be80] [c00ae11c] handle_fasteoi_irq+0xdc/0x1a0             
[    2.136690] [eed0be90] [c00052fc] __do_irq+0x5c/0x150                       
[    2.136695] [eed0bea0] [c00054e0] do_IRQ+0xf0/0x110                         
[    2.136701] [eed0bec0] [c0010bbc] ret_from_except+0x0/0x18                  
[    2.136710] --- Exception: 501 at __do_softirq+0xc4/0x2b0                   
[    2.136710]     LR = __do_softirq+0x24/0x2b0                                
[    2.136716] [eed0bfe0] [c0055d44] irq_exit+0xb4/0xf0                        
[    2.136720] [eed0bff0] [c000e890] call_do_irq+0x24/0x3c                     
[    2.136725] [c0b7fe90] [c0005488] do_IRQ+0x98/0x110                         
[    2.136730] [c0b7feb0] [c0010bbc] ret_from_except+0x0/0x18                  
[    2.136736] --- Exception: 501 at arch_cpu_idle+0x30/0x80                   
[    2.136736]     LR = arch_cpu_idle+0x30/0x80                                
[    2.136743] [c0b7ff70] [c00b5b38] rcu_idle_enter+0xb8/0x100 (unreliable)    
[    2.136748] [c0b7ff80] [c00a9300] cpu_startup_entry+0x160/0x260             
[    2.136756] [c0b7ffc0] [c0a927ec] start_kernel+0x33c/0x350                  
[    2.136761] [c0b7fff0] [c00003fc] skpinv+0x2e8/0x324                        
[    2.136763] handlers:                                                       
[    2.136769] [<c03ea8a0>] aer_irq                                            
[    2.136773] [<c03eba20>] pcie_pme_irq                                       
[    2.136775] Disabling IRQ #482               

yipingwang
NXP TechSupport

On some legacy platforms with a legacy PCI controller (e.g. some non-DPAA platforms), the hardware does not support the Fatal error type for AER; it supports only Non-Fatal errors.

Generally, DPAA platforms with the new PCIe controller support both Fatal and Non-Fatal errors.

jerryalexander
Contributor I

Yiping:

We are running "a rev3 P4080 SoC with rev3 e500mc cores".

It is definitely DPAA and definitely NOT legacy.

Jerry

yipingwang
NXP TechSupport

Are you using SDK 1.5?

Please check the kernel configuration:

Bus options --->

[*] PCI Express support

[*] Root Port Advanced Error Reporting support

<*> PCIe AER error injector support
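To check these three options on a running board rather than in menuconfig, something like the following could be used. It is only a sketch: it assumes the menu entries above correspond to the symbols CONFIG_PCIEPORTBUS, CONFIG_PCIEAER and CONFIG_PCIEAER_INJECT (as they do in mainline kernels of this era), and the function name check_aer_config is made up for illustration.

```shell
# check_aer_config FILE
# FILE is a plain-text kernel config (for example, unpacked /proc/config.gz).
# Prints one status line per symbol and returns non-zero if any is missing.
# Assumption: the symbol names below match the menu entries listed above.
check_aer_config() {
    cfg_file=$1
    missing=0
    for sym in CONFIG_PCIEPORTBUS CONFIG_PCIEAER CONFIG_PCIEAER_INJECT; do
        # Accept built-in (=y) or module (=m); "# ... is not set" does not match.
        if grep -q "^${sym}=[ym]\$" "$cfg_file"; then
            echo "$sym: enabled"
        else
            echo "$sym: MISSING"
            missing=1
        fi
    done
    return "$missing"
}
```

On the board this might be run as: zcat /proc/config.gz > /tmp/config; check_aer_config /tmp/config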

Then follow these test steps.

2.1 At the U-Boot prompt, add pcie_ports=native to bootargs:

=> setenv othbootargs pcie_ports=native

2.2 Reboot the board with the kernel, then check the kernel configuration and command line:

# zcat /proc/config.gz|grep -i CONFIG_PCIEAER_INJECT

# cat /proc/cmdline
root=/dev/ram rw console=ttyS0,115200 pcie_ports=native

2.3 Check whether the inject device node was created:

# ls /dev/aer_inject

The test device node /dev/aer_inject should exist.
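The checks in 2.2 and 2.3 could be scripted roughly as follows. This is a minimal sketch; has_native_ports is a hypothetical helper name, and it only looks for the exact token pcie_ports=native in the string it is given.

```shell
# has_native_ports CMDLINE
# Succeeds if the kernel command line string contains pcie_ports=native
# as a separate, space-delimited token.
has_native_ports() {
    case " $1 " in
        *" pcie_ports=native "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On the board it might be combined with the device node check like this: has_native_ports "$(cat /proc/cmdline)" && [ -e /dev/aer_inject ] && echo "ready for injection"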

3. Download the aer-inject test program from "http://www.kernel.org/pub/linux/utils/pci/aer-inject/".

Cross-compile it on the PC (server):

$ tar -xf aer-inject-0.1.tar.gz

$ cd aer-inject-0.1

$ source /opt/fsl/1.1/environment-setup-ppce500mc-fsl-linux

$ make

A binary file named "aer-inject" is created in the current folder.

4. On the board:

Get the PCIe bus, device and function numbers of the target device; the next step needs them.

# lspci -vvv

01:00.1 Ethernet controller: Intel Corporation 82598EB 10-Gigabit AF Dual Port Network Connection (rev 01)
        Capabilities: [100] Advanced Error Reporting

Here "01:00.1" means bus 1, device 0, function 1.
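Splitting an lspci address into its three fields can be sketched in shell as below. parse_bdf is a hypothetical helper; it assumes the short bus:dev.fn form shown above (no leading PCI domain such as 0000:) and passes the hexadecimal fields through unchanged.

```shell
# parse_bdf ADDR
# ADDR is an lspci-style address in the short form bus:dev.fn, e.g. 01:00.1.
# lspci prints these fields in hexadecimal; they are echoed as-is.
parse_bdf() {
    addr=$1
    bus=${addr%%:*}     # text before the first ':'
    rest=${addr#*:}     # text after the first ':'
    dev=${rest%%.*}     # text before the '.'
    fn=${rest#*.}       # text after the '.'
    echo "bus=$bus dev=$dev fn=$fn"
}
```

For the device above, parse_bdf 01:00.1 prints "bus=01 dev=00 fn=1".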

5. Write the test config file in the aer-inject folder:

$ mkdir test

$ cd test

$ cat aer1

AER
BUS 1 DEV 0 FN 0
UNCOR_STATUS {ERROR_NUM}
HEADER_LOG 0 1 2 3

Note: {ERROR_NUM} should be one of:

TRAIN,DLP,POISON_TLP,FCP,COMP_TIME,COMP_ABORT,UNX_COMP,RX_OVER,MALF_TLP,ECRC,UNS
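If several config files are needed (aer1, aer2, aer3 with different error names), they could be generated from the same four-line template instead of being typed by hand. This is only a sketch; make_aer_config is a hypothetical helper that copies the layout of the example above and does not validate the error name.

```shell
# make_aer_config BUS DEV FN ERROR
# Prints an aer-inject config in the four-line layout of the example above:
# AER, the device address, the uncorrectable error name, and a header log.
make_aer_config() {
    printf 'AER\nBUS %s DEV %s FN %s\nUNCOR_STATUS %s\nHEADER_LOG 0 1 2 3\n' \
        "$1" "$2" "$3" "$4"
}
```

For example, make_aer_config 1 0 0 COMP_ABORT > aer1 would write a config for bus 1, device 0, function 0 with the COMP_ABORT error.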

6. Transfer the files aer-inject, aer1, aer2 and aer3 to the board.

7. Run the injection on the board:

root@p5020ds:~# ./aer-inject aer3

pcieport 0000:00:00.0: AER: Uncorrected (Fatal) error received: id=0100

e1000e 0000:01:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Unaccessible, id=0100(Unregistered Agent ID)

e1000e 0000:01:00.0: broadcast error_detected message

root@p5020ds:~# pcieport 0000:00:00.0: Root Port link has been reset

e1000e 0000:01:00.0: broadcast slot_reset message

e1000e 0000:01:00.0: Disabling ASPM  L1

e1000e 0000:01:00.0: enabling device (0000 -> 0002)

e1000e 0000:01:00.0: restoring config space at offset 0x6 (was 0x1, writing 0x1001)

e1000e 0000:01:00.0: restoring config space at offset 0x5 (was 0x0, writing 0xe0020000)

e1000e 0000:01:00.0: restoring config space at offset 0x4 (was 0x0, writing 0xe0000000)

e1000e 0000:01:00.0: restoring config space at offset 0x3 (was 0x10, writing 0x8)

e1000e 0000:01:00.0: broadcast resume message

e1000e 0000:01:00.0: AER driver successfully recovered

e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
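When the test is run from a script, the recovery line in the log above can be checked for automatically. A minimal sketch, assuming the kernel prints the same "AER driver successfully recovered" text as in this log (other kernel versions may word it differently); aer_recovered is a hypothetical helper name.

```shell
# aer_recovered
# Reads kernel log text on stdin and succeeds if it contains the recovery
# message seen in the log above.
aer_recovered() {
    grep -q 'AER driver successfully recovered'
}
```

On the board it might be used as: dmesg | aer_recovered && echo "recovered"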

8. Double-check that the PCIe device still works. Take the PCIe NIC as an example:

# ifconfig eth0 192.168.1.2

# ping 192.168.1.1


Have a great day,
Yiping Wang
