We have a custom hardware design based on the P1022DS board. PCIe interfaces 1 and 2 are each connected to their own 2:4 PCIe switch, and one of the switches has two PCIe devices attached to it. Memory accesses to both devices work fine, but we are having problems enabling their interrupts.
When request_irq() is called on each PCIe device, the interrupt storm handler kicks in and disables each IRQ.
The /proc/interrupts output is:
           CPU0
 16:          0   OpenPIC    16 Level     fsl-lbc-err, [PCI] PME, [PCI] PME
 19:      24170   OpenPIC    19 Level     fsl-lbc
 25:     100000   OpenPIC     6 Level     btipci
 26:     100000   OpenPIC     7 Level     btipci
 42:        383   OpenPIC    42 Level     serial
LOC:       6449   Local timer interrupts
SPU:          0   Spurious interrupts
PMI:          0   Performance monitoring interrupts
MCE:          0   Machine check exceptions
The two entries for "btipci" are the IRQs in question.
Why the discrepancy between the IRQ (25 and 26) and the interrupt type (6 Level and 7 Level)? Is this the reason for the interrupt storm?
BTW, both PCIe devices are using legacy interrupts and have no MSI or MSI-X capabilities.
An interrupt storm happens when a device asserts an interrupt that no driver handles, or when some misconfiguration makes the MPIC think a device is asserting one (particularly likely here, given that these are external IRQs). It has nothing to do with the hardware IRQ number being different from the virtual IRQ number.
In /proc/interrupts, the first column is the virtual IRQ number and the number after "OpenPIC" is the hardware IRQ number. These two interrupts differ while the others don't because their hardware numbers are under 16, and virtual IRQ numbers under 16 are reserved for ISA (even if no such hardware is present). In principle, any interrupt can have a virtual IRQ number that doesn't match its hardware number.
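To make the "no driver is handling it" point concrete, here is a rough user-space model of how a stuck-IRQ detector can decide to disable a line. It is only a sketch: the struct and function names (irq_model, irq_model_note, irq_model_run) are made up for illustration, and the logic is a simplified stand-in for the kernel's spurious-interrupt check, which as far as I know looks at a window of 100000 interrupts and disables the line if more than 99900 of them went unclaimed. The real code also takes timing into account, which this model ignores.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-IRQ bookkeeping (not the kernel's struct). */
struct irq_model {
    unsigned int count;      /* interrupts seen in the current window */
    unsigned int unhandled;  /* how many of those no handler claimed */
    bool disabled;           /* set once the line is declared stuck */
};

/* Assumed window size and threshold for this model. */
enum { WINDOW = 100000, STUCK_THRESHOLD = 99900 };

/*
 * Record one interrupt. 'handled' is true if some handler claimed it
 * (the equivalent of returning IRQ_HANDLED instead of IRQ_NONE).
 * Returns true if the line is disabled after this call.
 */
static bool irq_model_note(struct irq_model *m, bool handled)
{
    if (m->disabled)
        return true;

    if (!handled)
        m->unhandled++;
    m->count++;

    if (m->count < WINDOW)
        return false;

    /* End of the window: decide whether the line looks stuck. */
    if (m->unhandled > STUCK_THRESHOLD)
        m->disabled = true;

    /* Start a fresh window either way. */
    m->count = 0;
    m->unhandled = 0;
    return m->disabled;
}

/* Feed n interrupts that are all handled or all unhandled. */
static bool irq_model_run(struct irq_model *m, unsigned int n, bool handled)
{
    bool disabled = m->disabled;

    for (unsigned int i = 0; i < n; i++)
        disabled = irq_model_note(m, handled);
    return disabled;
}
```

In this model a line whose interrupts are never claimed gets disabled on exactly the 100000th interrupt, which would be consistent with the count of 100000 shown for both btipci entries. A line whose handler claims its interrupts is never disabled, no matter how many arrive.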