RE: [5.14-rc1] mlx5_core receives no interrupts with maxcpus=8

From: Dexuan Cui
Date: Wed Aug 18 2021 - 17:08:26 EST


> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Sent: Wednesday, July 21, 2021 2:17 PM
> To: Dexuan Cui <decui@xxxxxxxxxxxxx>; Saeed Mahameed
>
> On Mon, Jul 19 2021 at 20:33, Dexuan Cui wrote:
> > This is a bare metal x86-64 host with Intel CPUs. Yes, I believe the
> > issue is in the IOMMU Interrupt Remapping mechanism rather than in the
> > NIC driver. I just don't understand why bringing the CPUs online and
> > offline can work around the issue. I'm trying to dump the IOMMU IR
> > table entries to look for any error.
>
> can you please enable GENERIC_IRQ_DEBUGFS and provide the output of
>
> cat /sys/kernel/debug/irq/irqs/$THENICIRQS
>
> Thanks,
>
> tglx

Sorry for the late response! I checked the debugfs file below, and the output
is exactly the same in the good and bad cases. In both cases I boot with
maxcpus=8; the only difference is that in the good case I online and then
offline CPUs 8-31:
for i in `seq 8 31`; do echo 1 > /sys/devices/system/cpu/cpu$i/online; done
for i in `seq 8 31`; do echo 0 > /sys/devices/system/cpu/cpu$i/online; done
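
The same online-then-offline cycle can be wrapped in a small function so it is
easy to repeat between tests. This is only a sketch: the function name
cycle_cpus and the CPU_SYSFS_ROOT override are hypothetical names introduced
here for illustration (the override lets the loop be dry-run against a scratch
directory), and it assumes the standard /sys/devices/system/cpu layout used in
the two loops above.

```shell
#!/bin/sh
# Sketch: bring a range of CPUs online, then take the same range offline.
# cycle_cpus and CPU_SYSFS_ROOT are illustrative names, not existing tools.
cycle_cpus() {
    first=$1
    last=$2
    # Default to the real sysfs path; override only for dry runs.
    root=${CPU_SYSFS_ROOT:-/sys/devices/system/cpu}
    # First pass writes 1 (online) to every CPU, second pass writes 0 (offline).
    for state in 1 0; do
        i=$first
        while [ "$i" -le "$last" ]; do
            echo "$state" > "$root/cpu$i/online" || return 1
            i=$((i + 1))
        done
    done
}
```

Run as root, "cycle_cpus 8 31" would correspond to the two loops above.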

# cat /sys/kernel/debug/irq/irqs/209
handler:  handle_edge_irq
device:   0000:d8:00.0
status:   0x00004000
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x35409200
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_SINGLE_TARGET
            IRQD_MOVE_PCNTXT
            IRQD_AFFINITY_SET
            IRQD_AFFINITY_ON_ACTIVATE
            IRQD_CAN_RESERVE
            IRQD_HANDLE_ENFORCE_IRQCTX
node:     1
affinity: 0-7
effectiv: 5
pending:
domain:  INTEL-IR-MSI-3-3
 hwirq:   0x6c00000
 chip:    IR-PCI-MSI
  flags:   0x30
             IRQCHIP_SKIP_SET_WAKE
             IRQCHIP_ONESHOT_SAFE
 parent:
    domain:  INTEL-IR-3
     hwirq:   0x20000
     chip:    INTEL-IR
      flags:   0x0
     parent:
        domain:  VECTOR
         hwirq:   0xd1
         chip:    APIC
          flags:   0x0
         Vector:    42
         Target:     5
         move_in_progress: 0
         is_managed:       0
         can_reserve:      1
         has_reserved:     0
         cleanup_pending:  0
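
For anyone who wants to repeat the good/bad comparison, one way to do it is to
save the dump to a file in each case and diff the two files. A minimal sketch,
assuming the dumps have already been saved; the function name
compare_irq_dumps and the file names in the usage line are hypothetical:

```shell
#!/bin/sh
# Sketch: report whether two saved debugfs dumps differ.
# compare_irq_dumps is an illustrative name, not an existing tool.
compare_irq_dumps() {
    # diff exits 0 when the files match and prints a unified diff otherwise.
    if diff -u "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}
```

For example, after "cat /sys/kernel/debug/irq/irqs/209 > bad.txt" in the bad
case and the same command redirected to good.txt in the good case,
"compare_irq_dumps bad.txt good.txt" would print "identical" for the output
shown here.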

Thanks,
Dexuan