Re: [RFC PATCH] genirq: introduce handle_fasteoi_edge_irq flow handler

From: Marc Zyngier
Date: Fri Apr 14 2023 - 07:26:00 EST


On Fri, 10 Mar 2023 10:14:17 +0000,
Yipeng Zou <zouyipeng@xxxxxxxxxx> wrote:
>
> Recently, we hit an LPI migration issue on an ARM SMP platform.
>
> For example, a NIC generates an MSI, which the ITS delivers as an LPI
> to CPU0. Meanwhile, irqbalance running on CPU1 sets the NIC's IRQ
> affinity to CPU1, so the next interrupt is sent to CPU1. Because the
> IRQ is still marked as in progress, the kernel does not run the IRQ
> handler on CPU1, which results in some userland service timeouts. The
> sequence of events is as follows:
>
> NIC                 CPU0                      CPU1
>
> Generate IRQ#1      READ_IAR
>                     Lock irq_desc
>                     Set IRQD_IN_PROGRESS
>                     Unlock irq_desc
>                                               Lock irq_desc
>                                               Change LPI Affinity
>                                               Unlock irq_desc
>                     Call irq_handler
> Generate IRQ#2
>                                               READ_IAR
>                                               Lock irq_desc
>                                               Check IRQD_IN_PROGRESS
>                                               Unlock irq_desc
>                                               Return from interrupt#2
>                     Lock irq_desc
>                     Clear IRQD_IN_PROGRESS
>                     Unlock irq_desc
>                     Return from interrupt#1
>
> In this scenario, IRQ#2 is lost. This does cause some exceptions.
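
For readers following along, the sequence above can be replayed with a
minimal user-space sketch. This is not kernel code: struct model_desc
and the model_* helpers are hypothetical stand-ins invented for
illustration, and the only behaviour modelled is "return early if the
in-progress flag is already set", as described in the report.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for struct irq_desc. */
struct model_desc {
	bool in_progress;	/* models IRQD_IN_PROGRESS */
	int target_cpu;		/* models the current LPI affinity */
	int handled;		/* how many times the handler ran */
	int dropped;		/* interrupts that returned without a handler */
};

/* Models flow-handler entry: refuse to run if already in progress. */
static bool model_irq_enter(struct model_desc *d)
{
	if (d->in_progress) {
		d->dropped++;
		return false;
	}
	d->in_progress = true;
	return true;
}

/* Models handler completion: count it and clear the flag. */
static void model_irq_exit(struct model_desc *d)
{
	d->handled++;
	d->in_progress = false;
}

/* Replays the reported interleaving in a single thread. */
static void model_replay_race(struct model_desc *d)
{
	bool irq1_runs, irq2_runs;

	/* CPU0: IRQ#1 arrives and the in-progress flag is set. */
	irq1_runs = model_irq_enter(d);

	/* CPU1: affinity moves while IRQ#1 is still being handled. */
	d->target_cpu = 1;

	/* CPU1: IRQ#2 arrives on the new target and sees the flag. */
	irq2_runs = model_irq_enter(d);
	if (irq2_runs)
		model_irq_exit(d);

	/* CPU0: the handler for IRQ#1 finally completes. */
	if (irq1_runs)
		model_irq_exit(d);
}
```

Calling model_replay_race() on a zero-initialised struct model_desc
leaves handled at 1 and dropped at 1, which matches the report: two
interrupts were generated, but the handler ran only once.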

Please see my reply to James at [1]. I'd appreciate if you could give
that patch a go, which I expect to be a better avenue to fix what is
effectively a GIC architecture defect.

Thanks,

M.

[1] https://lore.kernel.org/all/86pm89kyyt.wl-maz@xxxxxxxxxx/

--
Without deviation from the norm, progress is not possible.