Re: [PATCH] irq: fasteoi handler re-runs on concurrent invoke

From: Marc Zyngier
Date: Tue May 02 2023 - 06:30:34 EST


Catching up...

On Tue, 18 Apr 2023 11:56:07 +0100,
"Gowans, James" <jgowans@xxxxxxxxxx> wrote:
>
> On Wed, 2023-04-12 at 14:32 +0100, Marc Zyngier wrote:
> >
> > From c96d2ab37fe273724f1264fba5f4913259875d56 Mon Sep 17 00:00:00 2001
> > From: Marc Zyngier <maz@xxxxxxxxxx>
> > Date: Mon, 10 Apr 2023 10:56:32 +0100
> > Subject: [PATCH] irqchip/gicv3-its: Force resend of LPIs taken while
> > already
> > in-progress
>
> Perhaps you can pillage some of my commit message to explain the race here
> when you send this patch?

Sure. At the moment, we're still far from a final patch though.

> >
> > Signed-off-by: Marc Zyngier <maz@xxxxxxxxxx>
> >
> > diff --git a/include/linux/irq.h b/include/linux/irq.h
> > index b1b28affb32a..4b2a7cc96eb2 100644
> > --- a/include/linux/irq.h
> > +++ b/include/linux/irq.h
> > @@ -223,6 +223,8 @@ struct irq_data {
> > * irq_chip::irq_set_affinity() when
> > deactivated.
> > * IRQD_IRQ_ENABLED_ON_SUSPEND - Interrupt is enabled on suspend by irq
> > pm if
> > * irqchip have flag
> > IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND set.
> > + * IRQD_RESEND_WHEN_IN_PROGRESS - Interrupt may fire when already in
> > progress,
> > + * needs resending.
> > */
> > enum {
> > IRQD_TRIGGER_MASK = 0xf,
> > @@ -249,6 +251,7 @@ enum {
> > IRQD_HANDLE_ENFORCE_IRQCTX = (1 << 28),
> > IRQD_AFFINITY_ON_ACTIVATE = (1 << 29),
> > IRQD_IRQ_ENABLED_ON_SUSPEND = (1 << 30),
> > + IRQD_RESEND_WHEN_IN_PROGRESS = (1 << 31),
> > };
>
> Do we really want a new flag here? I'd be keen to fix this race for all
> drivers, not just those who know to set this flag. I think the patch
> you're suggesting is pretty close to being safe to enable generally? If so
> my preference is for one less config option - just run it always.

I contend that this really is a GICv3 architectural bug. The lack of
an active state on LPIs leads to it, and as far as I can tell, no
other interrupt architecture has the same issue. So the onus should be
on the GIC, the GIC only, and only the parts of the GIC that require
it (SPIs, PPIs and SGIs are fine, either because they have an active
state, or because the lock isn't dropped when calling the handler).

Thanks,

M.

--
Without deviation from the norm, progress is not possible.