Re: [PATCH] irqchip/gic-v3: Workaround for GIC-700 erratum 2941627

From: Lorenzo Pieralisi
Date: Tue Jul 04 2023 - 11:14:15 EST


On Tue, Jul 04, 2023 at 03:44:50PM +0100, Marc Zyngier wrote:
> Lorenzo,
>
> On Tue, 04 Jul 2023 13:34:36 +0100,
> Lorenzo Pieralisi <lpieralisi@xxxxxxxxxx> wrote:
> >
> > GIC700 erratum 2941627 may cause GIC-700 missing SPIs wake
> > requests when SPIs are deactivated while targeting a
> > sleeping CPU - ie a CPU for which the redistributor:
> >
> > GICR_WAKER.ProcessorSleep == 1
> >
> > This runtime situation can happen if an SPI that has been
> > activated on a core is retargeted to a different core, it
> > becomes pending and the target core subsequently enters a
> > power state quiescing the respective redistributor.
> >
> > When this situation is hit, the de-activation carried out
> > on the core that activated the SPI (through either ICC_EOIR1_EL1
> > or ICC_DIR_EL1 register writes) does not trigger a wake
> > requests for the sleeping GIC redistributor even if the SPI
> > is pending.
> >
> > Fix the erratum by de-activating the SPI using the
>
> s/Fix/ Work around/
>
> > redistributor GICD_ICACTIVER register if the runtime
> > conditions require it (ie the IRQ was retargeted between
> > activation and de-activation).
> >
> > Signed-off-by: Lorenzo Pieralisi <lpieralisi@xxxxxxxxxx>
> > Cc: Marc Zyngier <maz@xxxxxxxxxx>
> > ---
> > Documentation/arm64/silicon-errata.rst | 3 ++
> > drivers/irqchip/irq-gic-v3.c | 71 +++++++++++++++++++++++++-
> > 2 files changed, 73 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
> > index 9e311bc43e05..e77c57a0adf8 100644
> > --- a/Documentation/arm64/silicon-errata.rst
> > +++ b/Documentation/arm64/silicon-errata.rst
> > @@ -141,6 +141,9 @@ stable kernels.
> > | ARM | MMU-500 | #841119,826419 | N/A |
> > +----------------+-----------------+-----------------+-----------------------------+
> > +----------------+-----------------+-----------------+-----------------------------+
> > +| ARM | GIC-700 | #2941627 | ARM64_ERRATUM_2941627 |
> > ++----------------+-----------------+-----------------+-----------------------------+
> > ++----------------+-----------------+-----------------+-----------------------------+
> > | Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_845719 |
> > +----------------+-----------------+-----------------+-----------------------------+
> > | Broadcom | Brahma-B53 | N/A | ARM64_ERRATUM_843419 |
> > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > index a605aa79435a..a0a9ccf23742 100644
> > --- a/drivers/irqchip/irq-gic-v3.c
> > +++ b/drivers/irqchip/irq-gic-v3.c
> > @@ -68,6 +68,8 @@ struct gic_chip_data {
> > static void __iomem *t241_dist_base_alias[T241_CHIPS_MAX] __read_mostly;
> > static DEFINE_STATIC_KEY_FALSE(gic_nvidia_t241_erratum);
> >
> > +static DEFINE_STATIC_KEY_FALSE(gic_arm64_2941627_erratum);
> > +
> > static struct gic_chip_data gic_data __read_mostly;
> > static DEFINE_STATIC_KEY_TRUE(supports_deactivate_key);
> >
> > @@ -591,10 +593,35 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
> > gic_irq_set_prio(d, GICD_INT_DEF_PRI);
> > }
> >
> > +static bool gic_arm64_erratum_2941627_needed(struct irq_data *d)
> > +{
> > + if (!static_branch_unlikely(&gic_arm64_2941627_erratum))
> > + return false;
> > +
> > + /*
> > + * The workaround is needed if the IRQ is an SPI and
> > + * the target cpu is different from the one we are
> > + * executing on.
> > + */
> > + return !((gic_irq_in_rdist(d)) || gic_irq(d) >= 8192 ||
> > + cpumask_equal(irq_data_get_effective_affinity_mask(d),
> > + cpumask_of(smp_processor_id())));
>
> I dislike this statement for multiple reasons:
>
> - it is written as a negation, making it harder than strictly
> necessary to parse as it is the opposite of the comment above

Yes, I agree.

> - gic_irq_in_rdist() and gic_irq(d) >= 8192 are two ways of checking
> the interrupt range -- maybe we should just do that
>
> - cpumask_equal() is *slow* if you have more that 64 CPUs, something
> that is increasingly common -- a better option would be to check
> whether the current CPU is in the mask or not, which would be enough
> as we only have a single affinity bit set
>
> - smp_processor_id() can check for preemption, which is pointless
> here, as we're doing things under the irq_desc raw spinlock.

These are valid points and there is no reason why this should not be
rewritten as you suggest below.

> I would expect something like:
>
> enum gic_intid_range range = get_intid_range(d);
>
> return (range == SGI_RANGE || range == ESPI_RANGE) &&
> !cpumask_test_cpu(raw_smp_processor_id(),
> irq_data_get_effective_affinity_mask(d));

It should work (and it is easier to read in the process), thanks.

> > +}
> > +
> > static void gic_eoi_irq(struct irq_data *d)
> > {
> > write_gicreg(gic_irq(d), ICC_EOIR1_EL1);
> > isb();
> > +
> > + if (gic_arm64_erratum_2941627_needed(d)) {
> > + /*
> > + * Make sure the GIC stream deactivate packet
> > + * issued by ICC_EOIR1_EL1 has completed before
> > + * deactivating through GICD_IACTIVER.
> > + */
> > + dsb(sy);
> > + gic_poke_irq(d, GICD_ICACTIVER);
> > + }
> > }
> >
> > static void gic_eoimode1_eoi_irq(struct irq_data *d)
> > @@ -605,7 +632,11 @@ static void gic_eoimode1_eoi_irq(struct irq_data *d)
> > */
> > if (gic_irq(d) >= 8192 || irqd_is_forwarded_to_vcpu(d))
> > return;
> > - gic_write_dir(gic_irq(d));
> > +
> > + if (!gic_arm64_erratum_2941627_needed(d))
> > + gic_write_dir(gic_irq(d));
> > + else
> > + gic_poke_irq(d, GICD_ICACTIVER);
> > }
> >
> > static int gic_set_type(struct irq_data *d, unsigned int type)
> > @@ -1796,6 +1827,25 @@ static bool gic_enable_quirk_nvidia_t241(void *data)
> > return true;
> > }
> >
> > +static bool gic_enable_quirk_arm64_2941627(void *data)
> > +{
> > + /*
> > + * If CPUidle is not enabled the erratum runtime
> > + * conditions can't be hit, since that requires:
> > + *
> > + * - A core entering a deep power state with
> > + * the associated GIC redistributor asleep
> > + * and an IRQ active and pending targeted at it
> > + * - A different core handling the IRQ and
> > + * related GIC operations at the same time
> > + */
> > + if (!IS_ENABLED(CONFIG_CPU_IDLE))
> > + return false;
>
> Could this still hit on a system that traps WFI to EL3 and uses this
> as a way to enter a low-power mode?

That's a valid point, I have not thought about that. If there are set-ups
where this is allowed (and I think it *is* architecturally allowed with
EL3 saving/restoring context and entering a deep power state - if you
asked I suspect you have something concrete in mind :)) this "optimization"
must be removed since we can still hit the bug; that's what I shall do
for v2.

Thanks,
Lorenzo