Re: [PATCH] irqchip/gic-v4: Ensure accessing the correct RD when and writing INVLPIR

From: Marc Zyngier
Date: Thu Apr 13 2023 - 03:35:31 EST


On Thu, 13 Apr 2023 04:57:17 +0100,
Kunkun Jiang <jiangkunkun@xxxxxxxxxx> wrote:
>
> Hi Marc,
>
> On 2023/4/12 17:42, Marc Zyngier wrote:
> > On Wed, 12 Apr 2023 05:15:10 +0100,
> > Kunkun Jiang <jiangkunkun@xxxxxxxxxx> wrote:
> >> commit f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between
> >> vPE affinity change and RD access") tried to address the race
> >> between the RD accesses and the vPE affinity change, but somehow
> >> forgot to take GICR_INVLPIR into account. Let's take the vpe_lock
> >> before evaluating vpe->col_idx to fix it.
> >>
> >> Fixes: f3a059219bc7 ("irqchip/gic-v4.1: Ensure mutual exclusion between vPE affinity change and RD access")
> >> Signed-off-by: Kunkun Jiang <jiangkunkun@xxxxxxxxxx>
> >> Signed-off-by: Xiang Chen <chenxiang66@xxxxxxxxxxxxx>
> >> Signed-off-by: Nianyao Tang <tangnianyao@xxxxxxxxxx>
> > Yup, nice catch. A few remarks though:
> >
> > - the subject looks odd: there is a spurious 'and' there, and it
> > doesn't say this is all about VPE doorbell invalidation (the code
> > that deals with direct LPI is otherwise fine)
> >
> > - the SoB chain is also odd. You should be last in the chain, and all
> > the others should have Co-developed-by tags in addition to their SoB,
> > unless you intended another tag
> Thanks for your guidance.
> >
> > - I'm curious about how you triggered the issue. Could you please
> > elaborate on that?
>
> I will detail it here:
> 1. a multi-core VM with a pass-through device, GICv4 enabled
> 2. insmod the device driver in the guest
> then, the following two call traces are displayed occasionally:
>
> > [ 1055.683978] BUG: spinlock already unlocked on CPU#1,

[...]

> After analysis, I found that when the device driver is insmod-ed in
> the guest, its_map_vm is executed on the host and maps all vPEs to
> the first possible CPU (pCPU0). Meanwhile, when handling WFI,
> its_vpe_send_inv accesses the RD, obtaining and releasing the
> 'rd_lock' through 'vpe->col_idx'.

[...]

> So something like this happens: after its_vpe_send_inv obtains the
> 'rd_lock' through 'vpe->col_idx', its_map_vm modifies 'vpe->col_idx',
> so the unlock path computes a different rd_lock and releases a lock
> that was never taken. This is also the cause of the preceding call
> trace.

Right. I figured this out, but the key information is that you get it
at boot time, due to the VPEs being moved away from CPU0.
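
Spelled out as a timeline (illustrative only, the CPU numbers are made
up for the sake of the example):

	its_vpe_send_inv()                    its_map_vm()
	------------------                    ------------
	/* vpe->col_idx == 1 */
	raw_spin_lock(CPU1's rd_lock);
	                                      vpe->col_idx = 0;
	rdbase = rdist[vpe->col_idx];         /* now CPU0's RD! */
	gic_write_lpir(..., GICR_INVLPIR);
	raw_spin_unlock(CPU0's rd_lock);      /* never taken -> splat */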

> There's a small question I haven't figured out here. Why map all vPEs
> to the first possible CPU in its_map_vm? In addition, vpe_lock is not
> taken here. Could concurrency problems occur in the future?

Where would you like to map them? Any physical CPU would do the
trick. We could take the current CPU, but the fact is that we don't
always know where the VPEs are running.

As for holding vpe_lock, I don't think we need to. This only happens
on the first VLPI being forwarded, so there should be no real GIC
context for anything.

But I must admit that I haven't looked at this code in a long time,
and have no way to test it anymore.
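
For reference, the vpe_to_cpuid_lock()/vpe_to_cpuid_unlock() pair your
patch relies on is (going from memory of f3a059219bc7) just a way to
pin vpe->col_idx for the duration of the RD access:

	static int vpe_to_cpuid_lock(struct its_vpe *vpe, unsigned long *flags)
	{
		/* Holding vpe_lock keeps vpe->col_idx stable */
		raw_spin_lock_irqsave(&vpe->vpe_lock, *flags);
		return vpe->col_idx;
	}

	static void vpe_to_cpuid_unlock(struct its_vpe *vpe, unsigned long flags)
	{
		raw_spin_unlock_irqrestore(&vpe->vpe_lock, flags);
	}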

>
> >
> > Finally, I think we can fix it in a better way, see below:
> >
> >> ---
> >> drivers/irqchip/irq-gic-v3-its.c | 10 +++++++---
> >> 1 file changed, 7 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> >> index 586271b8aa39..041f06922587 100644
> >> --- a/drivers/irqchip/irq-gic-v3-its.c
> >> +++ b/drivers/irqchip/irq-gic-v3-its.c
> >> @@ -3943,13 +3943,17 @@ static void its_vpe_send_inv(struct irq_data *d)
> >>  
> >>  	if (gic_rdists->has_direct_lpi) {
> >>  		void __iomem *rdbase;
> >> +		unsigned long flags;
> >> +		int cpu;
> >>  
> >>  		/* Target the redistributor this VPE is currently known on */
> >> -		raw_spin_lock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock);
> >> -		rdbase = per_cpu_ptr(gic_rdists->rdist, vpe->col_idx)->rd_base;
> >> +		cpu = vpe_to_cpuid_lock(vpe, &flags);
> >> +		raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock);
> >> +		rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base;
> >>  		gic_write_lpir(d->parent_data->hwirq, rdbase + GICR_INVLPIR);
> >>  		wait_for_syncr(rdbase);
> >> -		raw_spin_unlock(&gic_data_rdist_cpu(vpe->col_idx)->rd_lock);
> >> +		raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock);
> >> +		vpe_to_cpuid_unlock(vpe, flags);
> >>  	} else {
> >>  		its_vpe_send_cmd(vpe, its_send_inv);
> >>  	}
> > The main reason this bug crept in is that we have some pretty silly
> > code duplication going on.
> >
> > Wouldn't it be nice if irq_to_cpuid() could work out whether it is
> > dealing with an LPI or a VLPI like it does today, but also directly
> > with a VPE? We could then use the same code as direct_lpi_inv(). I
> > came up with the hack below, which is totally untested as I don't
> > have access to GICv4.1 HW.
> >
> > Could you give it a spin?
>
> Nice, I will test it as soon as possible.

Cool, please let me know when you get a chance.
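
For the archives, the shape I have in mind is roughly the following —
a sketch only, not the actual hack, and it assumes irq_to_cpuid_lock()
and irq_to_cpuid_unlock() grow a VPE case so a common helper can do
the whole locking dance:

	static void __direct_lpi_inv(struct irq_data *d, u64 val)
	{
		void __iomem *rdbase;
		unsigned long flags;
		int cpu;

		/* Target the redistributor this LPI is currently known on */
		cpu = irq_to_cpuid_lock(d, &flags);
		raw_spin_lock(&gic_data_rdist_cpu(cpu)->rd_lock);
		rdbase = per_cpu_ptr(gic_rdists->rdist, cpu)->rd_base;
		gic_write_lpir(val, rdbase + GICR_INVLPIR);
		wait_for_syncr(rdbase);
		raw_spin_unlock(&gic_data_rdist_cpu(cpu)->rd_lock);
		irq_to_cpuid_unlock(d, flags);
	}

	static void its_vpe_send_inv(struct irq_data *d)
	{
		struct its_vpe *vpe = irq_data_get_irq_chip_data(d);

		if (gic_rdists->has_direct_lpi)
			__direct_lpi_inv(d, d->parent_data->hwirq);
		else
			its_vpe_send_cmd(vpe, its_send_inv);
	}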

Thanks,

M.

--
Without deviation from the norm, progress is not possible.