Re: [RFC PATCH] KVM: arm/arm64: GICv4: Support shared VLPI

From: Marc Zyngier
Date: Sat Nov 04 2023 - 06:29:35 EST


On Thu, 02 Nov 2023 14:35:07 +0000,
Kunkun Jiang <jiangkunkun@xxxxxxxxxx> wrote:
>
> In some scenarios, the guest virtio-pci driver will request two MSI-X,
> one vector for config, one shared for queues. However, the host driver
> (vDPA or VFIO) will request a vector for each queue.

Well, VFIO will request *all* available MSI-X. It doesn't know what a
queue is.

>
> In the current implementation of GICv4/4.1 direct injection of vLPI,
> pINTID and vINTID have one-to-one correspondence. Therefore, the

This one-to-one mapping is a hard requirement of the architecture. You
cannot change it.

> above scenario cannot be handled correctly. The host kernel will
> execute its_map_vlpi multiple times but only execute its_unmap_vlpi
> once. This may cause guest hang[1].

Why does it hang? As far as it is concerned, it has unmapped the
interrupts it cares about. Where are the calls to its_map_vlpi()
coming from? It should only occur if the guest actively programs the
MSI-X registers. What is your VMM? How can I reproduce this issue?

>
> |	WARN_ON(!(irq->hw && irq->host_irq == virq));
> |	if (irq->hw) {
> |		atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count);
> |		irq->hw = false;
> |		ret = its_unmap_vlpi(virq);
> |	}
>
> Add a list to struct vgic_irq to record all host irqs mapped to the vlpi.
> When performing an action on the vlpi, traverse the list and perform this
> action on all host irqs.

This makes no sense. You are blindly associating multiple host
interrupts with a single guest interrupt. This is a blatant violation
of the architecture. When unmapping a vLPI from a guest, only this one
should be turned back into an LPI. Not two, not all, just this one.
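
To make the invariant concrete, here is a toy userspace model of it. This
is not kernel code, and every name in it (struct toy_vlpi, toy_map_vlpi(),
toy_unmap_vlpi()) is made up for illustration. It only sketches the rule
that a guest interrupt is backed by at most one host interrupt, and that
unmapping touches that one interrupt and nothing else.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Toy model, NOT kernel code: a guest vLPI that can be backed by at most
 * one host interrupt at a time. All names here are hypothetical.
 */
struct toy_vlpi {
	bool hw;		/* true while a host IRQ backs this vLPI */
	unsigned int host_irq;	/* valid only when hw is true */
	unsigned int vintid;	/* guest-visible interrupt ID */
};

/*
 * Map host_irq to the vLPI. Mapping the same IRQ again is a no-op;
 * mapping a different one is refused rather than stacked on top.
 */
static int toy_map_vlpi(struct toy_vlpi *v, unsigned int host_irq)
{
	if (v->hw)
		return v->host_irq == host_irq ? 0 : -EBUSY;

	v->hw = true;
	v->host_irq = host_irq;
	return 0;
}

/*
 * Unmap exactly the given host IRQ. A request for any other IRQ is
 * rejected and leaves the existing mapping untouched.
 */
static int toy_unmap_vlpi(struct toy_vlpi *v, unsigned int host_irq)
{
	if (!v->hw || v->host_irq != host_irq)
		return -EINVAL;

	v->hw = false;
	return 0;
}
```

In this model a second toy_map_vlpi() with a different host IRQ fails
with -EBUSY, which is the opposite of what the patch does when it appends
the extra host IRQ to a list.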

Maybe you have found an actual issue, but this patch is absolutely
unacceptable. Please fully describe the problem, provide traces, and
if possible a reproducer.

>
> Link: https://lore.kernel.org/all/0d9fdf42-76b1-afc6-85a9-159c5490bbd4@xxxxxxxxxx/#t

I tried to parse this, but it hardly makes sense either. You seem to
imply that the host driver pre-configures the device, which is
completely wrong. The host driver (VFIO) should simply request all
possible physical LPIs, and that's all. Requesting them is expected to
have no other effect on the HW. Also, since your guest
driver only configures a single vLPI, there should be only a single
its_map_vlpi() call.

So it seems to me that your HW and SW are doing things that are not
expected at all.

M.

--
Without deviation from the norm, progress is not possible.