Re: [RFC PATCH] KVM: arm/arm64: GICv4: Support shared VLPI

From: Kunkun Jiang
Date: Mon Nov 06 2023 - 10:35:43 EST


Hi Marc,

On 2023/11/4 18:29, Marc Zyngier wrote:
On Thu, 02 Nov 2023 14:35:07 +0000,
Kunkun Jiang <jiangkunkun@xxxxxxxxxx> wrote:
In some scenarios, the guest virtio-pci driver will request two MSI-X,
one vector for config, one shared for queues. However, the host driver
(vDPA or VFIO) will request a vector for each queue.
Well, VFIO will request *all* available MSI-X. It doesn't know what a
queue is.

In the current implementation of GICv4/4.1 direct injection of vLPI,
pINTID and vINTID have one-to-one correspondence. Therefore, the
This matching is a hard requirement that matches the architecture. You
cannot change it.

above scenario cannot be handled correctly. The host kernel will
execute its_map_vlpi multiple times but only execute its_unmap_vlpi
once. This may cause guest hang[1].
Why does it hang? As far as it is concerned, it has unmapped the
interrupts it cares about. Where are the calls to its_map_vlpi()
coming from? It should only occur if the guest actively programs the
MSI-X registers. What is your VMM? How can I reproduce this issue?

| WARN_ON(!(irq->hw && irq->host_irq == virq));
| if (irq->hw) {
| atomic_dec(&irq->target_vcpu->arch.vgic_cpu.vgic_v3.its_vpe.vlpi_count);
| irq->hw = false;
| ret = its_unmap_vlpi(virq);
| }

Add a list to struct vgic_irq to record all host irqs mapped to the vlpi.
When performing an action on the vlpi, traverse the list and perform this
action on all host irqs.
This makes no sense. You are blindly associating multiple host
interrupts with a single guest interrupt. This is a blatant violation
of the architecture. When unmapping a VLPI from a guest, only this one
should be turned again into an LPI. Not two, not all, just this one.

Maybe you have found an actual issue, but this patch is absolutely
unacceptable. Please fully describe the problem, provide traces, and
if possible a reproducer.

Link: https://lore.kernel.org/all/0d9fdf42-76b1-afc6-85a9-159c5490bbd4@xxxxxxxxxx/#t
I tried to parse this, but it hardly makes sense either. You seem to
imply that the host driver pre-configures the device, which is
completely wrong. The host driver (VFIO) should simply request all
possible physical LPIs, and that's all. It is expected that this
requesting has no other effect on the HW. Also, since your guest
driver only configures a single vLPI, there should be only a single
its_map_vlpi() call.
Sorry for the late reply.

The virtio-scsi device has seven vectors (entry0-6): one for config,
six for queues. In the guest (e.g. CentOS 7.6 with a 4.19 kernel), the
virtio-pci driver requests only one vLPI, which is shared by all queues.
Entry 0 is used for config. It is not relevant to this issue, so we
will not discuss it. The virtio-pci driver writes message.data for
entry1-6 in the MSI-X table, which traps to QEMU for processing. The
message.data values are as follows:
entry-0 0
entry-1 1
entry-2 1
entry-3 1
entry-4 1
entry-5 1
entry-6 1
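
Put differently, entries 1-6 all carry the same message data, so they
all resolve to the same vLPI. A small user-space sketch of that count
(illustration only, not kernel or QEMU code; entries_sharing_data is
a hypothetical helper name):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count how many MSI-X table entries carry the given message data.
 * Hypothetical helper for illustration; nothing like it exists in
 * the kernel or in QEMU.
 */
size_t entries_sharing_data(const unsigned int *data, size_t n,
                            unsigned int value)
{
    size_t count = 0;

    assert(data != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (data[i] == value)
            count++;
    }
    return count;
}
```

Called on the seven values listed above with value 1, it counts the six
queue entries that share one vLPI.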

The KVM call chain is as follows. its_map_vlpi will be
executed 6 times, so six host irqs are mapped to one vLPI.
kvm_irqfd_assign
    irq_bypass_register_consumer
        ...
        kvm_arch_irq_bypass_add_producer
            kvm_vgic_v4_set_forwarding
                its_map_vlpi

When the reboot command is executed inside the guest,
kvm_vgic_v4_unset_forwarding is executed 6 times. WARN_ON
is also triggered 6 times, but its_unmap_vlpi is only
executed the first time.
kvm_arch_irq_bypass_del_producer
    kvm_vgic_v4_unset_forwarding
        WARN_ON(!(irq->hw && irq->host_irq == virq));
        if (irq->hw) {
            irq->hw = false;
            its_unmap_vlpi
        }

Therefore, only the mapping between the first host irq and
vLPI is unmapped. When the guest reboots into the BIOS phase,
the remaining 5 host irqs may still send interrupts. This
causes the guest to hang.
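
To make the bookkeeping concrete, here is a minimal user-space model of
the sequence above. It is only a sketch under assumptions: struct
model_irq and the model_* functions are hypothetical stand-ins, not the
real vgic code; the map path is assumed to set hw and overwrite
host_irq on every call; and the unmap calls are assumed to arrive in
the same order as the map calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the relevant vgic_irq fields. */
struct model_irq {
    bool hw;       /* vLPI is currently marked as forwarded */
    int host_irq;  /* only the most recently mapped host irq is kept */
};

/* Model of the map path: each call adds one mapping (assumed behaviour). */
static void model_set_forwarding(struct model_irq *irq, int virq, int *mapped)
{
    irq->hw = true;
    irq->host_irq = virq;
    (*mapped)++;
}

/*
 * Model of the unmap path quoted above. Returns true when the WARN_ON
 * condition would fire. A mapping is removed only while hw is set.
 */
static bool model_unset_forwarding(struct model_irq *irq, int virq, int *mapped)
{
    bool warn = !(irq->hw && irq->host_irq == virq);

    if (irq->hw) {
        irq->hw = false;
        (*mapped)--;
    }
    return warn;
}

/*
 * Map n host irqs to one vLPI, then unmap all of them in the same order.
 * Returns the number of mappings left behind and stores the number of
 * warnings in *warnings. The host irq numbers (100, 101, ...) are made up.
 */
int model_leftover_mappings(int n, int *warnings)
{
    struct model_irq irq = { .hw = false, .host_irq = -1 };
    int mapped = 0;

    assert(warnings != NULL);
    *warnings = 0;

    for (int virq = 100; virq < 100 + n; virq++)
        model_set_forwarding(&irq, virq, &mapped);

    for (int virq = 100; virq < 100 + n; virq++) {
        if (model_unset_forwarding(&irq, virq, &mapped))
            (*warnings)++;
    }
    return mapped;
}
```

With n = 6 the model reports six warnings and five mappings left
behind, which matches what we observe on reboot.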

Looking forward to your reply.

Thanks,
Kunkun Jiang
So it seems to me that your HW and SW are doing things that are not
expected at all.

M.