Re: [RFC KERNEL PATCH v2 2/3] xen/pvh: Unmask irq for passthrough device in PVH dom0

From: Roger Pau Monné
Date: Thu Nov 30 2023 - 11:02:37 EST


On Wed, Nov 29, 2023 at 07:53:59PM -0800, Stefano Stabellini wrote:
> On Fri, 24 Nov 2023, Jiqian Chen wrote:
> > This patch is to solve two problems we encountered when we try to
> > passthrough a device to hvm domU base on Xen PVH dom0.
> >
> > First, hvm guest will alloc a pirq and irq for a passthrough device
> > by using gsi, before that, the gsi must first has a mapping in dom0,
> > see Xen code pci_add_dm_done->xc_domain_irq_permission, it will call
> > into Xen and check whether dom0 has the mapping. See
> > XEN_DOMCTL_irq_permission->pirq_access_permitted, "current" is PVH
> > dom0 and it return irq is 0, and then return -EPERM.
> > This is because the passthrough device doesn't do PHYSDEVOP_map_pirq
> > when thay are enabled.
> >
> > Second, in PVH dom0, the gsi of a passthrough device doesn't get
> > registered, but gsi must be configured for it to be able to be
> > mapped into a domU.
> >
> > After searching codes, we can find map_pirq and register_gsi will be
> > done in function vioapic_write_redirent->vioapic_hwdom_map_gsi when
> > the gsi(aka ioapic's pin) is unmasked in PVH dom0. So the problems
> > can be conclude to that the gsi of a passthrough device doesn't be
> > unmasked.
> >
> > To solve the unmaske problem, this patch call the unmask_irq when we
> > assign a device to be passthrough. So that the gsi can get registered
> > and mapped in PVH dom0.
>
>
> Roger, this seems to be more of a Xen issue than a Linux issue. Why do
> we need the unmask check in Xen? Couldn't we just do:
>
>
> diff --git a/xen/arch/x86/hvm/vioapic.c b/xen/arch/x86/hvm/vioapic.c
> index 4e40d3609a..df262a4a18 100644
> --- a/xen/arch/x86/hvm/vioapic.c
> +++ b/xen/arch/x86/hvm/vioapic.c
> @@ -287,7 +287,7 @@ static void vioapic_write_redirent(
> hvm_dpci_eoi(d, gsi);
> }
>
> - if ( is_hardware_domain(d) && unmasked )
> + if ( is_hardware_domain(d) )
> {
> /*
> * NB: don't call vioapic_hwdom_map_gsi while holding hvm.irq_lock

There are some issues with this approach.

mp_register_gsi() will only setup the trigger and polarity of the
IO-APIC pin once, so we do so once the guest unmask the pin in order
to assert that the configuration is the intended one. A guest is
allowed to write all kind of nonsense stuff to the IO-APIC RTE, but
that doesn't take effect unless the pin is unmasked.

Overall the question would be whether we have any guarantees that
the hardware domain has properly configured the pin, even if it's not
using it itself (as it hasn't been unmasked).

IIRC PCI legacy interrupts are level triggered and low polarity, so we
could configure any pins that are not setup at bind time?

Thanks, Roger.