Re: [PATCH 5/5] x86/kvm: Add KVM_FEATURE_MSI_EXT_DEST_ID

From: David Woodhouse
Date: Tue Oct 13 2020 - 07:51:12 EST


On Tue, 2020-10-13 at 11:53 +0100, David Woodhouse wrote:
> On Tue, 2020-10-13 at 12:46 +0200, Thomas Gleixner wrote:
> > And after becoming more awake, that wont work anyway because there is
> > more than one IR domain, so there is no way to return an error "You
> > forgot to register" obviously.
> >
> > But the APIC id (32768) valid check is also broken because IR can be
> > enabled even without X2APIC.
>
> Nope, it's perfectly OK to allow HPET and I/OAPIC to be parented in the
> x86_vector_domain in that case, regardless of IR.
>
> The *actual* criterion for x86_vector_select() returning zero to say
> "don't use me" is "could there be CPUs in the system which can't be
> reached through x86_vector_msi_compose_msg()?". It's not really about
> IR at all.
>
> The apic_id_valid(32768) check is checking precisely the right thing.

With that realisation, I've fixed the comment in my ext_dest_id branch
to remove all mention of IRQ remapping. It now looks like this:

static int x86_vector_select(struct irq_domain *d, struct irq_fwspec *fwspec,
enum irq_domain_bus_token bus_token)
{
/*
* HPET and I/OAPIC drivers use irq_find_matching_irqdomain()
* to find their parent irqdomain. For x86_vector_domain to be
* suitable, all CPUs in the system must be reachable by its
* x86_vector_msi_compose_msg() function. Which is only true
* in !x2apic mode, or in x2apic physical mode if APIC IDs were
* restricted to 8 or 15 bits at boot time. In those cases,
* 1<<15 will *not* be a valid APIC ID.
*/
if (apic->apic_id_valid(1<<15))
return 0;

return x86_fwspec_is_ioapic(fwspec) || x86_fwspec_is_hpet(fwspec);
}

That makes it clearer that this isn't just some incestuous interaction
with IRQ remapping — that APIC ID limit really is the basis on which
this irqdomain, all by itself, makes the decision about whether it's
capable of being the parent irqdomain to the requesting device.

It just so happens that *if* you allow higher APIC IDs in the system
and *if* no other irqdomains exist which take ownership of a given HPET
or I/OAPIC, then that child device will fail to find a suitable parent
irqdomain and will fail to initialise. And that's just how we want it.

Sure, we have that special case in x2apic startup where we ensure that
if we don't have IR then we limit the APIC IDs. But that's there, and
*this* code is perfectly self-contained and isolated.

Attachment: smime.p7s
Description: S/MIME cryptographic signature