Re: [PATCH RFC 10/39] KVM: x86/xen: support upcall vector

From: Ankur Arora
Date: Wed Dec 09 2020 - 01:37:06 EST


On 2020-12-08 8:08 a.m., David Woodhouse wrote:
On Wed, 2020-12-02 at 19:02 +0000, David Woodhouse wrote:

I feel we could just accommodate it as subtype in KVM_XEN_ATTR_TYPE_CALLBACK_VIA.
Don't see the adavantage in having another xen attr type.

Yeah, fair enough.

But kinda have mixed feelings in having kernel handling all event channels ABI,
as opposed to only the ones userspace asked to offload. It looks a tad unncessary besides
the added gain to VMMs that don't need to care about how the internals of event channels.
But performance-wise it wouldn't bring anything better. But maybe, the former is reason
enough to consider it.

Yeah, we'll see. Especially when it comes to implementing FIFO event
channels, I'd rather just do it in one place — and if the kernel does
it anyway then it's hardly difficult to hook into that.

But I've been about as coherent as I can be in email, and I think we're
generally aligned on the direction. I'll do some more experiments and
see what I can get working, and what it looks like.


So... I did some more typing, and revived our completely userspace
based implementation of event channels. I wanted to declare that such
was *possible*, and that using the kernel for IPI and VIRQ was just a
very desirable optimisation.

It looks like Linux doesn't use the per-vCPU upcall vector that you
called 'KVM_XEN_CALLBACK_VIA_EVTCHN'. So I'm delivering interrupts via
KVM_INTERRUPT as if they were ExtINT....

... except I'm not. Because the kernel really does expect that to be an
ExtINT from a legacy PIC, and kvm_apic_accept_pic_intr() only returns
true if LVT0 is set up for EXTINT and unmasked.

I messed around with this hack and increasingly desperate variations on
the theme (since this one doesn't cause the IRQ window to be opened to
userspace in the first place), but couldn't get anything working:

Increasingly desperate variations, about sums up my process as well while
trying to get the upcall vector working.


--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2380,6 +2380,9 @@ int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
if ((lvt0 & APIC_LVT_MASKED) == 0 &&
GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT)
r = 1;
+ /* Shoot me. */
+ if (vcpu->arch.pending_external_vector == 243)
+ r = 1;
return r;
}

Eventually I resorted to delivering the interrupt through the lapic
*anyway* (through KVM_SIGNAL_MSI with an MSI message constructed for
the appropriate vCPU/vector) and the following hack to auto-EOI:

--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2416,7 +2419,7 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
*/
apic_clear_irr(vector, apic);
- if (test_bit(vector, vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
+ if (vector == 243 || test_bit(vector, vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
/*
* For auto-EOI interrupts, there might be another pending
* interrupt above PPR, so check whether to raise another


That works, and now my guest finishes the SMP bringup (and gets as far
as waiting on the XenStore implementation that I haven't put back yet).

So I think we need at least a tiny amount of support in-kernel for
delivering event channel interrupt vectors, even if we wanted to allow
for a completely userspace implementation.

Unless I'm missing something?

I did use the auto_eoi hack as well. So, yeah, I don't see any way of
getting around this.

Also, IIRC we had eventually gotten rid of the auto_eoi approach
because that wouldn't work with APICv. At that point we resorted to
direct queuing for vectored callbacks which was a hack that I never
grew fond of...
I will get on with implementing the in-kernel handling with IRQ routing
entries targeting a given { port, vcpu }. And I'm kind of vacillating
about whether the mode/vector should be separately configured, or
whether they might as well be in the IRQ routing table too, even if
it's kind of redundant because it's specified the same for *every* port
targeting the same vCPU. I *think* I prefer that redundancy over having
a separate configuration mechanism to set the vector for each vCPU. But
we'll see what happens when my fingers do the typing...


Good luck to your fingers!

Ankur