Re: [PATCH v3 07/28] KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled

From: Maxim Levitsky
Date: Wed Sep 28 2022 - 13:40:54 EST


On Wed, 2022-09-28 at 16:33 +0000, Sean Christopherson wrote:
> On Wed, Sep 28, 2022, Maxim Levitsky wrote:
> > On Mon, 2022-09-26 at 17:00 +0000, Sean Christopherson wrote:
> > > Given the SRCU problem, I'd prefer to keep the management of the memslot in common
> > > code, even though I agree it's a bit silly. And KVM_REQ_UNBLOCK is a perfect fit
> > > for dealing with the SRCU issue, i.e. handling this in AVIC code would require
> > > another hook on top of spreading the memslot management across x86 and SVM code.
> >
> > OK, I am not going to argue about this. But what about at least not using an inhibit
> > bit for that but something else like a boolean, or maybe really add 'I am AVIC bit'
> > or rather something like vcpu->arch.apicv_type enum?
> >
> > Or we can make SVM code just call a common function - just put these in a
> > function and call it from avic_set_virtual_apic_mode?
>
> The issue is that kvm_vcpu_update_apicv() is called from kvm_lapic_set_base(),
> which is the one that may or may not hold SRCU.

Makes sense now.

>
> > > > You are about to remove the KVM_REQ_UNBLOCK in other patch series.
> > >
> > > No, KVM_REQ_UNHALT is being removed. KVM_REQ_UNBLOCK needs to stay, although it
> > > has a rather weird name, e.g. KVM_REQ_WORK would probably be better.
> >
> > Roger that!
> > And I guess lets rename it while we are at it.
>
> I'll prep a patch.
>
> > > > How about just raising KVM_REQ_APICV_UPDATE on current vCPU
> > > > and having a special case in kvm_vcpu_update_apicv of
> > > >
> > > > if (apic_access_memslot_enabled == false && apic_access_memslot_allocaed == true) {
> > > > drop srcu lock
> > >
> > > This was my initial thought as well, but the issue is that SRCU may or may not be
> > > held, and so the unlock+lock would need to be conditional. That's technically a
> > > solvable problem, as it's possible to detect if SRCU is held, but I really don't
> > > want to rely on kvm_vcpu.srcu_depth for anything other than proving that KVM doesn't
> > > screw up SRCU.
> >
> > Why though? the KVM_REQ_APICV_UPDATE is only handled AFAIK in vcpu_enter_guest
> > which drops the srcu lock few lines afterwards, and therefore the
> > kvm_vcpu_update_apicv is always called with the lock held and it means that it
> > can drop it for the duration of slot update.
> >
> > The original issue we had was that we tried to drop the srcu lock in
> > 'kvm_set_apicv_inhibit' which indeed is called from various places,
> > with, or without the lock held.
> >
> > Moving the memslot disable code to kvm_vcpu_update_apicv would actually solve
> > that, but it was not possible because kvm_vcpu_update_apicv is called
> > simultaneously on all vCPUs, and created various races, including toggling
> > the memslot twice.
>
> As above, kvm_vcpu_update_apicv() can be called without SRCU held. Oh, but that
> was a recent addition, commit 8fc9c7a3079e ("KVM: x86: Deactivate APICv on vCPU
> with APIC disabled"). I was wary of using KVM_REQ_APICV_UPDATE in kvm_lapic_set_base(),
> e.g. in case there was some dependency on updating _immediately, but since that's
> such a new addition I have no objection to switching to the request.
>
> Similarly, is there a good reason for having nested_svm_vmexit() invoke
> kvm_vcpu_update_apicv() directly? I'm confused by the "so that other vCPUs can
> start to benefit from it right away". The nested inhibit is per-vCPU and so
> should only affect the current vCPU, no? I.e. for all intents and purposes, using
> a request should be functionally equivalent.

It is kind of the other way around:

The mere fact of switching to vmcb02 *inhibits* the AVIC on the current vCPU,
but the AVIC inhibit is there only to set the is_running bits in the physid table
and in IOMMU to prevent its *peers* to try and send interrupts to it via AVIC.

It is the reason why APICv doesn't need it - the posted interrupts still work
just fine when a vCPU doens't use APICv, or uses a different posted interrupt vector
when it uses the nested APICv.

So it makes sense to remove that inhibit as soon as possible that the peers
could stop getting 'unaccellerated IPI' vmexits for nothing.


However back to the discussion, I don't think this is a problem.

We can just call both the kvm_vcpu_update_apicv() and a new function that
does the memslot disable from KVM_REQ_APICV_UPDATE, then
plain kvm_vcpu_update_apicv() won't need to drop the srcu lock.

It is pretty much the same that you proposed, just instead of piggybacking on
KVM_REQ_UNBLOCK, I proposed to piggyback on KVM_REQ_APICV_UPDATE.


Best regards,
Maxim Levitsky


>
> /*
> * Un-inhibit the AVIC right away, so that other vCPUs can start
> * to benefit from it right away.
> */
> if (kvm_apicv_activated(vcpu->kvm))
> kvm_vcpu_update_apicv(vcpu);
>