Re: [PATCH v4 19/19] KVM: VMX: Skip VMCLEAR logic during emergency reboots if CR4.VMXE=0

From: Huang, Kai
Date: Tue Jul 25 2023 - 18:30:39 EST


On Tue, 2023-07-25 at 11:15 -0700, Sean Christopherson wrote:
> On Tue, Jul 25, 2023, Kai Huang wrote:
> > On Fri, 2023-07-21 at 13:18 -0700, Sean Christopherson wrote:
> > > Bail from vmx_emergency_disable() without processing the list of loaded
> > > VMCSes if CR4.VMXE=0, i.e. if the CPU can't be post-VMXON. It should be
> > > impossible for the list to have entries if VMX is already disabled, and
> > > even if that invariant doesn't hold, VMCLEAR will #UD anyways, i.e.
> > > processing the list is pointless even if it somehow isn't empty.
> > >
> > > Assuming no existing KVM bugs, this should be a glorified nop. The
> > > primary motivation for the change is to avoid having code that looks like
> > > it does VMCLEAR, but then skips VMXON, which is nonsensical.
> > >
> > > Suggested-by: Kai Huang <kai.huang@xxxxxxxxx>
> > > Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
> > > ---
> > > arch/x86/kvm/vmx/vmx.c | 12 ++++++++++--
> > > 1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index 5d21931842a5..0ef5ede9cb7c 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -773,12 +773,20 @@ static void vmx_emergency_disable(void)
> > >
> > > kvm_rebooting = true;
> > >
> > > + /*
> > > + * Note, CR4.VMXE can be _cleared_ in NMI context, but it can only be
> > > + * set in task context. If this races with VMX being disabled by an NMI,
> > > + * VMCLEAR and VMXOFF may #UD, but KVM will eat those faults because
> > > + * kvm_rebooting is set.
> > > + */
> >
> > I am not quite following this comment. IIUC this code path is only called from
> > NMI context in case of emergency VMX disable.
>
> The CPU that initiates the emergency reboot can invoke the callback from process
> context; only the responding CPUs are guaranteed to be handled via NMI shootdown.
> E.g. `reboot -f` will reach this point synchronously.
>
> > How can it race with "VMX is disabled by an NMI"?
>
> Somewhat theoretically, a different CPU could panic() and do a shootdown of the
> CPU that is handling `reboot -f`.

Yeah, this is the only case I can think of too.

Anyway, LGTM. Thanks for explaining.
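
For anyone reading along, here is a rough userspace model of the flow discussed
above. It is only a sketch and not the kernel code: struct cpu_model,
model_vmx_insn(), model_emergency_disable() and the nmi_race flag are all made
up for illustration, and the "NMI" is just a flag flipped at the worst possible
moment. The only things taken from the patch are the early bail when
CR4.VMXE=0 and the idea that faults are eaten while kvm_rebooting is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* CR4.VMXE is bit 13 of CR4 on x86. */
#define X86_CR4_VMXE (1UL << 13)

/* Hypothetical stand-in for one CPU's state; not a kernel structure. */
struct cpu_model {
	unsigned long cr4;	/* only the VMXE bit is modeled */
	int loaded_vmcs;	/* entries on this CPU's loaded-VMCS list */
	int vmclears;		/* VMCLEARs that actually executed */
	int faults_eaten;	/* #UDs swallowed because kvm_rebooting is set */
};

static bool kvm_rebooting;

/*
 * Model of a VMX instruction (VMCLEAR or VMXOFF): it #UDs when CR4.VMXE=0.
 * In this model a fault is only tolerated while kvm_rebooting is set.
 */
static bool model_vmx_insn(struct cpu_model *cpu)
{
	if (cpu->cr4 & X86_CR4_VMXE)
		return true;

	assert(kvm_rebooting);
	cpu->faults_eaten++;
	return false;
}

/* Simplified model of the emergency disable path discussed in this thread. */
static void model_emergency_disable(struct cpu_model *cpu, bool nmi_race)
{
	assert(cpu != NULL);

	kvm_rebooting = true;

	/* The new check: if CR4.VMXE=0, the CPU can't be post-VMXON, so bail. */
	if (!(cpu->cr4 & X86_CR4_VMXE))
		return;

	/*
	 * Assumed race: another CPU panics and its NMI shootdown clears
	 * CR4.VMXE on this CPU right after the check above.
	 */
	if (nmi_race)
		cpu->cr4 &= ~X86_CR4_VMXE;

	/* VMCLEAR every loaded VMCS; each one faults if VMX was disabled. */
	for (int i = 0; i < cpu->loaded_vmcs; i++)
		if (model_vmx_insn(cpu))
			cpu->vmclears++;

	/* VMXOFF, then clear CR4.VMXE; VMXOFF faults too in the racy case. */
	if (model_vmx_insn(cpu))
		cpu->cr4 &= ~X86_CR4_VMXE;
}
```

With CR4.VMXE clear on entry, the model returns before touching the list. With
CR4.VMXE set and no race, it issues one VMCLEAR per list entry and then leaves
VMX. With the race, every VMCLEAR and the final VMXOFF fault, and all of those
faults are eaten.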