Re: [PATCH v2 9/9] KVM: x86: never write to memory from kvm_vcpu_check_block

From: Maxim Levitsky
Date: Wed Aug 17 2022 - 10:11:59 EST


On Tue, 2022-08-16 at 23:45 +0000, Sean Christopherson wrote:
> On Thu, Aug 11, 2022, Paolo Bonzini wrote:
> > kvm_vcpu_check_block() is called while not in TASK_RUNNING, and therefore
> > it cannot sleep.  Writing to guest memory is therefore forbidden, but it
> > can happen on AMD processors if kvm_check_nested_events() causes a vmexit.
> >
> > Fortunately, all events that are caught by kvm_check_nested_events() are
> > also recognized by kvm_vcpu_has_events() through vendor callbacks such as
> > kvm_x86_interrupt_allowed() or kvm_x86_ops.nested_ops->has_events(), so
> > remove the call and postpone the actual processing to vcpu_block().
> >
> > Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> > ---
> >  arch/x86/kvm/x86.c | 14 +++++++++++---
> >  1 file changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 5e9358ea112b..9226fd536783 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -10639,6 +10639,17 @@ static inline int vcpu_block(struct kvm_vcpu *vcpu)
> >                         return 1;
> >         }
> >  
> > +       if (is_guest_mode(vcpu)) {
> > +               /*
> > +                * Evaluate nested events before exiting the halted state.
> > +                * This allows the halt state to be recorded properly in
> > +                * the VMCS12's activity state field (AMD does not have
> > +                * a similar field and a vmexit always causes a spurious
> > +                * wakeup from HLT).
> > +                */

I assume the comment refers to the fact that a nested_vmx_vmexit() caused by an
event arriving at the HLT instruction updates 'vmcs12->guest_activity_state',
so it has to be done before we update 'vcpu->arch.mp_state'.
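
To make the ordering concern concrete, here is a toy model in plain C. It is only
a sketch of the assumption stated above, not KVM code: every toy_* name, the
struct layout and the check_first flag are hypothetical and exist purely for
illustration.

```c
#include <stdbool.h>

/* Hypothetical toy types; none of these names exist in KVM. */
enum toy_mp_state { TOY_MP_RUNNABLE, TOY_MP_HALTED };
enum toy_activity { TOY_ACTIVITY_ACTIVE, TOY_ACTIVITY_HLT };

struct toy_vcpu {
	bool guest_mode;                   /* running a nested (L2) guest */
	bool event_pending;                /* an event that forces a nested exit */
	enum toy_mp_state mp_state;        /* stand-in for vcpu->arch.mp_state */
	enum toy_activity vmcs12_activity; /* stand-in for the vmcs12 activity field */
};

/*
 * Stand-in for a nested VM-Exit: the activity state written to the toy
 * vmcs12 is derived from whatever mp_state holds at this moment.
 */
static void toy_nested_vmexit(struct toy_vcpu *v)
{
	v->vmcs12_activity = (v->mp_state == TOY_MP_HALTED) ?
			     TOY_ACTIVITY_HLT : TOY_ACTIVITY_ACTIVE;
	v->guest_mode = false;
	v->event_pending = false;
}

/* Stand-in for the nested event check: exit to L1 if an event is pending. */
static void toy_check_nested_events(struct toy_vcpu *v)
{
	if (v->guest_mode && v->event_pending)
		toy_nested_vmexit(v);
}

/*
 * Leave the halted state. With check_first set, nested events are evaluated
 * while mp_state still says HALTED; otherwise they are evaluated only after
 * mp_state was already overwritten, and the halt is lost.
 */
static void toy_wake(struct toy_vcpu *v, bool check_first)
{
	if (check_first)
		toy_check_nested_events(v);

	v->mp_state = TOY_MP_RUNNABLE;

	if (!check_first)
		toy_check_nested_events(v);
}
```

In this model, a halted vCPU in guest mode with a pending event ends up with
TOY_ACTIVITY_HLT in the toy vmcs12 only when toy_wake() is called with
check_first set. With the opposite order it records TOY_ACTIVITY_ACTIVE.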


> > +               kvm_check_nested_events(vcpu);
>
> Formatting nit, I'd prefer the block comment go above the if-statement, that way
> we avoid debating whether or not the technically-unnecessary braces align with
> kernel/KVM style, and it doesn't have to wrap as aggressively.
>
> And s/vmexit/VM-Exit while I'm nitpicking.
>
>         /*
>          * Evaluate nested events before exiting the halted state.  This allows
>          * the halt state to be recorded properly in the VMCS12's activity
>          * state field (AMD does not have a similar field and a VM-Exit always
>          * causes a spurious wakeup from HLT).
>          */
>         if (is_guest_mode(vcpu))
>                 kvm_check_nested_events(vcpu);
>
> Side topic, the AMD behavior is a bug report waiting to happen.  I know of at least
> one customer failure that was root caused to a KVM bug where KVM caused a spurious
> wakeup.  To be fair, the guest workload was being stupid (execute HLT on vCPU and
> then effectively unmap its code by doing kexec), but it's still an unpleasant gap :-(

Reviewed-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>

Best regards,
Maxim Levitsky
