Re: [PATCH] KVM: nVMX: remove side effects from nested_vmx_exit_reflected

From: Vitaly Kuznetsov
Date: Wed Mar 18 2020 - 06:52:56 EST


Paolo Bonzini <pbonzini@xxxxxxxxxx> writes:

> The name of nested_vmx_exit_reflected suggests that it's purely
> a test, but it actually marks VMCS12 pages as dirty. Move this to
> vmx_handle_exit, observing that the initial nested_run_pending check in
> nested_vmx_exit_reflected is pointless---nested_run_pending has just
> been cleared in vmx_vcpu_run and won't be set until handle_vmlaunch
> or handle_vmresume.
>
> Suggested-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> arch/x86/kvm/vmx/nested.c | 18 ++----------------
> arch/x86/kvm/vmx/nested.h | 1 +
> arch/x86/kvm/vmx/vmx.c | 19 +++++++++++++++++--
> 3 files changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 8578513907d7..4ff859c99946 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3527,7 +3527,7 @@ static void vmcs12_save_pending_event(struct kvm_vcpu *vcpu,
> }
>
>
> -static void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu)
> +void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu)
> {
> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
> gfn_t gfn;
> @@ -5543,8 +5543,7 @@ bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason)
> struct vcpu_vmx *vmx = to_vmx(vcpu);
> struct vmcs12 *vmcs12 = get_vmcs12(vcpu);
>
> - if (vmx->nested.nested_run_pending)
> - return false;
> + WARN_ON_ONCE(vmx->nested.nested_run_pending);
>
> if (unlikely(vmx->fail)) {
> trace_kvm_nested_vmenter_failed(
> @@ -5553,19 +5552,6 @@ bool nested_vmx_exit_reflected(struct kvm_vcpu *vcpu, u32 exit_reason)
> return true;
> }
>
> - /*
> - * The host physical addresses of some pages of guest memory
> - * are loaded into the vmcs02 (e.g. vmcs12's Virtual APIC
> - * Page). The CPU may write to these pages via their host
> - * physical address while L2 is running, bypassing any
> - * address-translation-based dirty tracking (e.g. EPT write
> - * protection).
> - *
> - * Mark them dirty on every exit from L2 to prevent them from
> - * getting out of sync with dirty tracking.
> - */
> - nested_mark_vmcs12_pages_dirty(vcpu);
> -
> trace_kvm_nested_vmexit(kvm_rip_read(vcpu), exit_reason,
> vmcs_readl(EXIT_QUALIFICATION),
> vmx->idt_vectoring_info,
> diff --git a/arch/x86/kvm/vmx/nested.h b/arch/x86/kvm/vmx/nested.h
> index 21d36652f213..f70968b76d33 100644
> --- a/arch/x86/kvm/vmx/nested.h
> +++ b/arch/x86/kvm/vmx/nested.h
> @@ -33,6 +33,7 @@ void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
> int get_vmx_mem_address(struct kvm_vcpu *vcpu, unsigned long exit_qualification,
> u32 vmx_instruction_info, bool wr, int len, gva_t *ret);
> void nested_vmx_pmu_entry_exit_ctls_update(struct kvm_vcpu *vcpu);
> +void nested_mark_vmcs12_pages_dirty(struct kvm_vcpu *vcpu);
> bool nested_vmx_check_io_bitmaps(struct kvm_vcpu *vcpu, unsigned int port,
> int size);
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index b447d66f44e6..07299a957d4a 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5851,8 +5851,23 @@ static int vmx_handle_exit(struct kvm_vcpu *vcpu,
> if (vmx->emulation_required)
> return handle_invalid_guest_state(vcpu);
>
> - if (is_guest_mode(vcpu) && nested_vmx_exit_reflected(vcpu, exit_reason))
> - return nested_vmx_reflect_vmexit(vcpu, exit_reason);
> + if (is_guest_mode(vcpu)) {
> + /*
> + * The host physical addresses of some pages of guest memory
> + * are loaded into the vmcs02 (e.g. vmcs12's Virtual APIC
> + * Page). The CPU may write to these pages via their host
> + * physical address while L2 is running, bypassing any
> + * address-translation-based dirty tracking (e.g. EPT write
> + * protection).
> + *
> + * Mark them dirty on every exit from L2 to prevent them from
> + * getting out of sync with dirty tracking.
> + */
> + nested_mark_vmcs12_pages_dirty(vcpu);
> +
> + if (nested_vmx_exit_reflected(vcpu, exit_reason))
> + return nested_vmx_reflect_vmexit(vcpu, exit_reason);
> + }
>
> if (exit_reason & VMX_EXIT_REASONS_FAILED_VMENTRY) {
> dump_vmcs();

The only functional difference seems to be that we now call
nested_mark_vmcs12_pages_dirty() in the vmx->fail case too. That seems
superfluous: we failed to enter L2, so the 'special' pages should remain
intact (right?). This should be an uncommon case anyway.

Reviewed-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>

--
Vitaly