Re: [PATCH] KVM: nVMX: allow tests to use bad virtual-APIC page address

From: Sean Christopherson
Date: Sat Apr 20 2019 - 01:35:21 EST


On Mon, Apr 15, 2019 at 03:35:33PM +0200, Paolo Bonzini wrote:
> As mentioned in the comment, there are some special cases where we can simply
> clear the TPR shadow bit from the CPU-based execution controls in the vmcs02.
> Handle them so that we can remove some XFAILs from vmx.flat.
>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> arch/x86/kvm/vmx/nested.c | 25 ++++++++++++++++---------
> arch/x86/kvm/vmx/vmx.c | 2 +-
> arch/x86/kvm/vmx/vmx.h | 2 ++
> 3 files changed, 19 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 7ec9bb1dd723..a22af5a85540 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2873,20 +2873,27 @@ static void nested_get_vmcs12_pages(struct kvm_vcpu *vcpu)
> /*
> * If translation failed, VM entry will fail because
> * prepare_vmcs02 set VIRTUAL_APIC_PAGE_ADDR to -1ull.
> - * Failing the vm entry is _not_ what the processor
> - * does but it's basically the only possibility we
> - * have. We could still enter the guest if CR8 load
> - * exits are enabled, CR8 store exits are enabled, and
> - * virtualize APIC access is disabled; in this case
> - * the processor would never use the TPR shadow and we
> - * could simply clear the bit from the execution
> - * control. But such a configuration is useless, so
> - * let's keep the code simple.
> */
> if (!is_error_page(page)) {
> vmx->nested.virtual_apic_page = page;
> hpa = page_to_phys(vmx->nested.virtual_apic_page);
> vmcs_write64(VIRTUAL_APIC_PAGE_ADDR, hpa);
> + } else if (nested_cpu_has(vmcs12, CPU_BASED_CR8_LOAD_EXITING) &&
> + nested_cpu_has(vmcs12, CPU_BASED_CR8_STORE_EXITING) &&
> + !nested_cpu_has2(vmcs12, SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES)) {
> + /*
> + * The processor will never use the TPR shadow, simply
> + * clear the bit from the execution control. Such a
> + * configuration is useless, but it happens in tests.
> + * For any other configuration, failing the vm entry is
> + * _not_ what the processor does but it's basically the
> + * only possibility we have.
> + */
> + vmcs_clear_bits(CPU_BASED_VM_EXEC_CONTROL,
> + CPU_BASED_TPR_SHADOW);
> + } else {
> + printk("bad virtual-APIC page address\n");
> + dump_vmcs();

I don't think we should dump the VMCS here, or have any form of print
at all. dump_vmcs() is especially bad as it allows userspace to spam
the kernel log at the error level. I haven't actually checked, but I
assume the nested consistency check tracing patch would be a better
way to debug something of this nature in production.

> }
> }
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index f8054dc1de65..14cacfd7ffd0 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -5603,7 +5603,7 @@ static void vmx_dump_dtsel(char *name, uint32_t limit)
> vmcs_readl(limit + GUEST_GDTR_BASE - GUEST_GDTR_LIMIT));
> }
>
> -static void dump_vmcs(void)
> +void dump_vmcs(void)
> {
> u32 vmentry_ctl = vmcs_read32(VM_ENTRY_CONTROLS);
> u32 vmexit_ctl = vmcs_read32(VM_EXIT_CONTROLS);
> diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
> index a1e00d0a2482..f879529906b4 100644
> --- a/arch/x86/kvm/vmx/vmx.h
> +++ b/arch/x86/kvm/vmx/vmx.h
> @@ -517,4 +517,6 @@ static inline void decache_tsc_multiplier(struct vcpu_vmx *vmx)
> vmcs_write64(TSC_MULTIPLIER, vmx->current_tsc_ratio);
> }
>
> +void dump_vmcs(void);
> +
> #endif /* __KVM_X86_VMX_H */
> --
> 1.8.3.1
>