Re: [PATCH] KVM: nVMX: Dynamically compute max VMCS index for vmcs12

From: Robert Hoo
Date: Thu Jul 22 2021 - 00:02:24 EST


On Wed, 2021-07-21 at 16:18 +0000, Sean Christopherson wrote:
> On Wed, Jul 21, 2021, Hu, Robert wrote:
> > > > Queued, thanks. Without having checked the kvm-unit-tests
> > > > sources
> > > > very thoroughly, this might be a configuration issue in
> > > > kvm-unit-tests; in theory "-cpu host" (unlike "-cpu
> > > > host,migratable=no") should not enable TSC scaling.
> > >
> > > As noted in the code comments, KVM allows VMREAD/VMWRITE to all
> > > defined
> > > fields, whether or not the field should actually exist for the
> > > vCPU model doesn't
> > > enter into the equation. That's technically wrong as there are a
> > > number of
> > > fields that the SDM explicitly states exist iff a certain feature
> > > is supported.
> >
> > It's right that a number of fields' existence depends on some
> > certain feature.
> > Meanwhile, "IA32_VMX_VMCS_ENUM indicates to software the highest
> > index
> > value used in the encoding of any field *supported* by the
> > processor", rather than
> > *existed*.
>
> Yes.
>
> > So my understanding is no matter what VMCS exec control field's
> > value is set,
> > value of IA32_VMX_VMCS_ENUM shall not be affected, as it reports
> > the physical
> > CPU's capability rather than runtime VMCS configuration.
>
> Yes.
>
> > Back to nested case, L1's VMCS configuration lays the "physical"
> > capability
> > for L2, right?
>
> Yes.
>
> > nested_vmx_msrs or at least nested_vmx_msrs.vmcs_enum shall be put
> > to vcpu
> > scope, rather than current kvm global.
> >
> > Current nested_vmx_calc_vmcs_enum_msr() is invoked at early stage,
> > before
> > vcpu features are settled. I think should be moved to later stage
> > as well.
>
> Just moving the helper (or adding another call) wouldn't fix the
> underlying
> problem that KVM doesn't correctly model the interplay between VMX
> features and
> VMCS fields. It's arguably less wrong than letting userspace stuff
> an incorrect
> value, but it's not 100% correct and ignoring/overriding userspace is
> sketchy at
> best.
I think IA32_VMX_VMCS_ENUM MSR shall not be set by QEMU, it is actually
already indirectly set by user space CPU feature set.

> As suggested below, the full fix is to fail VMREAD/VMWRITE to fields
> that
> shouldn't exist given the vCPU model.
>
> > > To fix that we'd need to add a "feature flag" to
> > > vmcs_field_to_offset_table
> > > that is checked against the vCPU model, though updating the MSR
> > > would
> > > probably fall onto userspace's shoulders?
> > >
> > > And FWIW, this is the QEMU code:
> > >
> > > #define VMCS12_MAX_FIELD_INDEX (0x17)
> > >
> > > static void kvm_msr_entry_add_vmx(X86CPU *cpu, FeatureWordArray
> > > f)
> > > {
> > > ...
> > >
> > > /*
> > > * Just to be safe, write these with constant values. The
> > > CRn_FIXED1
> > > * MSRs are generated by KVM based on the vCPU's CPUID.
> > > */
> > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR0_FIXED0,
> > > CR0_PE_MASK | CR0_PG_MASK | CR0_NE_MASK);
> > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_CR4_FIXED0,
> > > CR4_VMXE_MASK);
> > > kvm_msr_entry_add(cpu, MSR_IA32_VMX_VMCS_ENUM,
> > > VMCS12_MAX_FIELD_INDEX << 1);
> > > }