Re: [PATCH v5 1/2] KVM: VMX: Cleanup VMX basic information defines and usages

From: Sean Christopherson
Date: Tue Feb 13 2024 - 17:54:31 EST


Please send cover letters for series with more than one patch, even if there are
only two patches. At the very least, cover letters are a convenient location to
provide feedback/communication for the series as a whole. Instead, I need to
put it here:

I'll send a v6 with all of my suggestions incorporated. I like the cleanups, but
there are too many process issues to fixup when applying, a few things that I
straight up disagree with, and more aggressive memtype related changes that can
be done in the context of this series.

On Tue, Feb 06, 2024, Xin Li wrote:
> Define VMX basic information fields with BIT_ULL()/GENMASK_ULL(), and
> replace hardcoded VMX basic numbers with these field macros.
>
> Save the full/raw value of MSR_IA32_VMX_BASIC in the global vmcs_config
> as type u64 to get rid of the hi/lo crud, and then use VMX_BASIC helpers
> to extract info as needed.
>
> VMX_EPTP_MT_{WB,UC} values 0x6 and 0x0 are generic x86 memory type
> values, no need to prefix them with VMX_EPTP_.

*sigh*

This obviously, like super duper obviously, should be at least three distinct
patches. The changelog has three paragraphs that have *zero* relation to each
other, and the changelog doesn't even cover all of the opportunistic cleanups
that are being done.

> +/* x86 memory types, explicitly used in VMX only */
> +#define MEM_TYPE_WB 0x6ULL
> +#define MEM_TYPE_UC 0x0ULL

No, this is ridiculous. These values are architectural, there's no reason for
KVM to have yet another copy. The MTRRs #defines have goofy names, and are
incomplete, but it's trivial to move the enums from pat/memtype.c to msr-index.h.

> @@ -505,8 +521,6 @@ enum vmcs_field {
> #define VMX_EPTP_PWL_5 0x20ull
> #define VMX_EPTP_AD_ENABLE_BIT (1ull << 6)
> #define VMX_EPTP_MT_MASK 0x7ull
> -#define VMX_EPTP_MT_WB 0x6ull
> -#define VMX_EPTP_MT_UC 0x0ull

I would strongly prefer to keep the VMX_EPTP_MT_WB and VMX_EPTP_MT_UC defines,
at least so long as KVM is open coding reads and writes to the EPTP. E.g. if
someone wants to do a follow-up series that adds wrappers to decode/encode the
memtype (and other fiels) from/to EPTP values, then I'd be fine dropping these.

But this:


/* Check for memory type validity */
switch (new_eptp & VMX_EPTP_MT_MASK) {
case MEM_TYPE_UC:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_UC_BIT)))
return false;
break;
case MEM_TYPE_WB:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPTP_WB_BIT)))
return false;
break;
default:
return false;
}

looks wrong and is actively confusing, especially when the code below it does:

/* Page-walk levels validity. */
switch (new_eptp & VMX_EPTP_PWL_MASK) {
case VMX_EPTP_PWL_5:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPT_PAGE_WALK_5_BIT)))
return false;
break;
case VMX_EPTP_PWL_4:
if (CC(!(vmx->nested.msrs.ept_caps & VMX_EPT_PAGE_WALK_4_BIT)))
return false;
break;
default:
return false;
}

> static inline bool cpu_has_virtual_nmis(void)
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 994e014f8a50..80fea1875948 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -1226,23 +1226,29 @@ static bool is_bitwise_subset(u64 superset, u64 subset, u64 mask)
> return (superset | subset) == superset;
> }
>
> +#define VMX_BASIC_FEATURES_MASK \
> + (VMX_BASIC_DUAL_MONITOR_TREATMENT | \
> + VMX_BASIC_INOUT | \
> + VMX_BASIC_TRUE_CTLS)
> +
> +#define VMX_BASIC_RESERVED_BITS \
> + (GENMASK_ULL(63, 56) | GENMASK_ULL(47, 45) | BIT_ULL(31))

Looking at this with fresh eyes, I think #defines are overkill. There is zero
chance anything other than vmx_restore_vmx_basic() will use these, and the feature
bits mask is rather weird. It's not a mask of features that KVM supports, it's
a mask of feature *bits* that KVM knows about.

So rather than add #defines, I think we can keep "const u64" variables, but split
into feature_bits and reserved_bits (the latter will have open coded GENMASK_ULL()
usage, whereas the former will not).

BUILD_BUG_ON() is fancy enough that it can detect overlap.

> @@ -6994,6 +7000,9 @@ static void nested_vmx_setup_misc_data(struct vmcs_config *vmcs_conf,
> msrs->misc_high = 0;
> }
>
> +#define VMX_BSAIC_VMCS12_SIZE ((u64)VMCS12_SIZE << 32)

Typo.

> +#define VMX_BASIC_MEM_TYPE_WB (MEM_TYPE_WB << 50)

I don't see any value in either of these. In fact, I find them both to be far
more confusing, and much more likely to be incorrectly used.

Back in v1, when I said "don't bother with shift #defines", I was very specifically
talking about feature bits where defining the bit shift is an extra, pointless
layer. I even (tried) to clarify that.