Re: [PATCH] x86/fpu/xstate: clear XSAVE features if DISABLED_MASK set

From: Sean Christopherson
Date: Tue Jun 06 2023 - 19:40:50 EST


On Mon, Jun 05, 2023, Jon Kohler wrote:
> > On May 31, 2023, at 5:09 PM, Jon Kohler <jon@xxxxxxxxxxx> wrote:
> >> The CPUID bits that enumerate support for a feature are independent from the CPUID
> >> bits that enumerate what XCR0 bits are supported, i.e. what features can be saved
> >> and restored via XSAVE/XRSTOR.
> >>
> >> KVM does mostly account for host XCR0, but in a very ad hoc way. E.g. MPX is
> >> handled by manually checking host XCR0.
> >>
> >> 	if (kvm_mpx_supported())
> >> 		kvm_cpu_cap_check_and_set(X86_FEATURE_MPX);
> >>
> >> PKU manually checks too, but indirectly by looking at whether or not the kernel
> >> has enabled CR4.OSPKE.
> >>
> >> 	if (!tdp_enabled || !boot_cpu_has(X86_FEATURE_OSPKE))
> >> 		kvm_cpu_cap_clear(X86_FEATURE_PKU);
> >>
> >> But unless I'm missing something, the various AVX and AMX bits rely solely on
> >> boot_cpu_data, i.e. would break if someone added CONFIG_X86_AVX or CONFIG_X86_AMX.
> >
> > What if we simply moved static unsigned short xsave_cpuid_features[] … into
> > xstate.h, which is already included in arch/x86/kvm/cpuid.c, and do
> > something similar to what I’m proposing in this patch already
> >
> > This would future proof such breakages I’d imagine?
> >
> > void kvm_set_cpu_caps(void)
> > {
> > 	...
> > 	/*
> > 	 * Clear CPUID for XSAVE features that are disabled.
> > 	 */
> > 	for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
> > 		unsigned short cid = xsave_cpuid_features[i];
> >
> > 		/* Careful: X86_FEATURE_FPU is 0! */
> > 		if ((i != XFEATURE_FP && !cid) || !boot_cpu_has(cid) ||
> > 		    !cpu_feature_enabled(cid))
> > 			kvm_cpu_cap_clear(cid);
> > 	}
> > 	…
> > }
> >
>
> Sean - following up on this rough idea code above, wanted to validate that
> this was the direction you were thinking of having kvm_set_cpu_caps() clear
> caps when a particular xsave feature was disabled?

Ya, more or less. But for KVM, that should be kvm_cpu_cap_has(), not boot_cpu_has().
And then I think KVM could actually WARN on a feature being disabled, i.e. put up
a tripwire to detect if things change in the future and the kernel lets the user
disable a feature that KVM wants to expose to a guest.

Side topic, I find the "cid" nomenclature super confusing, and the established
name in KVM is x86_feature.

Something like this?

	for (i = 0; i < ARRAY_SIZE(xsave_cpuid_features); i++) {
		unsigned int x86_feature = xsave_cpuid_features[i];

		if (i != XFEATURE_FP && !x86_feature)
			continue;

		if (!kvm_cpu_cap_has(x86_feature))
			continue;

		if (WARN_ON_ONCE(!cpu_feature_enabled(x86_feature)))
			kvm_cpu_cap_clear(x86_feature);
	}
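
For anyone who wants to poke at the control flow outside the kernel, below is a
rough userspace sketch of the same loop. Everything prefixed sk_/SK_ is a
hypothetical stand-in invented for illustration, not a kernel symbol: the
feature numbers, the four-entry table, and the two bool arrays that play the
roles of kvm_cpu_cap_has() and cpu_feature_enabled(). WARN_ON_ONCE() is
approximated with a one-shot fprintf(). It is only meant to show the three
checks (skip table holes, skip features KVM doesn't advertise, warn and clear
if the kernel has the feature disabled), not how the real patch would look.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical xfeature indices; only FP being index 0 mirrors the kernel. */
enum { SK_XFEATURE_FP, SK_XFEATURE_SSE, SK_XFEATURE_YMM, SK_XFEATURE_HOLE };

/* Hypothetical feature numbers; FPU is 0, like X86_FEATURE_FPU. */
enum { SK_FEATURE_FPU, SK_FEATURE_XMM, SK_FEATURE_AVX, SK_NR_FEATURES };

/* Stand-in for xsave_cpuid_features[]; a 0 entry past FP means "no CPUID bit". */
static const unsigned short sk_xsave_features[] = {
	[SK_XFEATURE_FP]   = SK_FEATURE_FPU,
	[SK_XFEATURE_SSE]  = SK_FEATURE_XMM,
	[SK_XFEATURE_YMM]  = SK_FEATURE_AVX,
	[SK_XFEATURE_HOLE] = 0,
};

/*
 * kvm_caps[f] plays the role of kvm_cpu_cap_has(f), kernel_enabled[f] the role
 * of cpu_feature_enabled(f).  Returns the number of caps that were cleared.
 */
size_t sk_clear_disabled_xsave_caps(bool *kvm_caps, const bool *kernel_enabled)
{
	static bool warned;	/* crude approximation of WARN_ON_ONCE() */
	size_t i, cleared = 0;

	for (i = 0; i < sizeof(sk_xsave_features) / sizeof(sk_xsave_features[0]); i++) {
		unsigned int feature = sk_xsave_features[i];

		/* Feature 0 is valid only for the FP slot; elsewhere it's a hole. */
		if (i != SK_XFEATURE_FP && !feature)
			continue;

		/* Nothing to do if KVM isn't advertising the feature anyway. */
		if (!kvm_caps[feature])
			continue;

		if (!kernel_enabled[feature]) {
			if (!warned) {
				fprintf(stderr, "WARN: feature %u advertised but disabled\n",
					feature);
				warned = true;
			}
			kvm_caps[feature] = false;
			cleared++;
		}
	}
	return cleared;
}
```

With all three caps set and only SK_FEATURE_AVX marked disabled, the first call
clears exactly that cap and warns once; a second call finds nothing left to
clear.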