Re: [PATCH] KVM: x86: Allow XSAVES on CPUs where host doesn't use it due to an errata

From: Maciej S. Szmigiero
Date: Mon Nov 27 2023 - 12:48:24 EST


On 27.11.2023 18:24, Sean Christopherson wrote:
> On Thu, Nov 23, 2023, Maciej S. Szmigiero wrote:
>> From: "Maciej S. Szmigiero" <maciej.szmigiero@xxxxxxxxxx>
>>
>> Since commit b0563468eeac ("x86/CPU/AMD: Disable XSAVES on AMD family 0x17")
>> kernel unconditionally clears the XSAVES CPU feature bit on Zen1/2 CPUs.
>>
>> Since KVM CPU caps are initialized from the kernel boot CPU features this
>> makes the XSAVES feature also unavailable for KVM guests in this case, even
>> though they might want to decide on their own whether they are affected by
>> this errata.
>>
>> Allow KVM guests to make such decision by setting the XSAVES KVM CPU
>> capability bit based on the actual CPU capability
>
> This is not generally safe, as the guest can make such a decision if and only if
> the Family/Model/Stepping information is reasonably accurate.

If one lies to the guest about the CPU it is running on then obviously
things may work non-optimally.
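
To illustrate the kind of guest-side decision I mean, here is a minimal
sketch in C. It is not what Linux or Windows actually does; the function
names and the vendor_is_amd parameter are made up for illustration. It only
relies on the architectural CPUID layout: family/model are decoded from
CPUID.01H:EAX, and XSAVES is advertised in CPUID.(EAX=0DH,ECX=1):EAX bit 3.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.(EAX=0DH,ECX=1):EAX bit 3 advertises XSAVES/XRSTORS. */
#define CPUID_D1_EAX_XSAVES	(1u << 3)

/* Family 0x17 is the one named in the host-side commit (Zen1/2). */
#define AMD_FAMILY_ZEN1_2	0x17u

/* Decode the display family from CPUID.01H:EAX. */
static unsigned int cpuid_family(uint32_t eax)
{
	unsigned int base = (eax >> 8) & 0xf;
	unsigned int ext = (eax >> 20) & 0xff;

	/* The extended family is only added when the base family is 0xf. */
	return base == 0xf ? base + ext : base;
}

/* Decode the display model from CPUID.01H:EAX. */
static unsigned int cpuid_model(uint32_t eax)
{
	unsigned int base_family = (eax >> 8) & 0xf;
	unsigned int base = (eax >> 4) & 0xf;
	unsigned int ext = (eax >> 16) & 0xf;

	/* The extended model only applies to base families 0x6 and 0xf. */
	if (base_family == 0x6 || base_family == 0xf)
		return (ext << 4) | base;
	return base;
}

/*
 * Hypothetical guest policy: use XSAVES only if it is advertised and the
 * reported family is not the affected one. This is only as good as the
 * FMS data the hypervisor exposes to the guest.
 */
static bool guest_should_use_xsaves(uint32_t leaf1_eax, uint32_t leafd1_eax,
				    bool vendor_is_amd)
{
	if (!(leafd1_eax & CPUID_D1_EAX_XSAVES))
		return false;
	if (vendor_is_amd && cpuid_family(leaf1_eax) == AMD_FAMILY_ZEN1_2)
		return false;
	return true;
}
```

For example, a CPUID.01H:EAX value of 0x00870F10 decodes to family 0x17,
model 0x71, so the policy above would avoid XSAVES even when the bit is
advertised. If the VMM reports a different family, the same check silently
passes, which is the "lying to the guest" case.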

>> This fixes booting Hyper-V enabled Windows Server 2016 VMs with more than
>> one vCPU on Zen1/2 CPUs.
>
> How/why does lack of XSAVES break a multi-vCPU setup? Is Windows blindly doing
> XSAVES based on FMS?

The hypercall from L2 Windows to L1 Hyper-V asking to boot the first AP
returns HV_STATUS_CPUID_XSAVE_FEATURE_VALIDATION_ERROR.

It's apparently a "should never happen" scenario for Windows since it
crashes soon after.

That's why uniprocessor configurations aren't broken - the BSP
doesn't need to be specifically booted by the L2 guest.

Unfortunately, Windows Server 2016 mainstream support ended in
January 2022, so it now receives only security updates.
And you can't really break into an OS that you can't even start.

Thanks,
Maciej