Re: [PATCH 2/5] KVM: x86: Constrain guest-supported xfeatures only at KVM_GET_XSAVE{2}

From: Dave Hansen
Date: Thu Sep 28 2023 - 10:09:59 EST


On 9/27/23 17:19, Sean Christopherson wrote:
> Mask off xfeatures that aren't exposed to the guest only when saving guest
> state via KVM_GET_XSAVE{2} instead of modifying user_xfeatures directly.
> Preserving the maximal set of xfeatures in user_xfeatures restores KVM's
> ABI for KVM_SET_XSAVE, which prior to commit ad856280ddea ("x86/kvm/fpu:
> Limit guest user_xfeatures to supported bits of XCR0") allowed userspace
> to load xfeatures that are supported by the host, irrespective of what
> xfeatures are exposed to the guest.
>
> There is no known use case where userspace *intentionally* loads xfeatures
> that aren't exposed to the guest, but the bug fixed by commit ad856280ddea
> was specifically that KVM_GET_XSAVE{2} would save xfeatures that weren't
> exposed to the guest, e.g. would lead to userspace unintentionally loading
> guest-unsupported xfeatures when live migrating a VM.
>
> Restricting KVM_SET_XSAVE to guest-supported xfeatures is especially
> problematic for QEMU-based setups, as QEMU has a bug where instead of
> terminating the VM if KVM_SET_XSAVE fails, QEMU instead simply stops
> loading guest state, i.e. resumes the guest after live migration with
> incomplete guest state, and ultimately results in guest data corruption.
>
> Note, letting userspace restore all host-supported xfeatures does not fix
> setups where a VM is migrated from a host *without* commit ad856280ddea,
> to a target with a subset of host-supported xfeatures. However there is
> no way to safely address that scenario, e.g. KVM could silently drop the
> unsupported features, but that would be a clear violation of KVM's ABI and
> so would require userspace to opt-in, at which point userspace could
> simply be updated to sanitize the to-be-loaded XSAVE state.
Acked-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>

It's surprising (and nice) that this eliminates the !guest check
in fpstate_realloc().
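
For readers following the changelog, here is a rough standalone model of
the behavior it describes: the save path masks by what the guest is
exposed to, while the load path accepts anything the host supports. This
is only a sketch, not the kernel code. The struct, the sketch_* function
names, and the -1 error return are hypothetical; only the XCR0 bit
positions are architectural.

```c
#include <assert.h>
#include <stdint.h>

/* XCR0-style feature bits (architectural bit positions). */
#define XFEATURE_MASK_FP    (1ULL << 0)
#define XFEATURE_MASK_SSE   (1ULL << 1)
#define XFEATURE_MASK_YMM   (1ULL << 2)
#define XFEATURE_MASK_PKRU  (1ULL << 9)

/* Hypothetical stand-in for per-vCPU FPU state; NOT the kernel's struct. */
struct sketch_vcpu {
	uint64_t user_xfeatures;  /* maximal host-supported set, never narrowed */
	uint64_t guest_supported; /* xfeatures exposed to the guest */
	uint64_t xstate_bv;       /* xfeatures that currently hold state */
};

/* Save path (models KVM_GET_XSAVE{2}): mask off guest-unsupported bits here. */
static uint64_t sketch_get_xsave(const struct sketch_vcpu *v)
{
	return v->xstate_bv & v->user_xfeatures & v->guest_supported;
}

/*
 * Load path (models KVM_SET_XSAVE): accept any host-supported xfeature,
 * irrespective of what the guest is exposed to. Returns 0 on success,
 * -1 (a placeholder error value) if a host-unsupported bit is set.
 */
static int sketch_set_xsave(struct sketch_vcpu *v, uint64_t xstate_bv)
{
	if (xstate_bv & ~v->user_xfeatures)
		return -1;
	v->xstate_bv = xstate_bv;
	return 0;
}

/* Usage: host supports PKRU, guest is not exposed to it. */
static void sketch_example(void)
{
	struct sketch_vcpu v = {
		.user_xfeatures  = XFEATURE_MASK_FP | XFEATURE_MASK_SSE |
				   XFEATURE_MASK_YMM | XFEATURE_MASK_PKRU,
		.guest_supported = XFEATURE_MASK_FP | XFEATURE_MASK_SSE |
				   XFEATURE_MASK_YMM,
		.xstate_bv       = 0,
	};

	/* Loading PKRU state succeeds because the host supports it. */
	assert(sketch_set_xsave(&v, XFEATURE_MASK_FP | XFEATURE_MASK_PKRU) == 0);
	/* Saving drops PKRU because the guest is not exposed to it. */
	assert(sketch_get_xsave(&v) == XFEATURE_MASK_FP);
}
```

In this model the load side stays permissive and only the save side
filters, so a blob saved by an older host can still be loaded as long as
the target host supports every bit in it.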