Re: [PATCH v4 09/15] KVM: SEV: sync FPU and AVX state at LAUNCH_UPDATE_VMSA time

From: Michael Roth
Date: Mon Mar 25 2024 - 05:54:28 EST


On Mon, Mar 18, 2024 at 07:33:46PM -0400, Paolo Bonzini wrote:
> SEV-ES allows passing custom contents for x87, SSE and AVX state into the VMSA.
> Allow userspace to do that with the usual KVM_SET_XSAVE API and only mark
> FPU contents as confidential after it has been copied and encrypted into
> the VMSA.
>
> Since the XSAVE state for AVX is the first, it does not need the
> compacted-state handling of get_xsave_addr(). However, there are other
> parts of XSAVE state in the VMSA that currently are not handled, and
> the validation logic of get_xsave_addr() is pointless to duplicate
> in KVM, so move get_xsave_addr() to public FPU API; it is really just
> a facility to operate on XSAVE state and does not expose any internal
> details of arch/x86/kernel/fpu.
>
> Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Signed-off-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> arch/x86/include/asm/fpu/api.h | 3 +++
> arch/x86/kernel/fpu/xstate.h | 2 --
> arch/x86/kvm/svm/sev.c | 36 ++++++++++++++++++++++++++++++++++
> arch/x86/kvm/svm/svm.c | 8 --------
> 4 files changed, 39 insertions(+), 10 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index a8300646a280..800e836a69fb 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> + xsave = &vcpu->arch.guest_fpu.fpstate->regs.xsave;
> + save->x87_dp = xsave->i387.rdp;
> + save->mxcsr = xsave->i387.mxcsr;
> + save->x87_ftw = xsave->i387.twd;
> + save->x87_fsw = xsave->i387.swd;
> + save->x87_fcw = xsave->i387.cwd;
> + save->x87_fop = xsave->i387.fop;
> + save->x87_ds = 0;
> + save->x87_cs = 0;
> + save->x87_rip = xsave->i387.rip;
> +
> + for (i = 0; i < 8; i++) {
> + d = save->fpreg_x87 + i * 10;
> + s = ((u8 *)xsave->i387.st_space) + i * 16;
> + memcpy(d, s, 10);
> + }
> + memcpy(save->fpreg_xmm, xsave->i387.xmm_space, 256);
> +
> + s = get_xsave_addr(xsave, XFEATURE_YMM);
> + if (s)
> + memcpy(save->fpreg_ymm, s, 256);
> + else
> + memset(save->fpreg_ymm, 0, 256);
> +
> pr_debug("Virtual Machine Save Area (VMSA):\n");
> print_hex_dump_debug("", DUMP_PREFIX_NONE, 16, 1, save, sizeof(*save), false);
>
> @@ -657,6 +686,13 @@ static int __sev_launch_update_vmsa(struct kvm *kvm, struct kvm_vcpu *vcpu,
> if (ret)
> return ret;
>
> + /*
> + * SEV-ES guests maintain an encrypted version of their FPU
> + * state which is restored and saved on VMRUN and VMEXIT.
> + * Mark vcpu->arch.guest_fpu->fpstate as scratch so it won't
> + * do xsave/xrstor on it.
> + */
> + fpstate_set_confidential(&vcpu->arch.guest_fpu);
> vcpu->arch.guest_state_protected = true;
> return 0;
> }
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index c22e87ebf0de..03108055a7b0 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1433,14 +1433,6 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
> vmsa_page = snp_safe_alloc_page(vcpu);
> if (!vmsa_page)
> goto error_free_vmcb_page;
> -
> - /*
> - * SEV-ES guests maintain an encrypted version of their FPU
> - * state which is restored and saved on VMRUN and VMEXIT.
> - * Mark vcpu->arch.guest_fpu->fpstate as scratch so it won't
> - * do xsave/xrstor on it.
> - */
> - fpstate_set_confidential(&vcpu->arch.guest_fpu);

There may have be userspaces that previously relied on KVM_SET_XSAVE
being silently ignored when calculating the expected VMSA measurement.
Granted, that's sort of buggy behavior on the part of userspace, but QEMU
for instance does this. In that case, it just so happens that QEMU's reset
values don't appear to affect the VMSA measurement/contents, but there may
be userspaces where it would.

To avoid this, and have parity with the other interfaces where the new
behavior is gated on the new vm_type/KVM_SEV_INIT2 stuff (via
has_protected_state), maybe should limited XSAVE/FPU sync'ing to
has_protected_state as well?

-Mike

> }
>
> err = avic_init_vcpu(svm);
> --
> 2.43.0
>
>