Re: [PATCH] KVM: SEV: Fix handling of EFER_LMA bit when SEV-ES is enabled

From: Paolo Bonzini
Date: Fri Dec 08 2023 - 13:39:36 EST


On Wed, Dec 6, 2023 at 4:28 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> Blech. This is a hack to fix even worse hacks. KVM ignores CR0/CR4/EFER values
> that are set via KVM_SET_SREGS, i.e. KVM is rejecting an EFER value that it will
> never consume, which is ridiculous. And the fact that you're not trying to have
> KVM actually set state further strengthens my assertion that tracking CR0/CR4/EFER
> in KVM is pointless necessary for SEV-ES+ guests[1].

I agree that KVM is not going to consume CR0/CR4/EFER. I disagree that
it's a good idea to have a value of vcpu->arch.efer that is
architecturally impossible (so much so that it would fail vmentry in a
non-SEV-ES guest).

I also agree that changing the source is not particularly useful, but
then changing the destination can be easily done in userspace.

In other words, bugfix or not this can and should be merged as a code
cleanup (though your older "[PATCH 1/2] KVM: SVM: Update EFER software
model on CR0 trap for SEV-ES" is nicer in that it clarifies that
svm->vmcb->save.efer is not used, and that's what I would like to
apply).

> So my very strong preference is to first skip the kvm_is_valid_sregs() check

No, please don't. If you want to add a quirk that, when disabled,
causes all guest state get/set ioctls to fail, go ahead. But invalid
processor state remains invalid, and should be rejected, even when KVM
won't consume it.

> My understanding is that SVM_VMGEXIT_AP_CREATION is going to force KVM to assume
> maximal state anyways since KVM will have no way of verifying what state is actually
> shoved into the VMSA, i.e. emulating INIT is wildly broken[2].

Yes, or alternatively a way to pass CR0/CR4/EFER from the guest should
be included in the VMGEXIT spec.

> Side topic, Peter suspected that KVM _does_ need to let userspace set CR8 since
> that's not captured in the VMSA[3].

Makes sense, and then we would have to apply the 2/2 patch from 2021
as well. But for now I'll leave that aside.

Paolo

> [1] https://lore.kernel.org/all/YJla8vpwqCxqgS8C@xxxxxxxxxx
> [2] https://lore.kernel.org/all/20231016132819.1002933-38-michael.roth@xxxxxxx
> [3] https://lore.kernel.org/all/CAMkAt6oL9tfF5rvP0htbQNDPr50Zk41Q4KP-dM0N+SJ7xmsWvw@xxxxxxxxxxxxxx
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 2c924075f6f1..6fb2b913009e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11620,7 +11620,8 @@ static int __set_sregs_common(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs,
> int idx;
> struct desc_ptr dt;
>
> - if (!kvm_is_valid_sregs(vcpu, sregs))
> + if (!vcpu->arch.guest_state_protected &&
> + !kvm_is_valid_sregs(vcpu, sregs))
> return -EINVAL;
>
> apic_base_msr.data = sregs->apic_base;
>