Re: [PATCH 6/9] KVM: SVM: Add MSR_IA32_XSS to the GHCB for hypervisor kernel

From: John Allen
Date: Tue Feb 20 2024 - 11:34:11 EST


On Tue, Feb 20, 2024 at 08:20:36AM -0800, Sean Christopherson wrote:
> On Thu, Feb 15, 2024, John Allen wrote:
> > On Tue, Nov 07, 2023 at 08:20:52PM +0200, Maxim Levitsky wrote:
> > > On Thu, 2023-11-02 at 16:22 -0700, Sean Christopherson wrote:
> > > > On Thu, Nov 02, 2023, Maxim Levitsky wrote:
> > > > > On Tue, 2023-10-10 at 20:02 +0000, John Allen wrote:
> > > > > > @@ -3032,6 +3037,9 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm)
> > > > > > if (guest_cpuid_has(&svm->vcpu, X86_FEATURE_RDTSCP))
> > > > > > svm_clr_intercept(svm, INTERCEPT_RDTSCP);
> > > > > > }
> > > > > > +
> > > > > > + if (kvm_caps.supported_xss)
> > > > > > + set_msr_interception(vcpu, svm->msrpm, MSR_IA32_XSS, 1, 1);
> > > > >
> > > > > This is not just a virtualization hole. This allows the guest to set MSR_IA32_XSS
> > > > > to whatever value it wants, and thus it might allow XSAVES to access some host msrs
> > > > > that guest must not be able to access.
> > > > >
> > > > > AMD might not yet have such msrs, but on Intel side I do see various components
> > > > > like 'HDC State', 'HWP state' and such.
> > > >
> > > > The approach AMD has taken with SEV-ES+ is to have ucode context switch everything
> > > > that the guest can access. So, in theory, if/when AMD adds more XCR0/XSS-based
> > > > features, that state will also be context switched.
> > > >
> > > > Don't get me wrong, I hate this with a passion, but it's not *quite* fatally unsafe,
> > > > just horrific.
> > > >
> > > > > I understand that this is needed so that #VC handler could read this msr, and
> > > > > trying to read it will cause another #VC which is probably not allowed (I
> > > > > don't know this detail of SEV-ES)
> > > > >
> > > > > I guess #VC handler should instead use a kernel cached value of this msr
> > > > > instead, or at least KVM should only allow reads and not writes to it.
> > > >
> > > > Nope, doesn't work. In addition to automatically context switching state, SEV-ES
> > > > also encrypts the guest state, i.e. KVM *can't* correctly virtualize XSS (or XCR0)
> > > > for the guest, because KVM *can't* load the guest's desired value into hardware.
> > > >
> > > > The guest can do #VMGEXIT (a.k.a. VMMCALL) all it wants to request a certain XSS
> > > > or XCR0, and there's not a damn thing KVM can do to service the request.
> > > >
> > >
> > > Ah, I understand now. Everything makes sense, and yes, this is really ugly.
> >
> > Hi Maxim and Sean,
> >
> > It looks as though there are some broad changes that will need to happen
> > over the long term WRT to SEV-ES/SEV-SNP. In the short term, how would
> > you suggest I proceed with the SVM shstk series? Can we omit the SEV-ES
> > changes for now with an additional patch that disallows guest shstk when
> > SEV-ES is enabled? Subsequently, when we have a proper solution for the
> > concerns discussed here, we could submit another series for SEV-ES
> > support.
>
> The SEV-ES mess was already addressed by commit a26b7cd22546 ("KVM: SEV: Do not
> intercept accesses to MSR_IA32_XSS for SEV-ES guests"). Or is there more that's
> needed for shadow stacks?

Ah, yes, you are right. That patch should address the controversial
change discussed above at least. Patch 5/9 and 7/9 of this series also
address different SEV-ES issues and will still need to included.

Thanks,
John