Re: [PATCH v3 13/21] KVM:VMX: Emulate reads and writes to CET MSRs

From: Sean Christopherson
Date: Mon Jun 26 2023 - 17:15:24 EST


On Mon, Jun 26, 2023, Weijiang Yang wrote:
>
> On 6/24/2023 7:53 AM, Sean Christopherson wrote:
> > On Thu, May 11, 2023, Yang Weijiang wrote:
> > Side topic, what on earth does the SDM mean by this?!?
> >
> > The linear address written must be aligned to 8 bytes and bits 2:0 must be 0
> > (hardware requires bits 1:0 to be 0).
> >
> > I know Intel retroactively changed the alignment requirements, but the above
> > is nonsensical. If ucode prevents writing bits 2:0, who cares what hardware
> > requires?
>
> So do I ;-/

Can you follow up with someone to get clarification? If writing bit 2 with '1'
does not #GP despite the statement that it "must be aligned", then KVM shouldn't
inject a #GP in that case.
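
To make the ambiguity concrete, here is a standalone sketch of the two possible
checks. The helper names are made up for illustration and are not from the
patch; which of the two the CPU actually enforces is exactly the open question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: the documented rule, bits 2:0 must be 0 (8-byte aligned). */
static inline bool ssp_aligned_8(uint64_t data)
{
	return (data & 0x7) == 0;
}

/* Hypothetical helper: the parenthetical rule, only bits 1:0 must be 0. */
static inline bool ssp_aligned_4(uint64_t data)
{
	return (data & 0x3) == 0;
}
```

The two checks disagree only for values with bit 2 set and bits 1:0 clear,
e.g. 0x1004, so that is the one case worth testing on real hardware.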

> > > + return 1;
> > > + kvm_set_xsave_msr(msr_info);
> > > + break;
> > > case MSR_IA32_PERF_CAPABILITIES:
> > > if (data && !vcpu_to_pmu(vcpu)->version)
> > > return 1;
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index b6eec9143129..2e3a39c9297c 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -13630,6 +13630,26 @@ int kvm_sev_es_string_io(struct kvm_vcpu *vcpu, unsigned int size,
> > > }
> > > EXPORT_SYMBOL_GPL(kvm_sev_es_string_io);
> > > +bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr)
> > > +{
> > > + if (!kvm_cet_user_supported())
> > This feels wrong. KVM should differentiate between SHSTK and IBT in the host.
> > E.g. if running in a VM with SHSTK but not IBT, or vice versa, KVM should allow
> > writes to non-existent MSRs.
>
> I don't follow you, in this case, which part KVM is on behalf of? guest or
> user space?

Sorry, typo. KVM *shouldn't* allow writes to non-existent MSRs.

> > I.e. this looks wrong:
> >
> > /*
> > * If SHSTK and IBT are available in KVM, clear CET user bit in
> > * kvm_caps.supported_xss so that kvm_cet_user_supported() returns
> > * false when called.
> > */
> > if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK) &&
> > !kvm_cpu_cap_has(X86_FEATURE_IBT))
> > kvm_caps.supported_xss &= ~XFEATURE_MASK_CET_USER;
>
> The comment is wrong, it should be "are not available in KVM". My intent is,
> if both features are not available in KVM, then clear the precondition bit so
> that all dependent checks will fail quickly.

Checking kvm_caps.supported_xss.CET_USER is worthless in 99% of the cases though.
Unless I'm missing something, the only time it's useful is for CR4.CET, which
doesn't differentiate between SHSTK and IBT. For everything else that KVM cares
about, at some point KVM needs to precisely check for SHSTK and IBT support
anyways.

> > and by extension, all dependent code is also wrong. IIRC, there's a virtualization
> > hole, but I don't see any reason why KVM has to make the hole even bigger.
>
> Do you mean the issue that both SHSTK and IBT share one control MSR? i.e.,
> U_CET/S_CET?

I mean that passing through PLx_SSP if the host has IBT but *not* SHSTK is wrong.

> > > + return false;
> > > +
> > > + if (msr->host_initiated)
> > > + return true;
> > > +
> > > + if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
> > > + !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
> > > + return false;
> > > +
> > > + if (msr->index == MSR_IA32_PL3_SSP &&
> > > + !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
> > I probably asked this long ago, but if I did I since forgot. Is it really just
> > PL3_SSP that depends on SHSTK? I would expect all shadow stack MSRs to depend
> > on SHSTK.
>
> All PL{0,1,2,3}_SSP plus INT_SSP_TAB msr depend on SHSTK. In patch 21, I
> added more MSRs in this helper.

Sure, except that patch 21 never adds handling for PL{0,1,2}_SSP. I see:

	if (!kvm_cet_user_supported() &&
	    !(kvm_cpu_cap_has(X86_FEATURE_IBT) ||
	      kvm_cpu_cap_has(X86_FEATURE_SHSTK)))
		return false;

	if (msr->host_initiated)
		return true;

	if (!guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
	    !guest_cpuid_has(vcpu, X86_FEATURE_IBT))
		return false;

	/* The synthetic MSR is for userspace access only. */
	if (msr->index == MSR_KVM_GUEST_SSP)
		return false;

	if (msr->index == MSR_IA32_U_CET)
		return true;

	if (msr->index == MSR_IA32_S_CET)
		return guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
		       kvm_cet_kernel_shstk_supported();

	if (msr->index == MSR_IA32_INT_SSP_TAB)
		return guest_cpuid_has(vcpu, X86_FEATURE_SHSTK) &&
		       kvm_cet_kernel_shstk_supported();

	if (msr->index == MSR_IA32_PL3_SSP &&
	    !guest_cpuid_has(vcpu, X86_FEATURE_SHSTK))
		return false;

	mask = (msr->index == MSR_IA32_PL3_SSP) ? XFEATURE_MASK_CET_USER :
						  XFEATURE_MASK_CET_KERNEL;
	return !!(kvm_caps.supported_xss & mask);

Which means that KVM will allow guest accesses to PL{0,1,2}_SSP regardless of
whether or not X86_FEATURE_SHSTK is enumerated to the guest.

And the above is also wrong for host_initiated writes to SHSTK MSRs. E.g. if KVM
is running on a CPU that has IBT but not SHSTK, then userspace can write to MSRs
that do not exist.

Maybe this confusion is just a symptom of the series not providing proper
Supervisor Shadow Stack support, but that's still a poor excuse for posting
broken code.

I suspect you tried to get too fancy. I don't see any reason to ever care about
kvm_caps.supported_xss beyond emulating writes to XSS itself. Just require that
both CET_USER and CET_KERNEL are supported in XSS to allow IBT or SHSTK, i.e. let
X86_FEATURE_IBT and X86_FEATURE_SHSTK speak for themselves. That way, this can
simply be:

bool kvm_cet_is_msr_accessible(struct kvm_vcpu *vcpu, struct msr_data *msr)
{
	if (is_shadow_stack_msr(...)) {
		if (!kvm_cpu_cap_has(X86_FEATURE_SHSTK))
			return false;

		return msr->host_initiated ||
		       guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
	}

	if (!kvm_cpu_cap_has(X86_FEATURE_IBT) &&
	    !kvm_cpu_cap_has(X86_FEATURE_SHSTK))
		return false;

	return msr->host_initiated ||
	       guest_cpuid_has(vcpu, X86_FEATURE_IBT) ||
	       guest_cpuid_has(vcpu, X86_FEATURE_SHSTK);
}
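
For anyone who wants to poke at the truth table outside the kernel, below is a
rough userspace model of the logic above. Everything in it is an assumption
made for illustration: struct cet_model stands in for kvm_cpu_cap_has() and
guest_cpuid_has(), cet_msr_accessible() is not a KVM function, and
is_shadow_stack_msr() is one possible definition that covers PL{0,1,2,3}_SSP
and INT_SSP_TAB (it deliberately ignores the synthetic MSR_KVM_GUEST_SSP). The
MSR indices are the architectural ones from the SDM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural CET MSR indices (per the SDM). */
#define MSR_IA32_U_CET		0x000006a0
#define MSR_IA32_S_CET		0x000006a2
#define MSR_IA32_PL0_SSP	0x000006a4
#define MSR_IA32_PL3_SSP	0x000006a7
#define MSR_IA32_INT_SSP_TAB	0x000006a8

/* Hypothetical stand-in for kvm_cpu_cap_has() and guest_cpuid_has(). */
struct cet_model {
	bool host_shstk, host_ibt;	/* what KVM supports */
	bool guest_shstk, guest_ibt;	/* what is enumerated to the guest */
};

/* Assumed definition: PL0_SSP..INT_SSP_TAB are contiguous (0x6a4..0x6a8). */
static inline bool is_shadow_stack_msr(uint32_t index)
{
	return index >= MSR_IA32_PL0_SSP && index <= MSR_IA32_INT_SSP_TAB;
}

static inline bool cet_msr_accessible(const struct cet_model *m,
				      uint32_t index, bool host_initiated)
{
	if (is_shadow_stack_msr(index)) {
		/* Shadow stack MSRs do not exist without SHSTK, even for userspace. */
		if (!m->host_shstk)
			return false;

		return host_initiated || m->guest_shstk;
	}

	/* U_CET and S_CET exist if either feature is supported. */
	if (!m->host_ibt && !m->host_shstk)
		return false;

	return host_initiated || m->guest_ibt || m->guest_shstk;
}
```

With host_ibt set and host_shstk clear, a host_initiated access to PL3_SSP is
rejected in this model, which is the case the current patch gets wrong.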

> > > + * and reload the guest fpu states before read/write xsaves-managed MSRs.
> > > + */
> > > +static inline void kvm_get_xsave_msr(struct msr_data *msr_info)
> > > +{
> > > + fpregs_lock_and_load();
> > KVM already has helpers that do exactly this, and they have far better names for
> > > KVM: kvm_fpu_get() and kvm_fpu_put(). Can you convert kvm_fpu_get() to
> > > fpregs_lock_and_load() and use those instead? And if the extra consistency checks
> > in fpregs_lock_and_load() fire, we definitely want to know, as it means we probably
> > have bugs in KVM.
>
> Do you want me to do some experiments to make sure the WARN() in
> fpregs_lock_and_load() would be triggered or not?

Yes, though I shouldn't have to clarify that. The well-documented (as of now)
expectation is that any code that someone posts is tested, unless explicitly
stated otherwise. I.e. you should not have to ask if you should verify the WARN
doesn't trigger, because you should be doing that for all code you post.

> If no WARN() trigger, then replace fpregs_lock_and_load()/fpregs_unlock()
> with kvm_fpu_get()/kvm_fpu_put()?

Yes.