Re: [PATCH v6 18/25] KVM: x86: Use KVM-governed feature framework to track "SHSTK/IBT enabled"

From: Maxim Levitsky
Date: Tue Nov 07 2023 - 13:07:03 EST


On Fri, 2023-11-03 at 17:07 -0700, Sean Christopherson wrote:
> On Thu, Nov 02, 2023, Maxim Levitsky wrote:
> > On Wed, 2023-11-01 at 08:46 -0700, Sean Christopherson wrote:
> > > On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> > > > On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> > > > > Use the governed feature framework to track whether X86_FEATURE_SHSTK
> > > > > and X86_FEATURE_IBT features can be used by userspace and guest, i.e.,
> > > > > the features can be used iff both KVM and guest CPUID can support them.
> > > > PS: IMHO The whole 'governed feature framework' is very confusing and
> > > > somewhat poorly documented.
> > > >
> > > > Currently the only partial explanation of it, is at 'governed_features',
> > > > which doesn't explain how to use it.
> > >
> > > To be honest, terrible name aside, I thought kvm_governed_feature_check_and_set()
> > > would be fairly self-explanatory, at least relative to all the other CPUID handling
> > > in KVM.
> >
> > What is not self-explanatory is what are the governed_feature and how to query them.
>
> ...
>
> > > > However thinking again about the whole thing:
> > > >
> > > > IMHO the 'governed features' is another quite confusing term that a KVM
> > > > developer will need to learn and keep in memory.
> > >
> > > I 100% agree, but I explicitly called out the terrible name in the v1 and v2
> > > cover letters[1][2], and the patches were on the list for 6 months before I
> > > applied them. I'm definitely still open to a better name, but I'm also not
> > > exactly chomping at the bit to get behind the bikehsed.
> >
> > Honestly I don't know if I can come up with a better name either. Name is
> > IMHO not the underlying problem, its the feature itself that is confusing.
>
> ...
>
> > > Yes and no. For "governed features", probably not. But for CPUID as a whole, there
> > > are legimiate cases where userspace needs to enumerate things that aren't officially
> > > "supported" by KVM. E.g. topology, core crystal frequency (CPUID 0x15), defeatures
> > > that KVM hasn't yet learned about, features that don't have virtualization controls
> > > and KVM hasn't yet learned about, etc. And for things like Xen and Hyper-V paravirt
> > > features, it's very doable to implement features that are enumerate by CPUID fully
> > > in userspace, e.g. using MSR filters.
> > >
> > > But again, it's a moot point because KVM has (mostly) allowed userspace to fully
> > > control guest CPUID for a very long time.
> > >
> > > > Such a feature which is advertised as supported but not really working is a
> > > > recipe of hard to find guest bugs IMHO.
> > > >
> > > > IMHO it would be much better to just check this condition and do
> > > > kvm_vm_bugged() or something in case when a feature is enabled in the guest
> > > > CPUID but KVM can't support it, and then just use guest CPUID in
> > > > 'guest_can_use()'.
> >
> > OK, I won't argue that much over this, however I still think that there are
> > better ways to deal with it.
> >
> > If we put optimizations aside (all of this can surely be optimized such as to
> > have very little overhead)
> >
> > How about we have 2 cpuids: Guest visible CPUID which KVM will never use directly
> > other than during initialization and effective cpuid which is roughly
> > what governed features are, but will include all features and will be initialized
> > roughly like governed features are initialized:
> >
> > effective_cpuid = guest_cpuid & kvm_supported_cpuid
> >
> > Except for some forced overrides like for XSAVES and such.
> >
> > Then we won't need to maintain a list of governed features, and guest_can_use()
> > for all features will just return the effective cpuid leafs.
> >
> > In other words, I want KVM to turn all known CPUID features to governed features,
> > and then remove all the mentions of governed features except 'guest_can_use'
> > which is a good API.
> >
> > Such proposal will use a bit more memory but will make it easier for future
> > KVM developers to understand the code and have less chance of introducing bugs.
>
> Hmm, two _full_ CPUID arrays would be a mess and completely unnecessary. E.g.
> we'd have to sort out Hyper-V and KVM PV, which both have their own caches. And
> a duplicate entry for things like F/M/S would be ridiculous.
>
> But maintaining a per-vCPU version of the CPU caps is definitely doable. I.e. a
> vCPU equivalent to kvm_cpu_caps and the per-CPU capabilities. There are currently
> 25 leafs that are tracked by kvm_cpu_caps, so relative to "governed" features,
> the cost will be 96 bytes per vCPU. I agree that 96 bytes is worth eating, we've
> certainly taken on more for a lot, lot less.
>
> It's a lot of churn, and there are some subtle nasties, e.g. MWAIT and other
> CPUID bits that changed based on MSRs or CR4, but most of the churn is superficial
> and the result is waaaaay less ugly than governed features and for the majority of
> features will Just Work.
>
> I'll get a series posted next week (need to write changelogs and do a _lot_ more
> testing). If you want to take a peek at where I'm headed before then:
>
> https://github.com/sean-jc/linux x86/guest_cpufeatures

This looks very good, looking forward to see the patches on the mailing list.

Best regards,
Maxim Levitsky

>