Re: [PATCH v6 19/25] KVM: VMX: Emulate read and write to CET MSRs

From: Maxim Levitsky
Date: Tue Nov 07 2023 - 13:13:07 EST


On Thu, 2023-11-02 at 16:58 -0700, Sean Christopherson wrote:
> On Thu, Nov 02, 2023, Maxim Levitsky wrote:
> > On Wed, 2023-11-01 at 09:31 -0700, Sean Christopherson wrote:
> > > On Tue, Oct 31, 2023, Maxim Levitsky wrote:
> > > > On Thu, 2023-09-14 at 02:33 -0400, Yang Weijiang wrote:
> > > > > Add emulation interface for CET MSR access. The emulation code is split
> > > > > into common part and vendor specific part. The former does common check
> > > > > for MSRs and reads/writes directly from/to XSAVE-managed MSRs via the
> > > > > helpers while the latter accesses the MSRs linked to VMCS fields.
> > > > >
> > > > > Signed-off-by: Yang Weijiang <weijiang.yang@xxxxxxxxx>
> > > > > ---
> > >
> > > ...
> > >
> > > > > + case MSR_IA32_PL0_SSP ... MSR_IA32_PL3_SSP:
> > > > > + case MSR_KVM_SSP:
> > > > > + if (host_msr_reset && kvm_cpu_cap_has(X86_FEATURE_SHSTK))
> > > > > + break;
> > > > > + if (!guest_can_use(vcpu, X86_FEATURE_SHSTK))
> > > > > + return 1;
> > > > > + if (index == MSR_KVM_SSP && !host_initiated)
> > > > > + return 1;
> > > > > + if (is_noncanonical_address(data, vcpu))
> > > > > + return 1;
> > > > > + if (index != MSR_IA32_INT_SSP_TAB && !IS_ALIGNED(data, 4))
> > > > > + return 1;
> > > > > + break;
> > > > Once again I'll prefer to have an ioctl for setting/getting SSP, this will
> > > > make the above code simpler (e.g there will be no need to check that write
> > > > comes from the host/etc).
> > >
> > > I don't think an ioctl() would be simpler overall, especially when factoring in
> > > userspace. With a synthetic MSR, we get the following quite cheaply:
> > >
> > > 1. Enumerating support to userspace.
> > > 2. Save/restore of the value, e.g. for live migration.
> > > 3. Vendor hooks for propagating values to/from the VMCS/VMCB.
> > >
> > > For an ioctl(),
> > > #1 would require a capability, #2 (and #1 to some extent) would
> > > require new userspace flows, and #3 would require new kvm_x86_ops hooks.
> > >
> > > The synthetic MSR adds a small amount of messiness, as does bundling
> > > MSR_IA32_INT_SSP_TAB with the other shadow stack MSRs. The bulk of the mess comes
> > > from the need to allow userspace to write '0' when KVM enumerated supported to
> > > userspace.
> >
> > Let me put it this way - all hacks start like that, and in this case this is API/ABI hack
> > so we will have to live with it forever.
>
> Eh, I don't view it as a hack, at least the kind of hack that has a negative
> connotation. KVM effectively has ~240 MSR indices reserved for whatever KVM
> wants.
This is exactly the problem. These indices are reserved for PV features, not
for fake msrs, and my fear is that once we mix it up, it will be a mess.

If that was not API/ABI, I wouldn't complain, but since this is API/ABI, I'm afraid
to make a mistake and then be sorry.


> The only weird thing about this one is that it's not accessible from the
> guest. Which I agree is quite weird, but from a code perspective I think it
> works quite well.
>
> > Once there is a precedent, trust me there will be 10s of new 'fake' msrs added, and the
> > interface will become one big mess.
>
> That suggests MSRs aren't already one big mess. :-) I'm somewhat joking, but also
> somewhat serious. I really don't think that adding one oddball synthetic MSR is
> going to meaningfully move the needle on the messiness of MSRs.
>
> Hmm, there probably is a valid slippery slope argument though. As you say, at
> some point, enough state will get shoved into hardware that KVM will need an ever
> growing number of synthetic MSRs to keep pace.

Yes, exactly what I mean - Honestly though I don't expect many new x86 registers/states
that are not msrs, but we don't know what x86 designers will do next,
and APIs are something that can't be fixed later.

>
> > As I suggested, if you don't want to add new capability/ioctl and vendor
> > callback per new x86 arch register, then let's implement
> > KVM_GET_ONE_REG/KVM_SET_ONE_REG and then it will be really easy to add new
> > regs without confusing users, and without polluting msr namespace with msrs
> > that don't exist.
>
> I definitely don't hate the idea of KVM_{G,S}ET_ONE_REG, what I don't want is to
> have an entirely separate path in KVM for handling the actual get/set.
>
> What if we combine the approaches? Add KVM_{G,S}ET_ONE_REG support so that the
> uAPI can use completely arbitrary register indices without having to worry about
> polluting the MSR space and making MSR_KVM_SSP ABI.
Sounds like a reasonable idea but might be overkill.

>
> Ooh, if we're clever, I bet we can extend KVM_{G,S}ET_ONE_REG to also work with
> existing MSRs, GPRs, and other stuff,

Not sure if we want to make it work with MSRs. MSRs are a very well defined thing
on x86, and we already have an API to read/write them. Other registers maybe,
don't know.

> i.e. not force userspace through the funky
> KVM_SET_MSRS just to set one reg, and not force a RMW of all GPRs just to set
> RIP or something.
Setting one GPR like RIP does sound like a valid use case of KVM_SET_ONE_REG.

> E.g. use bits 39:32 of the id to encode the register class,
> bits 31:0 to hold the index within a class, and reserve bits 63:40 for future
> usage.
>
> Then for KVM-defined registers, we can route them internally as needed, e.g. we
> can still define MSR_KVM_SSP so that internal it's treated like an MSR, but its
> index isn't ABI and so can be changed at will. And future KVM-defined registers
> wouldn't _need_ to be treated like MSRs, i.e. we could route registers through
> the MSR APIs if and only if it makes sense to do so.

I am not sure that even internally I'll treat MSR_KVM_SSP as MSR.
An MSR IMHO is a msr, a register is a register, mixing this up will
just add to the confusion.


Honestly if I were to add support for the SSP register, I'll just add a new
ioctl/capability and vendor callback. All of this code is just harmless
boilerplate code.
Even using KVM_GET_ONE_REG/KVM_SET_ONE_REG is probably overkill, although using
it for new registers is reasonable.

At the end I am not going to argue much about this - I just voiced my option that currently
MSR read/write interface is pure in regard that it only works on either real msrs or
at least PV msrs that the guest can read/write.

All other guest's state is set via separate ioctls/callbacks/etc, and thus it's more consistent
from API POV to add SSP here.


>
> Side topic, why on earth is the data value of kvm_one_reg "addr"?

I don't know, probably something ARM related.

>


Best regards,
Maxim Levitsky