Re: [PATCH 02/19] x86/fpu: Prepare KVM for dynamically enabled states

From: Paolo Bonzini
Date: Mon Dec 13 2021 - 07:45:59 EST


On 12/13/21 13:00, Thomas Gleixner wrote:
On Mon, Dec 13 2021 at 10:12, Paolo Bonzini wrote:
On 12/8/21 01:03, Yang Zhong wrote:
- user_xfeatures

Track which features are currently enabled for the vCPU

Please rename to alloc_xfeatures

That name makes no sense at all. This has nothing to do with alloc.

Isn't that the features for which space is currently allocated?

fpstate_realloc does

+ if (guest_fpu) {
+ newfps->is_guest = true;
+ newfps->is_confidential = curfps->is_confidential;
+ guest_fpu->user_xfeatures |= xfeatures;
+ }
+

and kvm_check_guest_realloc_fpstate does


+ if ((guest_fpu->user_xfeatures & request) != request) {
+ vcpu->arch.guest_fpu.realloc_request |= request;
+ return true;
+ }

Reading "user_xfeatures" in there is cryptic, it seems like it's something related to the userspace thread or group that has invoked the KVM ioctl. If it's renamed to alloc_xfeatures, then this:

+ missing = request & ~guest_fpu->alloc_xfeatures;
+ if (missing) {
+ vcpu->arch.guest_fpu.realloc_request |= missing;
+ return true;
+ }

makes it obvious that the allocation is for features that are requested but haven't been allocated in the xstate yet.

- user_perm

Copied from guest_perm of the group leader thread. The first
vCPU which does the copy locks the guest_perm

Please rename to perm_xfeatures.

All of that is following the naming conventions in the FPU code related
to permissions etc.

perm or guest_perm is just fine as well; but user_perm is not (there's no preexisting reference to user_perm in the code, in fact, as far as I can see).

- realloc_request

KVM sets this field to request dynamically-enabled features
which require reallocation of @fpstate

This field should be in vcpu->arch, and there is no need for
fpu_guest_realloc_fpstate. Rename __xfd_enable_feature to
fpu_enable_xfd_feature and add it to the public API, then just do

if (unlikely(vcpu->arch.xfd_realloc_request)) {
u64 request = vcpu->arch.xfd_realloc_request;
ret = fpu_enable_xfd(request, enter_guest);
}

to kvm_put_guest_fpu.

Why? Yet another export of FPU internals just because?

It's one function more and one field less. I prefer another export of FPU internals, to a write to a random field with undocumented invariants.

For example, why WARN_ON_ONCE if enter_guest == true? If you enter the guest after the host has restored MSR_IA32_XFD with KVM_SET_MSR, the WARN_ON_ONCE can actually fire. It would be just fine to skip the reallocation until enter_guest == false, which is you actually XSAVE into the guest_fpu.

As an aside, realloc_request (if it survives, see below) shouldn't be added until patch 6.

Also what clears the reallocation request and what is the @enter_guest
argument supposed to help with?

Nothing---make it

if (unlikely(vcpu->arch.xfd_realloc_request)) {
u64 request = vcpu->arch.xfd_realloc_request;
ret = fpu_enable_xfd(request, &vcpu->arch.guest_fpu);
if (!ret)
vcpu->arch.xfd_realloc_request = 0;
}

but in fact, I'm not sure why the request has to be delayed at all. The obvious implementation of a write to XFD, after all the validity checks, is just

/* This function just calls xfd_enable_feature. */
r = fpu_guest_alloc_xfeatures(&vcpu->arch.guest_fpu,
vcpu->arch.xcr0 & ~xfd);
/*
* An error means that userspace has screwed up by not doing
* arch_prctl(ARCH_REQ_XCOMP_GUEST_PERM). If we are here coming
* from a ioctl fail it, if the guest has done WRMSR or XSETBV
* it will inject a #GP.
*/
if (r < 0)
return 1;

vcpu->arch.xfd = xfd;

with none of the kvm_check_guest_realloc_fpstate or KVM_EXIT_FPU_REALLOC business. No field and a simple, self-contained external API. The same code works for __kvm_set_xcr as well, just with "xcr0 & ~vcpu->arch.xfd" as the second argument instead.

(Maybe I'm missing why KVM_EXIT_FPU_REALLOC is needed, but I'm not ashamed to say that, given that new userspace API was added with documentation or tests. The only comment in the code is:

+ /*
+ * Check if fpstate reallocate is required. If yes, then
+ * let the fpu core do reallocation and update xfd;
+ * otherwise, update xfd here.
+ */
+ if (kvm_check_guest_realloc_fpstate(vcpu, data)) {
+ vcpu->run->exit_reason = KVM_EXIT_FPU_REALLOC;
+ vcpu->arch.complete_userspace_io =
+ kvm_skip_emulated_instruction;
+ return KVM_MSR_RET_USERSPACE;
+ }
+

which has nothing to do with the actual content of either kvm_check_guest_realloc_fpstate or the "then" branch. Userspace can just do ARCH_REQ_XCOMP_GUEST_PERM based on the guest CPUID, before KVM_SET_CPUID2, KVM_SET_MSR or KVM_SET_XCR).

Paolo