Re: [RESEND PATCH v6 1/2] x86/fpu: Allow PKRU to be (once again) written by ptrace.

From: Sean Christopherson
Date: Fri Nov 11 2022 - 12:00:51 EST


On Thu, Nov 10, 2022, Dave Hansen wrote:
> On 11/10/22 16:03, Kyle Huey wrote:
> > On Tue, Nov 8, 2022 at 10:23 AM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
> BTW, I'd love to know if KVM *REALLY* depends on this.

Unlikely, but nearly impossible to know for sure. Copy+pasting my response[1] to
an earlier version.

: Hrm, the current behavior has been KVM ABI for a very long time.
:
: It's definitely odd because all other components will be initialized due to their
: bits being cleared in the header during kvm_load_guest_fpu(), and it probably
: wouldn't cause problems in practice as most VMMs likely do "all or nothing" loads.
: But, in theory, userspace could save/restore a subset of guest XSTATE and rely on
: the kernel not overwriting guest PKRU when its bit is cleared in the header.
:
: All that said, I don't see any reason to force KVM to change at this time, it's
: trivial enough to handle KVM's oddities while providing sane behavior for others.
: Nullify the pointer in the guest path and then update copy_uabi_to_xstate() to
: play nice with a NULL pointer, e.g.
:
: /*
: * Nullify @vpkru to preserve its current value if PKRU's bit isn't set
: * in the header. KVM's odd ABI is to leave PKRU untouched in this
: * case (all other components are eventually re-initialized).
: */
: if (!(kstate->regs.xsave.header.xfeatures & XFEATURE_MASK_PKRU))
: vpkru = NULL;
:
: return copy_uabi_from_kernel_to_xstate(kstate, ustate, vpkru);

> It'd be nice to kill if not.

I don't disagree, my hesitation is purely that doing so might subtly break
userspace.

That said, I'm 99.9% certain no traditional VMM, e.g. QEMU, is relying on this
behavior, as doing KVM_SET_XSAVE with anything except the guest's xfeatures mask
would corrupt guest XSAVE state for everything except PKRU. I.e. for all intents
and purposes, a traditional VMM must do KVM_GET_SAVE => KVM_SET_XSAVE without
touching the xfeatures mask.

And for non-traditional usage of KVM, I would be quite surprised if any of those
use cases utilize PKRU in the guest, let alone play games with KVM_{G,S}SET_XSAVE.

So, I'm not completely opposed to "fixing" KVM's ABI, but it should be done as a
separate patch that is tagged "KVM: x86:" and clearly states that it's changing
KVM's ABI in a way that could theoretically break userspace.

> >> Would something like this be more clear?
> >>
> >> if (hdr.xfeatures & XFEATURE_MASK_PKRU) {
> >> struct pkru_state *xpkru;
> >>
> >> xpkru = __raw_xsave_addr(xsave, XFEATURE_PKRU);
> >> *pkru = xpkru->pkru;
> >> } else {
> >> /*
> >> * KVM may pass a NULL 'pkru' to indicate
> >> * that it does not need PKRU updated.
> >> */
> >> if (pkru)
> >> *pkru = 0;
> >> }
> >
> > Yeah, Sean Christopherson suggested this (with the else and if
> > collapsed into a single level) when I submitted this previously.
>
> I generally agree with Sean, but he's also been guilty of an atrocity or
> two over the years. :)

Heh, just one or two? I'll call that a win.

> While I generally like low levels of indentation I also think my version is
> much more clear in this case.

I've no objection to a standalone if. My suggestion[2] was in response to code
that zeroed @pkru before the XFEATURE_MASK_PKRU check.

if (pkru)
*pkru = 0;

if (hdr.xfeatures & XFEATURE_MASK_PKRU) {
struct pkru_state *xpkru;
xpkru = __raw_xsave_addr(xsave, XFEATURE_PKRU);
*pkru = xpkru->pkru;
}

[1] https://lore.kernel.org/all/Yv6szXuKGv75wWmm@xxxxxxxxxx
[2] https://lore.kernel.org/all/YxDP6jie4cwzZIHp@xxxxxxxxxx