Re: [PATCH] x86/fpu: verify xstate buffer size according with requested features

From: Dave Hansen
Date: Wed Jan 17 2024 - 18:52:26 EST


On 1/17/24 14:30, Andrei Vagin wrote:
> On Wed, Jan 17, 2024 at 11:34 AM Dave Hansen <dave.hansen@xxxxxxxxx> wrote:
>> Seems like the real problem here is that the fault_in_readable() doesn't
>> match the XRSTOR. It's going to continue to be a problem as long as we
>> don't know what memory XRSTOR tried to access. We can try all day long
>> to precalculate what XRSTOR _will_ do, but that seems a bit silly
>
> I don't understand this part. The behavior of XRSTOR is well-defined
> by CPU specifications, allowing us to easily precalculate the memory it
> will attempt to access. What does it mean "we don't know what memory
> XRSTOR tried to access"?
>
> xrstor restores only features that are set in fx_sw->xfeatures.

XSAVE is a big old pile of fun. Someone is confused here and that
someone might be me. Let me walk through my logic a bit. Maybe you can
point out where I've gone wrong. :)

Take a look at the "Legacy Region of an XSAVE Area" in the SDM.

The x87 state component comprises bytes 23:0 and bytes 159:32.
The SSE state component comprises bytes 31:24 and bytes 415:160.
The XSAVE feature set does not use bytes 511:416; bytes 463:416
are reserved.

and the next section:

The XSAVE header of an XSAVE area comprises the 64 bytes
starting at offset 512 from the area’s base address:
...
Bytes 7:0 of the XSAVE header is a state-component bitmap (see
Section 13.1) called XSTATE_BV.

XRSTOR accesses memory based on XSTATE_BV (and XCOMP_BV which should be
irrelevant here). So we're looking for something at 512 bytes up in the
XSAVE buffer.

Let's have gdb help us out a bit. Where is sw_reserved?

(gdb) print/d &((struct fxregs_state *)0)->sw_reserved
$4 = 464

Where does XRSTOR itself look?

(gdb) print/d &((union fpregs_state *)0)->xsave->header.xfeatures
$5 = 512

(xfeatures aka. XSTATE_BV)

There _was_ a reason once upon a time that the kernel started sticking a
copy of XSTATE_BV in 'sw_reserved'. I just forget what it is at the
moment. It's horribly confusing to it laid out like this, but the SDM
is pretty clear: "the XSAVE feature set does not use bytes 511:416".

"fx_sw" is actually a software-defined and software-only-consumed area
of the XSAVE buffer, thus the '_sw'. Nothing in the '_sw' section tells
us how the hardware will behave.

>> because the CPU knows where the fault happened. It told us in CR2 and
>> all we have to do is plumb that back to fault_in_readable().
>
> I considered this option as well, but then I decided that this approach
> is better. The most important aspect is that it always rejects bad
> buffers, allowing a user space to detect an issue even when a fault
> isn't triggered. I believe proper handling of xrstor page faults could
> be a valuable additional improvement to this change. If we detect a
> fault outside of a provided buffer, we can print a warning to signal
> that check_xstate_in_sigframe is incomplete.

I'm not really following the logic there. What's the downside of taking
the fault?