[PATCH] x86/fpu: Remove dynamic features from xcomp_bv for init_fpstate

From: Dave Hansen
Date: Tue Oct 11 2022 - 18:24:34 EST


From: Yuan Yao <yuan.yao@xxxxxxxxx>

This was found a couple of months ago in a big old AMX
backport. But, it looks to be a problem in mainline too.
Please let me know if this looks OK. I'd also especially
appreciate some testing from folks that have AMX hardware
handy.

Builds and survives a quick boot test on non-AMX hardware.

--

== Background ==

'init_fpstate' is a sort of template for all of the fpstates
that come after it. It is copied when new processes are
execve()'d or XRSTOR'd to get fpregs into a known state.

That means that it represents the *starting* state for a
process's fpstate which includes only the 'default' features.
Dynamic features can, of course, be acquired later, but
processes start with only default_features.

During boot the kernel decides whether all fpstates will be
kept in the compacted or uncompacted format. This choice is
communicated to the hardware via the XCOMP_BV field in all
XSAVE buffers, including 'init_fpstate'.

== Problem ==

But, the existing XCOMP_BV calculation is incorrect. It uses
the set of 'max_features', not the default features.

As a result, when XRSTOR operates on buffers derived from
'init_fpstate', it may attempt to "tickle" dynamic features which
are at offsets for which there is no space allocated in the
fpstate.

== Scope ==

This normally results in a relatively harmless out-of-bounds
memory read. It's harmless because it never gets consumed. But,
if the fpstate is next to some unmapped memory, this "tickle" can
cause a page fault and an oops.

This only causes issues on systems when dynamic features are
available and when an XSAVE buffer is next to uninitialized
memory. In other words, it only affects recent Intel server
CPUs, and in relatively few memory locations.

Those two things are why it took relatively long to catch this.

== Solution ==

Use 'default_features' to establish the init_fpstate
xcomp_bv value. Reset individual fpstate xcomp_bv values
when the rest of the fpstate is reset.

[ dhansen: add reset code from tglx, rewrites
commit message and comments ]

Fixes: 1c253ff2287f ("x86/fpu: Move xstate feature masks to fpu_*_cfg")
Suggested-by: Dave Hansen <dave.hansen@xxxxxxxxx>
Suggested-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Signed-off-by: Yuan Yao <yuan.yao@xxxxxxxxx>
Cc: stable@xxxxxxxxxxxxxxx
---
arch/x86/kernel/fpu/core.c | 3 +++
arch/x86/kernel/fpu/xstate.c | 7 ++++++-
2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 3b28c5b25e12..4d64de34da12 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -526,6 +526,9 @@ static void __fpstate_reset(struct fpstate *fpstate, u64 xfd)
fpstate->xfeatures = fpu_kernel_cfg.default_features;
fpstate->user_xfeatures = fpu_user_cfg.default_features;
fpstate->xfd = xfd;
+
+ /* Ensure that xcomp_bv matches ->xfeatures */
+ xstate_init_xcomp_bv(&fpstate->regs.xsave, fpstate->xfeatures);
}

void fpstate_reset(struct fpu *fpu)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c8340156bfd2..f9f45610c72f 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -360,7 +360,12 @@ static void __init setup_init_fpu_buf(void)

print_xstate_features();

- xstate_init_xcomp_bv(&init_fpstate.regs.xsave, fpu_kernel_cfg.max_features);
+ /*
+ * 'init_fpstate' is sized for the default feature
+ * set without any of the dynamic features.
+ */
+ xstate_init_xcomp_bv(&init_fpstate.regs.xsave,
+ fpu_kernel_cfg.default_features);

/*
* Init all the features state with header.xfeatures being 0x0
--
2.34.1