Re: [PATCH] fpu: xstate: Keep xfd_state always in-sync with IA32_XFD MSR

From: Thomas Gleixner
Date: Thu May 11 2023 - 15:12:51 EST


On Thu, May 11 2023 at 15:28, Adamos Ttofari wrote:

> Commit 672365477ae8 ("x86/fpu: Update XFD state where required") and
> commit 8bf26758ca96 ("x86/fpu: Add XFD state to fpstate") introduced a
> per_cpu variable xfd_state to keep the IA32_XFD MSR value cached, in
> order to avoid unnecessary writes to the MSR.
>
> xfd_state might not be always synced with the MSR. Eventually affecting
> MSR writes. xfd_state is initialized with 0, meanwhile the MSR is
> initialized with the XFEATURE_MASK_USER_DYNAMIC to make XFD fire. Then
> later on reschedule to a different CPU, when a process that uses extended
> xfeatures and handled the #NM (by allocating the additional space in task's
> fpstate for extended xfeatures) it will skip the MSR update in
> restore_fpregs_from_fpstate because the value might match to already cached
> xfd_state (meanwhile it is not the same with the MSR). Eventually calling a
> XRSTOR to set the new state (that carries extended xfeatures) and fire a #NM
> from kernel context. The XFD is expected to fire from user-space context,
> but not in this case and the kernel crashes.

I'm completely confused.

So after reading the patch I think I know what you are trying to
explain:

On CPU hotplug MSR_IA32_XFD is reset to the init_fpstate.xfd, which
wipes out any stale state. But the per CPU cached xfd value is not
reset, which brings them out of sync.

As a consequence, a subsequent xfd_update_state() might skip the MSR
write because the cached value already matches, which in turn can result
in XRSTOR raising a #NM in kernel space, which crashes the kernel.

Right?
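
To make the suspected sequence concrete, here is a minimal user-space
sketch of it. Everything in it is a hypothetical model and not kernel
code: struct model_cpu, the model_*() helpers and the choice of bit 18
as the armed feature bit are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for one CPU; none of these are kernel symbols. */
struct model_cpu {
	uint64_t msr_xfd;	/* models the IA32_XFD MSR */
	uint64_t cached_xfd;	/* models the per CPU xfd_state cache */
};

/* Models xfd_update_state(): touch the MSR only if the cache differs. */
void model_update_state(struct model_cpu *cpu, uint64_t task_xfd)
{
	if (cpu->cached_xfd != task_xfd) {
		cpu->msr_xfd = task_xfd;
		cpu->cached_xfd = task_xfd;
	}
}

/* Models CPU bringup: reset the MSR, and optionally the cache too. */
void model_cpu_init(struct model_cpu *cpu, uint64_t init_xfd, bool reset_cache)
{
	cpu->msr_xfd = init_xfd;
	if (reset_cache)
		cpu->cached_xfd = init_xfd;
}

/* Returns true if the MSR matches the task's xfd after hotplug + schedule in. */
bool model_hotplug_in_sync(bool reset_cache)
{
	const uint64_t init_xfd = 1ULL << 18;	/* arbitrary armed feature bit */
	const uint64_t task_xfd = 0;		/* task has the feature enabled */
	struct model_cpu cpu = { .msr_xfd = task_xfd, .cached_xfd = task_xfd };

	/* The scenario only makes sense if bringup changes the MSR value. */
	assert(init_xfd != task_xfd);

	model_cpu_init(&cpu, init_xfd, reset_cache);	/* offline/online cycle */
	model_update_state(&cpu, task_xfd);		/* task is scheduled in */

	return cpu.msr_xfd == task_xfd;
}
```

In this model, model_hotplug_in_sync(false) reports a stale MSR because
the update is skipped on the cache hit, while model_hotplug_in_sync(true)
ends up in sync.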

> To address the issue mentioned initialize xfd_state with the current MSR
> value and update the XFD MSR always with xfd_update_state to avoid
> un-sync cases.
>
> Fixes: 672365477ae8 ("x86/fpu: Update XFD state where required")
>
> Signed-off-by: Adamos Ttofari <attofari@xxxxxxxxx>
> ---
> arch/x86/kernel/fpu/xstate.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index 0bab497c9436..36ed27ac0ecd 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -179,8 +179,14 @@ void fpu__init_cpu_xstate(void)
>  	 * key as that does not work on the boot CPU. This also ensures
>  	 * that any stale state is wiped out from XFD.
>  	 */
> -	if (cpu_feature_enabled(X86_FEATURE_XFD))
> -		wrmsrl(MSR_IA32_XFD, init_fpstate.xfd);
> +	if (cpu_feature_enabled(X86_FEATURE_XFD)) {
> +		u64 xfd;
> +
> +		rdmsrl(MSR_IA32_XFD, xfd);
> +		__this_cpu_write(xfd_state, xfd);
> +
> +		xfd_update_state(&init_fpstate);
> +	}

This does not compile on 32-bit. You want something like the uncompiled
patch below.

>  	/*
>  	 * XCR_XFEATURE_ENABLED_MASK (aka. XCR0) sets user features
> @@ -915,7 +921,7 @@ void fpu__resume_cpu(void)
>  	}
>
>  	if (fpu_state_size_dynamic())
> -		wrmsrl(MSR_IA32_XFD, current->thread.fpu.fpstate->xfd);
> +		xfd_update_state(&init_fpstate);

On suspend per CPU xfd_state == current->thread.fpu.fpstate->xfd so it's
correct to restore the exact state which was active _before_ suspend.
xfd_state can't be out of sync in that case, no?
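
A rough user-space model of that argument, for illustration only. The
struct and the model_*() helpers are hypothetical, and treating the MSR
content as lost across suspend is an assumption of the model rather than
something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for one CPU; none of these are kernel symbols. */
struct model_cpu {
	uint64_t msr_xfd;	/* models the IA32_XFD MSR */
	uint64_t cached_xfd;	/* models the per CPU xfd_state cache */
};

/* Assumption: the MSR content is lost, the cache in memory survives. */
void model_suspend(struct model_cpu *cpu)
{
	cpu->msr_xfd = 0xdeadbeefULL;	/* arbitrary garbage value */
}

/* Models the existing resume path: unconditional write of current's xfd. */
void model_resume(struct model_cpu *cpu, uint64_t current_xfd)
{
	cpu->msr_xfd = current_xfd;
}

/* Returns true if MSR and cache agree again after a suspend/resume cycle. */
bool model_resume_in_sync(uint64_t current_xfd)
{
	/* Before suspend the cache equals current's xfd, as argued above. */
	struct model_cpu cpu = { .msr_xfd = current_xfd, .cached_xfd = current_xfd };

	model_suspend(&cpu);
	assert(cpu.cached_xfd == current_xfd);	/* cache is untouched */
	model_resume(&cpu, current_xfd);

	return cpu.msr_xfd == cpu.cached_xfd;
}
```

Under these assumptions the direct write brings the MSR back to the
value the cache already holds, for any value of current's xfd.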

Thanks,

tglx
---
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 0bab497c9436..70785a722759 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -177,10 +177,11 @@ void fpu__init_cpu_xstate(void)
 	 * Must happen after CR4 setup and before xsetbv() to allow KVM
 	 * lazy passthrough. Write independent of the dynamic state static
 	 * key as that does not work on the boot CPU. This also ensures
-	 * that any stale state is wiped out from XFD.
+	 * that any stale state is wiped out from XFD. Reset the per CPU
+	 * xfd cache too.
 	 */
 	if (cpu_feature_enabled(X86_FEATURE_XFD))
-		wrmsrl(MSR_IA32_XFD, init_fpstate.xfd);
+		xfd_reset_state();
 
 	/*
 	 * XCR_XFEATURE_ENABLED_MASK (aka. XCR0) sets user features
diff --git a/arch/x86/kernel/fpu/xstate.h b/arch/x86/kernel/fpu/xstate.h
index a4ecb04d8d64..6cfaf72228f4 100644
--- a/arch/x86/kernel/fpu/xstate.h
+++ b/arch/x86/kernel/fpu/xstate.h
@@ -159,9 +159,16 @@ static inline void xfd_update_state(struct fpstate *fpstate)
 	}
 }
 
+static inline void xfd_reset_state(void)
+{
+	wrmsrl(MSR_IA32_XFD, init_fpstate.xfd);
+	__this_cpu_write(xfd_state, init_fpstate.xfd);
+}
+
 extern int __xfd_enable_feature(u64 which, struct fpu_guest *guest_fpu);
 #else
 static inline void xfd_update_state(struct fpstate *fpstate) { }
+static inline void xfd_reset_state(void) { }
 
 static inline int __xfd_enable_feature(u64 which, struct fpu_guest *guest_fpu) {
 	return -EPERM;