[PATCH v2] x86/fpu: set X86_FEATURE_OSXSAVE feature after enabling OSXSAVE in CR4

From: Feng Tang
Date: Thu Aug 24 2023 - 02:43:28 EST


0Day found a 34.6% regression in stress-ng's 'af-alg' test case [1], and
bisected it to commit b81fac906a8f ("x86/fpu: Move FPU initialization
into arch_cpu_finalize_init()"), which optimizes the FPU init order and
moves the enabling of CR4.OSXSAVE to a later point:

   arch_cpu_finalize_init
       identify_boot_cpu
           identify_cpu
               generic_identify
                   get_cpu_cap --> setup cpu capability
       ...
       fpu__init_cpu
           fpu__init_cpu_xstate
               cr4_set_bits(X86_CR4_OSXSAVE);

The 'X86_FEATURE_OSXSAVE' feature bit maps to bit 27 of ECX in the output
of cpuid(0x00000001), which reads as '1' only once CR4.OSXSAVE is set. In
the call sequence above, CR4.OSXSAVE is set after the CPU capability setup,
so the 'X86_FEATURE_OSXSAVE' feature bit is never set.
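
For reference, the CPUID bit can be inspected from user space. This is
only a minimal sketch, assuming GCC or Clang with <cpuid.h> on x86; the
helper names ecx_has_osxsave() and read_osxsave() are hypothetical and
are not kernel interfaces:

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=1):ECX bit 27 reflects the current value of CR4.OSXSAVE. */
#define CPUID_1_ECX_OSXSAVE (1u << 27)

/* Hypothetical helper: decode the OSXSAVE bit from a raw ECX value. */
static bool ecx_has_osxsave(unsigned int ecx)
{
	return (ecx & CPUID_1_ECX_OSXSAVE) != 0;
}

/*
 * Hypothetical helper: query CPUID leaf 1 on the running CPU.
 * Returns 1 or 0 for the OSXSAVE bit, or -1 if leaf 1 cannot be read
 * (for example, when built for a non-x86 target).
 */
static int read_osxsave(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
		return -1;
	return ecx_has_osxsave(ecx) ? 1 : 0;
#else
	return -1;
#endif
}
```

On a running system with XSAVE enabled by the OS, read_osxsave() would be
expected to return 1, since user space only runs after CR4.OSXSAVE is set.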

Many security modules, such as 'camellia_aesni_avx_x86_64', depend on
this feature and fail to load after the commit, causing the
regression.

Fix it by setting the X86_FEATURE_OSXSAVE feature bit right after
CR4.OSXSAVE is enabled.
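
The ordering problem and the fix can be illustrated with a toy model.
This is a sketch in plain C, not kernel code: every toy_* name is
hypothetical, and the struct merely stands in for CR4.OSXSAVE and the
cached capability bit:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not kernel code: every name below is hypothetical. */
struct toy_cpu {
	bool cr4_osxsave;	/* stands in for CR4.OSXSAVE */
	bool cap_osxsave;	/* stands in for the cached X86_FEATURE_OSXSAVE bit */
};

/* Models CPUID.1:ECX[27], which mirrors CR4.OSXSAVE at query time. */
static bool toy_cpuid_osxsave(const struct toy_cpu *cpu)
{
	return cpu->cr4_osxsave;
}

/* Models get_cpu_cap(): snapshot CPUID into the cached capability. */
static void toy_get_cpu_cap(struct toy_cpu *cpu)
{
	cpu->cap_osxsave = toy_cpuid_osxsave(cpu);
}

/* Models fpu__init_cpu_xstate(); 'fixed' selects the behavior with the fix. */
static void toy_init_xstate(struct toy_cpu *cpu, bool fixed)
{
	cpu->cr4_osxsave = true;
	if (fixed && !cpu->cap_osxsave)
		cpu->cap_osxsave = true;	/* models setup_force_cpu_cap() */
}

/* Runs the boot order from the call chain; returns the cached capability. */
static bool toy_boot(bool fixed)
{
	struct toy_cpu cpu = { false, false };

	toy_get_cpu_cap(&cpu);		/* capability snapshot comes first */
	toy_init_xstate(&cpu, fixed);	/* CR4.OSXSAVE is set afterwards */
	return cpu.cap_osxsave;
}
```

In this model, toy_boot(false) leaves the cached bit clear because the
snapshot was taken before CR4.OSXSAVE was set, while toy_boot(true) ends
with the bit set.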

[1]. https://lore.kernel.org/lkml/202307192135.203ac24e-oliver.sang@xxxxxxxxx/

Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()")
Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
Closes: https://lore.kernel.org/lkml/202307192135.203ac24e-oliver.sang@xxxxxxxxx/
Signed-off-by: Feng Tang <feng.tang@xxxxxxxxx>
---
Changelog:

since v1:
* Add more background info to commit log and code comments (Rick Edgecombe)

arch/x86/kernel/fpu/xstate.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 0bab497c9436..9de551662624 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -173,6 +173,13 @@ void fpu__init_cpu_xstate(void)

 	cr4_set_bits(X86_CR4_OSXSAVE);

+	/*
+	 * CPUID bit for X86_FEATURE_OSXSAVE value will change once
+	 * CR4.OSXSAVE is set, so update it manually.
+	 */
+	if (!boot_cpu_has(X86_FEATURE_OSXSAVE))
+		setup_force_cpu_cap(X86_FEATURE_OSXSAVE);
+
 	/*
 	 * Must happen after CR4 setup and before xsetbv() to allow KVM
 	 * lazy passthrough. Write independent of the dynamic state static
--
2.27.0