Re: FAILED: patch "[PATCH] x86/fpu: Invalidate FPU state after a failed XRSTOR from a" failed to apply to 5.4-stable tree

From: Borislav Petkov
Date: Mon Jun 21 2021 - 10:29:39 EST


On Mon, Jun 21, 2021 at 12:51:46PM +0200, gregkh@xxxxxxxxxxxxxxxxxxx wrote:
>
> The patch below does not apply to the 5.4-stable tree.
> If someone wants it applied there, or to any other stable or longterm
> tree, then please email the backport, including the original git commit
> id to <stable@xxxxxxxxxxxxxxx>.
>
> thanks,
>
> greg k-h
>
> ------------------ original commit in Linus's tree ------------------
>
> From d8778e393afa421f1f117471144f8ce6deb6953a Mon Sep 17 00:00:00 2001
> From: Andy Lutomirski <luto@xxxxxxxxxx>
> Date: Tue, 8 Jun 2021 16:36:19 +0200
> Subject: [PATCH] x86/fpu: Invalidate FPU state after a failed XRSTOR from a
> user buffer
>
> Both Intel and AMD consider it to be architecturally valid for XRSTOR to
> fail with #PF but nonetheless change the register state. The actual
> conditions under which this might occur are unclear [1], but it seems
> plausible that this might be triggered if one sibling thread unmaps a page
> and invalidates the shared TLB while another sibling thread is executing
> XRSTOR on the page in question.
>
> __fpu__restore_sig() can execute XRSTOR while the hardware registers
> are preserved on behalf of a different victim task (using the
> fpu_fpregs_owner_ctx mechanism), and, in theory, XRSTOR could fail but
> modify the registers.
>
> If this happens, then there is a window in which __fpu__restore_sig()
> could schedule out and the victim task could schedule back in without
> reloading its own FPU registers. This would result in part of the FPU
> state that __fpu__restore_sig() was attempting to load leaking into the
> victim task's user-visible state.
>
> Invalidate preserved FPU registers on XRSTOR failure to prevent this
> situation from corrupting any state.
>
> [1] Frequent readers of the errata lists might imagine "complex
> microarchitectural conditions".
>
> Fixes: 1d731e731c4c ("x86/fpu: Add a fastpath to __fpu__restore_sig()")
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Signed-off-by: Borislav Petkov <bp@xxxxxxx>
> Acked-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
> Acked-by: Rik van Riel <riel@xxxxxxxxxxx>
> Cc: stable@xxxxxxxxxxxxxxx
> Link: https://lkml.kernel.org/r/20210608144345.758116583@xxxxxxxxxxxxx
>
> diff --git a/arch/x86/kernel/fpu/signal.c b/arch/x86/kernel/fpu/signal.c
> index d5bc96a536c2..4ab9aeb9a963 100644
> --- a/arch/x86/kernel/fpu/signal.c
> +++ b/arch/x86/kernel/fpu/signal.c
> @@ -369,6 +369,25 @@ static int __fpu__restore_sig(void __user *buf, void __user *buf_fx, int size)
>  			fpregs_unlock();
>  			return 0;
>  		}
> +
> +		/*
> +		 * The above did an FPU restore operation, restricted to
> +		 * the user portion of the registers, and failed, but the
> +		 * microcode might have modified the FPU registers
> +		 * nevertheless.
> +		 *
> +		 * If the FPU registers do not belong to current, then
> +		 * invalidate the FPU register state otherwise the task might
> +		 * preempt current and return to user space with corrupted
> +		 * FPU registers.
> +		 *
> +		 * In case current owns the FPU registers then no further
> +		 * action is required. The fixup below will handle it
> +		 * correctly.
> +		 */
> +		if (test_thread_flag(TIF_NEED_FPU_LOAD))
> +			__cpu_invalidate_fpregs_state();
> +
>  		fpregs_unlock();
>  	} else {

So I'm looking at this and 5.4.127 has:

	if (!ret) {
		fpregs_mark_activate();
		fpregs_unlock();
		return 0;
	}
	fpregs_deactivate(fpu);		<---
	fpregs_unlock();

i.e., an unconditional FPU invalidation there, which got removed by:

98265c17efa9 ("x86/fpu/xstate: Preserve supervisor states for the slow path in __fpu__restore_sig()")

in 5.7.

So the Fixes: tag above, which points to a commit from a 5.1 kernel, is probably wrong-ish.
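
To make the difference concrete, here is a rough user-space model of the window the commit message describes. It is only a sketch and not kernel code: struct task, struct cpu, failed_xrstor(), return_to_user() and run_scenario() are hypothetical names made up for illustration, and the owner pointer merely stands in for the fpu_fpregs_owner_ctx mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model, not kernel code. */
struct task {
	int saved_state;	/* FPU state saved in memory for this task */
	bool need_fpu_load;	/* stands in for TIF_NEED_FPU_LOAD */
};

struct cpu {
	int regs;			/* stands in for the hardware FPU registers */
	const struct task *owner;	/* stands in for fpu_fpregs_owner_ctx */
};

/* Stands in for __cpu_invalidate_fpregs_state(): nobody owns the registers. */
static void invalidate_fpregs(struct cpu *c)
{
	c->owner = NULL;
}

/* Models an XRSTOR that faults but still modifies the registers. */
static void failed_xrstor(struct cpu *c, int garbage)
{
	c->regs = garbage;
}

/* Models return to user space: reload only if the task does not own the registers. */
static int return_to_user(struct cpu *c, const struct task *t)
{
	if (c->owner != t) {
		c->regs = t->saved_state;
		c->owner = t;
	}
	return c->regs;
}

/* Returns the register value the victim task sees after the failed restore. */
int run_scenario(bool invalidate_on_failure)
{
	struct task victim = { .saved_state = 42, .need_fpu_load = false };
	struct task cur = { .saved_state = 7, .need_fpu_load = true };
	struct cpu c = { .regs = 42, .owner = &victim };

	/* cur runs the restore while the registers are preserved for victim. */
	failed_xrstor(&c, 99);
	if (invalidate_on_failure && cur.need_fpu_load)
		invalidate_fpregs(&c);

	/* cur schedules out, victim schedules back in. */
	return return_to_user(&c, &victim);
}

void self_check(void)
{
	assert(run_scenario(false) == 99);	/* victim sees clobbered registers */
	assert(run_scenario(true) == 42);	/* victim reloads its own state */
}
```

In this model, run_scenario(false) hands the clobbered value to the victim, while run_scenario(true) forces a reload of the victim's saved state. The unconditional fpregs_deactivate(fpu) in 5.4 would correspond to always invalidating, regardless of the need_fpu_load flag.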

amluto?

--
Regards/Gruss,
Boris.

SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG Nürnberg