Re: [tip: x86/bugs] x86/retpoline: Ensure default return thunk isn't used at runtime

From: Josh Poimboeuf
Date: Thu Oct 19 2023 - 02:59:35 EST


On Wed, Oct 18, 2023 at 11:35:30PM -0700, Josh Poimboeuf wrote:
> On Wed, Oct 18, 2023 at 10:37:47PM +0200, Borislav Petkov wrote:
> > +++ b/arch/x86/kernel/alternative.c
> > @@ -748,14 +748,20 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end)
> >  			continue;
> >  
> >  		op = insn.opcode.bytes[0];
> > -		if (op == JMP32_INSN_OPCODE)
> > +		if (op == JMP32_INSN_OPCODE || op == JMP8_INSN_OPCODE)
> >  			dest = addr + insn.length + insn.immediate.value;
>
> I can recreate (with my GCC 12) by disabling CONFIG_CALL_DEPTH_TRACKING
> and CONFIG_CPU_SRSO, which puts __x86_return_thunk() close enough to the
> retpolines to enable the two-byte JMP in the last retpoline. And then
> booting with spectre_v2=retpoline.
>
> (Then to force two-byte JMPs for more retpolines, I cheated and just
> moved __x86_return_thunk() to right after the retpolines.)
>
> Your WARN patch didn't seem to fix the no-output hang for me, maybe due
> to recursive warnings?
>
> I was able to get more output by changing the WARN to (ahem) WARN_ONCE,
> but it's still getting into some kind of stack corruption. Full output
> below. I haven't had a chance to look further, but it's worrisome that
> even the WARN_ONCE isn't being recovered from.
>
> Regardless of whether we revert e92626af3234 ("x86/retpoline: Remove
> .text..__x86.return_thunk section"), or do the above patch, we still
> need to figure out why even WARN_ONCE() would be borking things.
>
> Off to bed...

One last idea: since the return thunk is used everywhere (even in
non-ABI-compliant functions), it's possible that the "call check_thunks"
(and its call to warn_printk) is clobbering registers that some code
(exception handling entry code?) relies on being preserved.

FWIW, I changed it to WARN_ON_ONCE and it booted fine.
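
For anyone following along, here is a rough userspace sketch of the
destination computation in the quoted hunk (dest = addr + insn.length +
insn.immediate.value) and why the two-byte JMP needs handling too. The
opcode values match the kernel's JMP32_INSN_OPCODE and JMP8_INSN_OPCODE;
jmp_dest() is a made-up helper for illustration, not a kernel function,
and it hardcodes the instruction lengths instead of using the kernel's
instruction decoder.

```c
#include <stdint.h>

/* Opcode bytes for direct JMP rel32 and JMP rel8 on x86. */
#define JMP32_INSN_OPCODE 0xE9
#define JMP8_INSN_OPCODE  0xEB

/*
 * Hypothetical helper: given the instruction bytes and the address the
 * instruction lives at, return the jump target, or 0 if the bytes are
 * not a direct JMP. The displacement is relative to the end of the
 * instruction, hence "addr + length + immediate".
 */
static uint64_t jmp_dest(const uint8_t *code, uint64_t addr)
{
	if (code[0] == JMP32_INSN_OPCODE) {
		/* 5-byte form: opcode + little-endian signed 32-bit displacement */
		uint32_t raw = (uint32_t)code[1] |
			       (uint32_t)code[2] << 8 |
			       (uint32_t)code[3] << 16 |
			       (uint32_t)code[4] << 24;
		int32_t rel = (int32_t)raw;

		return addr + 5 + (int64_t)rel;
	}

	if (code[0] == JMP8_INSN_OPCODE) {
		/* 2-byte form: opcode + signed 8-bit displacement */
		int8_t rel = (int8_t)code[1];

		return addr + 2 + (int64_t)rel;
	}

	return 0;
}
```

With only the 0xE9 check, a site the compiler encoded as "eb 10" would
leave dest unset and never be recognized as a jump to the return thunk,
which is what happens once the thunk is within rel8 range of the
retpolines.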

--
Josh