Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer

From: Ard Biesheuvel
Date: Wed Jan 31 2024 - 08:57:38 EST


On Wed, 31 Jan 2024 at 14:45, Borislav Petkov <bp@xxxxxxxxx> wrote:
>
> On Mon, Jan 29, 2024 at 07:05:06PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@xxxxxxxxxx>
> >
> > Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
> > primary startup sequence sets the code segment register (CS) to __KERNEL_CS
> > before calling into the startup code shared between primary and
> > secondary boot.
> >
> > This means a simple indirect call is sufficient here.
> >
> > Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > ---
> > arch/x86/kernel/head_64.S | 35 ++------------------
> > 1 file changed, 3 insertions(+), 32 deletions(-)
> >
> > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > index d4918d03efb4..4017a49d7b76 100644
> > --- a/arch/x86/kernel/head_64.S
> > +++ b/arch/x86/kernel/head_64.S
> > @@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> > movq %r15, %rdi
> >
> > .Ljump_to_C_code:
> > - /*
> > - * Jump to run C code and to be on a real kernel address.
> > - * Since we are running on identity-mapped space we have to jump
> > - * to the full 64bit address, this is only possible as indirect
> > - * jump. In addition we need to ensure %cs is set so we make this
> > - * a far return.
> > - *
> > - * Note: do not change to far jump indirect with 64bit offset.
> > - *
> > - * AMD does not support far jump indirect with 64bit offset.
> > - * AMD64 Architecture Programmer's Manual, Volume 3: states only
> > - * JMP FAR mem16:16 FF /5 Far jump indirect,
> > - * with the target specified by a far pointer in memory.
> > - * JMP FAR mem16:32 FF /5 Far jump indirect,
> > - * with the target specified by a far pointer in memory.
> > - *
> > - * Intel64 does support 64bit offset.
> > - * Software Developer Manual Vol 2: states:
> > - * FF /5 JMP m16:16 Jump far, absolute indirect,
> > - * address given in m16:16
> > - * FF /5 JMP m16:32 Jump far, absolute indirect,
> > - * address given in m16:32.
> > - * REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> > - * address given in m16:64.
> > - */
> > - pushq $.Lafter_lret # put return address on stack for unwinder
> > xorl %ebp, %ebp # clear frame pointer
> > - movq initial_code(%rip), %rax
> > - pushq $__KERNEL_CS # set correct cs
> > - pushq %rax # target address in negative space
> > - lretq
> > -.Lafter_lret:
> > - ANNOTATE_NOENDBR
> > + ANNOTATE_RETPOLINE_SAFE
> > + callq *initial_code(%rip)
> > + int3
> > SYM_CODE_END(secondary_startup_64)
> >
> > #include "verify_cpu.S"
>
> objtool doesn't like it yet:
>
> vmlinux.o: warning: objtool: verify_cpu+0x0: stack state mismatch: cfa1=4+8 cfa2=-1+0
>
> Once we've solved this, I'll take this one even now - very nice cleanup!
>

s/int3/RET seems to do the trick.

As long as there is an instruction that follows the callq, the
unwinder will see secondary_startup_64 at the base of the call stack.
We never return here anyway.