Re: [PATCH v3 03/19] x86/startup_64: Drop long return to initial_code pointer

From: Ard Biesheuvel
Date: Wed Jan 31 2024 - 09:09:05 EST


On Wed, 31 Jan 2024 at 14:57, Ard Biesheuvel <ardb@xxxxxxxxxx> wrote:
>
> On Wed, 31 Jan 2024 at 14:45, Borislav Petkov <bp@xxxxxxxxx> wrote:
> >
> > On Mon, Jan 29, 2024 at 07:05:06PM +0100, Ard Biesheuvel wrote:
> > > From: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > >
> > > Since commit 866b556efa12 ("x86/head/64: Install startup GDT"), the
> > > primary startup sequence sets the code segment register (CS) to __KERNEL_CS
> > > before calling into the startup code shared between primary and
> > > secondary boot.
> > >
> > > This means a simple indirect call is sufficient here.
> > >
> > > Signed-off-by: Ard Biesheuvel <ardb@xxxxxxxxxx>
> > > ---
> > > arch/x86/kernel/head_64.S | 35 ++------------------
> > > 1 file changed, 3 insertions(+), 32 deletions(-)
> > >
> > > diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> > > index d4918d03efb4..4017a49d7b76 100644
> > > --- a/arch/x86/kernel/head_64.S
> > > +++ b/arch/x86/kernel/head_64.S
> > > @@ -428,39 +428,10 @@ SYM_INNER_LABEL(secondary_startup_64_no_verify, SYM_L_GLOBAL)
> > > movq %r15, %rdi
> > >
> > > .Ljump_to_C_code:
> > > - /*
> > > - * Jump to run C code and to be on a real kernel address.
> > > - * Since we are running on identity-mapped space we have to jump
> > > - * to the full 64bit address, this is only possible as indirect
> > > - * jump. In addition we need to ensure %cs is set so we make this
> > > - * a far return.
> > > - *
> > > - * Note: do not change to far jump indirect with 64bit offset.
> > > - *
> > > - * AMD does not support far jump indirect with 64bit offset.
> > > - * AMD64 Architecture Programmer's Manual, Volume 3: states only
> > > - * JMP FAR mem16:16 FF /5 Far jump indirect,
> > > - * with the target specified by a far pointer in memory.
> > > - * JMP FAR mem16:32 FF /5 Far jump indirect,
> > > - * with the target specified by a far pointer in memory.
> > > - *
> > > - * Intel64 does support 64bit offset.
> > > - * Software Developer Manual Vol 2: states:
> > > - * FF /5 JMP m16:16 Jump far, absolute indirect,
> > > - * address given in m16:16
> > > - * FF /5 JMP m16:32 Jump far, absolute indirect,
> > > - * address given in m16:32.
> > > - * REX.W + FF /5 JMP m16:64 Jump far, absolute indirect,
> > > - * address given in m16:64.
> > > - */
> > > - pushq $.Lafter_lret # put return address on stack for unwinder
> > > xorl %ebp, %ebp # clear frame pointer
> > > - movq initial_code(%rip), %rax
> > > - pushq $__KERNEL_CS # set correct cs
> > > - pushq %rax # target address in negative space
> > > - lretq
> > > -.Lafter_lret:
> > > - ANNOTATE_NOENDBR
> > > + ANNOTATE_RETPOLINE_SAFE
> > > + callq *initial_code(%rip)
> > > + int3
> > > SYM_CODE_END(secondary_startup_64)
> > >
> > > #include "verify_cpu.S"
> >
> > objtool doesn't like it yet:
> >
> > vmlinux.o: warning: objtool: verify_cpu+0x0: stack state mismatch: cfa1=4+8 cfa2=-1+0
> >
> > Once we've solved this, I'll take this one even now - very nice cleanup!
> >
>
> s/int3/RET seems to do the trick.
>

or ud2, even better,