Re: [PATCH 2/2] x86: Rewrite ret_from_fork() in C

From: Brian Gerst
Date: Thu Jun 22 2023 - 13:33:39 EST


On Thu, Jun 22, 2023 at 12:33 PM H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>
> On June 22, 2023 9:04:03 AM PDT, Brian Gerst <brgerst@xxxxxxxxx> wrote:
> >On Thu, Jun 22, 2023 at 9:29 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >>
> >> On Thu, Jun 22, 2023 at 08:07:50AM -0400, Brian Gerst wrote:
> >> > When kCFI is enabled, special handling is needed for the indirect call
> >> > to the kernel thread function. Rewrite the ret_from_fork() function in
> >> > C so that the compiler can properly handle the indirect call.
> >> >
> >> > Suggested-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> >> > Signed-off-by: Brian Gerst <brgerst@xxxxxxxxx>
> >>
> >> This is much nicer indeed. I'll take these patches into my series and
> >> repost later today if you don't mind.
> >
> >Yes, that's fine.
> >
> >> One little niggle below..
> >>
> >> > ---
> >>
> >> > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
> >> > index f31e286c2977..5ee32e7e29e8 100644
> >> > --- a/arch/x86/entry/entry_64.S
> >> > +++ b/arch/x86/entry/entry_64.S
> >> > @@ -284,36 +284,21 @@ SYM_FUNC_END(__switch_to_asm)
> >> > * r12: kernel thread arg
> >> > */
> >> > .pushsection .text, "ax"
> >> > +SYM_CODE_START(ret_from_fork_asm)
> >> > UNWIND_HINT_END_OF_STACK
> >> > ANNOTATE_NOENDBR // copy_thread
> >> > CALL_DEPTH_ACCOUNT
> >> >
> >> > + /* return address for the stack unwinder */
> >> > + pushq $swapgs_restore_regs_and_return_to_usermode
> >> > + UNWIND_HINT_FUNC
> >> >
> >> > + movq %rax, %rdi /* prev */
> >> > + movq %rsp, %rsi /* regs */
> >> > + movq %rbx, %rdx /* fn */
> >> > + movq %r12, %rcx /* fn_arg */
> >> > + jmp ret_from_fork
> >> > +SYM_CODE_END(ret_from_fork_asm)
> >> > .popsection
> >> >
> >> > .macro DEBUG_ENTRY_ASSERT_IRQS_OFF
> >>
> >> > diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
> >> > index dac41a0072ea..f5dbfebac076 100644
> >> > --- a/arch/x86/kernel/process.c
> >> > +++ b/arch/x86/kernel/process.c
> >> > @@ -28,6 +28,7 @@
> >> > #include <linux/static_call.h>
> >> > #include <trace/events/power.h>
> >> > #include <linux/hw_breakpoint.h>
> >> > +#include <linux/entry-common.h>
> >> > #include <asm/cpu.h>
> >> > #include <asm/apic.h>
> >> > #include <linux/uaccess.h>
> >> > @@ -134,6 +135,25 @@ static int set_new_tls(struct task_struct *p, unsigned long tls)
> >> > return do_set_thread_area_64(p, ARCH_SET_FS, tls);
> >> > }
> >> >
> >> > +__visible noinstr void ret_from_fork(struct task_struct *prev, struct pt_regs *regs,
> >> > + int (*fn)(void *), void *fn_arg)
> >>
> >> So I had noinstr in my initial patch, but it leads to objtool
> >> complaints. I suppose we can actually handle tracing and all the other
> >> gunk at this point, so I've removed it.
> >
> >I'm not an expert on noinstr usage, but looking at the other syscall
> >functions, instrumentation needs to be disabled before
> >syscall_exit_to_user_mode() is called. Perhaps adding an
> >instrumentation_begin()/instrumentation_end() pair to this function is
> >needed?
> >
> >Brian Gerst
> >
>
> I don't have the code in front of me right now, but how does this affect FRED enabling? In the case of FRED, the exit path is much simpler; in the FRED enabling patchset we simply deal with it by alternatives-patching the terminal jump after resetting the stack pointer to the standard FRED user space exit stub (which simply pops the user space registers and executes ERETU.)
>
> I'm assuming this is still valid/possible after your patches, since resetting the stack pointer isn't possible in C, but I wanted to double check.

It should just need to change the "pushq
$swapgs_restore_regs_and_return_to_usermode" to some other address via
an alternative.

Brian Gerst