Re: [RFC 11/14] x86: add support for Dynamic Kernel Stacks

From: Thomas Gleixner
Date: Wed Mar 13 2024 - 06:24:32 EST


On Mon, Mar 11 2024 at 16:46, Pasha Tatashin wrote:
> @@ -413,6 +413,9 @@ DEFINE_IDTENTRY_DF(exc_double_fault)
>  	}
>  #endif
>
> +	if (dynamic_stack_fault(current, address))
> +		return;
> +
>  	irqentry_nmi_enter(regs);
>  	instrumentation_begin();
>  	notify_die(DIE_TRAP, str, regs, error_code, X86_TRAP_DF, SIGSEGV);
> diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> index d6375b3c633b..651c558b10eb 100644
> --- a/arch/x86/mm/fault.c
> +++ b/arch/x86/mm/fault.c
> @@ -1198,6 +1198,9 @@ do_kern_addr_fault(struct pt_regs *regs, unsigned long hw_error_code,
>  	if (is_f00f_bug(regs, hw_error_code, address))
>  		return;
>
> +	if (dynamic_stack_fault(current, address))
> +		return;

T1 schedules out with its stack usage close to the fault boundary.

switch_to(T2)

Now T1 schedules back in

   switch_to(T1)
     __switch_to_asm()
        ...
        switch_stacks()    <- SP on T1 stack
!       ...
!       jmp __switch_to()
!    __switch_to()
!       ...
!       raw_cpu_write(pcpu_hot.current_task, next_p);

Between switching SP to T1's stack and the point where
pcpu_hot.current_task (aka current) is updated to T1, a stack fault will
invoke dynamic_stack_fault(T2, address), which will return false here:

	/* check if address is inside the kernel stack area */
	stack = (unsigned long)tsk->stack;
	if (address < stack || address >= stack + THREAD_SIZE)
		return false;

because T2's stack obviously does not cover the faulting address on T1's
stack. As a consequence, the double fault will panic the machine.
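
To make the window concrete, here is a minimal user-space model of the
check. It is a sketch and not kernel code: struct task_model,
stack_covers(), simulate(), the stack base addresses and the THREAD_SIZE
value are all made up for illustration. Only the range check mirrors the
snippet quoted above.

```c
#include <stdbool.h>

#define THREAD_SIZE (16UL * 1024)	/* assumed value for this model only */

/* Hypothetical stand-in for task_struct: only the stack base matters here. */
struct task_model {
	unsigned long stack;	/* lowest address of the task's stack area */
};

/* Same range check as the quoted snippet, applied to the model type. */
static bool stack_covers(const struct task_model *tsk, unsigned long address)
{
	unsigned long stack = tsk->stack;

	if (address < stack || address >= stack + THREAD_SIZE)
		return false;
	return true;
}

/*
 * Model a fault on T1's stack while switching from T2 back to T1.
 * current_updated == false: SP is already on T1's stack, but "current"
 * still points to T2 (the lines marked with '!' above).
 * current_updated == true: the raw_cpu_write() of current_task has happened.
 */
static bool simulate(bool current_updated)
{
	struct task_model t1 = { .stack = 0x100000UL };	/* arbitrary base */
	struct task_model t2 = { .stack = 0x200000UL };	/* arbitrary base */
	const struct task_model *current = current_updated ? &t1 : &t2;
	unsigned long address = t1.stack + 0x2000;	/* fault inside T1's stack */

	return stack_covers(current, address);
}
```

In this model simulate(false) returns false, i.e. the fault is not
recognized as a dynamic stack fault, while simulate(true) returns true for
the very same address. The only difference is which task the check is
handed.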

Thanks,

tglx