Re: [PATCH 2/2] arm64: Clear the stack

From: Mark Rutland
Date: Mon May 14 2018 - 01:16:18 EST


On Sun, May 13, 2018 at 11:40:07AM +0300, Alexander Popov wrote:
> It seems that previously I was very "lucky" to accidentally have those MIN_STACK_LEFT,
> call trace depth and oops=panic together to experience a hang on stack overflow
> during BUG().
>
>
> When I run my test in a loop _without_ VMAP_STACK, I manage to corrupt the neighbour
> processes with BUG() handling overstepping the stack boundary. It's a pity, but
> I have an idea.

I think that in the absence of VMAP_STACK, there will always be cases where we
*could* corrupt a neighbouring stack, but I agree that trying to minimize that
possibility would be good.

> In kernel/sched/core.c we already have:
>
> #ifdef CONFIG_SCHED_STACK_END_CHECK
> 	if (task_stack_end_corrupted(prev))
> 		panic("corrupted stack end detected inside scheduler\n");
> #endif
>
> So what would you think if I do the following in check_alloca():
>
> 	if (size >= stack_left) {
> #if !defined(CONFIG_VMAP_STACK) && defined(CONFIG_SCHED_STACK_END_CHECK)
> 		panic("alloca over the kernel stack boundary\n");
> #else
> 		BUG();
> #endif

Given this is already out-of-line, how about we always use panic(), regardless
of VMAP_STACK and SCHED_STACK_END_CHECK? i.e. just

	if (unlikely(size >= stack_left))
		panic("alloca over the kernel stack boundary");

If we have VMAP_STACK selected, and overflow during the panic, it's the same as
if we overflowed during the BUG(). It's likely that panic() will use less stack
space than BUG(), and the compiler can put the call in a slow path that
shouldn't affect most calls, so in all cases it's likely preferable.
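
To make the slow-path point concrete, here is a rough userspace sketch of that shape. It is not the actual patch: unlikely() is redefined locally on top of the compiler's __builtin_expect, and panic_stub(), alloca_would_overflow() and check_alloca_sketch() are hypothetical names standing in for the kernel's panic() and check_alloca(). It assumes GCC or Clang.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumption: local stand-in for the kernel's unlikely() branch hint. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Assumption: hypothetical stand-in for the kernel's panic(). Marking it
 * noreturn and cold lets the compiler move the call out of the hot path.
 */
__attribute__((noreturn, cold))
static void panic_stub(const char *msg)
{
	fprintf(stderr, "Kernel panic - not syncing: %s\n", msg);
	abort();
}

/* Nonzero when `size` bytes do not fit strictly inside `stack_left`. */
static int alloca_would_overflow(unsigned long size, unsigned long stack_left)
{
	return size >= stack_left;
}

/*
 * Hypothetical shape of the check: a single unconditional panic on the
 * slow path, with no dependency on VMAP_STACK or SCHED_STACK_END_CHECK.
 */
static void check_alloca_sketch(unsigned long size, unsigned long stack_left)
{
	if (unlikely(alloca_would_overflow(size, stack_left)))
		panic_stub("alloca over the kernel stack boundary");
}
```

With the failure handler marked noreturn and cold, the common case is just a compare and a not-taken branch, which is the property I'm relying on above.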

Thanks,
Mark.