Re: [PATCH] fork: Allow stack to be wiped on fork

From: Andy Lutomirski
Date: Tue Feb 20 2018 - 20:57:02 EST


On Wed, Feb 21, 2018 at 12:31 AM, Andrew Morton
<akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> On Tue, 16 Jan 2018 21:50:15 -0800 Kees Cook <keescook@xxxxxxxxxxxx> wrote:
>
>> One of the classes of kernel stack content leaks is exposing prior
>> heap or stack contents when a new process stack is allocated.
>> Normally, those stacks are not zeroed, and the old contents remain in
>> place. With some types of stack content exposure flaws, those contents
>> can leak to userspace. Kernels built with CONFIG_CLEAR_STACK_FORK will
>> no longer be vulnerable to this, as the stack will be wiped each time
>> a stack is assigned to a new process. There's not a meaningful change
>> in runtime performance; it almost looks like it provides a benefit.
>>
>> Performing back-to-back kernel builds before:
>> Run times: 157.86 157.09 158.90 160.94 160.80
>> Mean: 159.12
>> Std Dev: 1.54
>>
>> With CONFIG_CLEAR_STACK_FORK=y:
>> Run times: 159.31 157.34 156.71 158.15 160.81
>> Mean: 158.46
>> Std Dev: 1.46
>>
>> ...
>>
>> --- a/arch/Kconfig
>> +++ b/arch/Kconfig
>> @@ -904,6 +904,14 @@ config VMAP_STACK
>>  	  the stack to map directly to the KASAN shadow map using a formula
>>  	  that is incorrect if the stack is in vmalloc space.
>>
>> +config CLEAR_STACK_FORK
>> +	bool "Clear the kernel stack at each fork"
>> +	help
>> +	  To resist stack content leak flaws, this clears newly allocated
>> +	  kernel stacks to keep previously freed heap or stack contents
>> +	  from being present in the new stack. This has almost no
>> +	  measurable performance impact.
>> +
>
> It would be much nicer to be able to control this at runtime rather
> than compile-time. Why not a /proc tunable? We could always use more
> of those ;)

/proc/sys/kernel/hardening_features_that_cost_essentially_nothing?

Seriously, though, why don't we just enable it unconditionally? It
wouldn't surprise me if it really speeds up more workloads than it
slows down -- it'll pull the whole kernel stack into the CPU cache
with exclusive ownership very quickly (as a streaming write, without
actually reading from memory, I imagine, at least on new enough CPUs)
rather than grabbing each cache line one by one as they get used.