Re: [PATCH v2 4/5] x86/entry: Enable random_kstack_offset support

From: Kees Cook
Date: Sat Mar 28 2020 - 18:26:48 EST


On Tue, Mar 24, 2020 at 01:32:30PM -0700, Kees Cook wrote:
> Allow for a randomized stack offset on a per-syscall basis, with roughly
> 5 bits of entropy.
>
> Signed-off-by: Kees Cook <keescook@xxxxxxxxxxxx>
> ---
> arch/x86/Kconfig | 1 +
> arch/x86/entry/common.c | 12 +++++++++++-
> 2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index beea77046f9b..b9d449581eb6 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -150,6 +150,7 @@ config X86
> select HAVE_ARCH_TRANSPARENT_HUGEPAGE
> select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
> select HAVE_ARCH_VMAP_STACK if X86_64
> + select HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET
> select HAVE_ARCH_WITHIN_STACK_FRAMES
> select HAVE_ASM_MODVERSIONS
> select HAVE_CMPXCHG_DOUBLE
> diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
> index 9747876980b5..086d7af570af 100644
> --- a/arch/x86/entry/common.c
> +++ b/arch/x86/entry/common.c
> @@ -26,6 +26,7 @@
> #include <linux/livepatch.h>
> #include <linux/syscalls.h>
> #include <linux/uaccess.h>
> +#include <linux/randomize_kstack.h>
>
> #include <asm/desc.h>
> #include <asm/traps.h>
> @@ -189,6 +190,13 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
> lockdep_assert_irqs_disabled();
> lockdep_sys_exit();
>
> + /*
> + * x86_64 stack alignment means 3 bits are ignored, so keep
> + * the top 5 bits. x86_32 needs only 2 bits of alignment, so
> + * the top 6 bits will be used.
> + */
> + choose_random_kstack_offset(rdtsc() & 0xFF);
> +
> cached_flags = READ_ONCE(ti->flags);
>
> if (unlikely(cached_flags & EXIT_TO_USERMODE_LOOP_FLAGS))
> @@ -283,6 +291,7 @@ __visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
> {
> struct thread_info *ti;
>
> + add_random_kstack_offset();
> enter_from_user_mode();
> local_irq_enable();
> ti = current_thread_info();

So, I got an email from 0day that this caused a performance regression[1]
with things _turned off_. On closer examination:

Before (objdump -dS vmlinux):

__visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
{
ffffffff81003920: 41 54 push %r12
ffffffff81003922: 55 push %rbp
ffffffff81003923: 48 89 f5 mov %rsi,%rbp
ffffffff81003926: 53 push %rbx
ffffffff81003927: 48 89 fb mov %rdi,%rbx
struct thread_info *ti;

enter_from_user_mode();
local_irq_enable();
...


After:

__visible void do_syscall_64(unsigned long nr, struct pt_regs *regs)
{
ffffffff81003960: 55 push %rbp
ffffffff81003961: 48 89 e5 mov %rsp,%rbp
ffffffff81003964: 41 55 push %r13
ffffffff81003966: 41 54 push %r12
ffffffff81003968: 49 89 f4 mov %rsi,%r12
ffffffff8100396b: 53 push %rbx
ffffffff8100396c: 48 89 fb mov %rdi,%rbx
ffffffff8100396f: 48 83 ec 08 sub $0x8,%rsp
ffffffff81003973: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
ffffffff8100397a: 00 00
ffffffff8100397c: 48 89 45 e0 mov %rax,-0x20(%rbp)
ffffffff81003980: 31 c0 xor %eax,%eax
asm_volatile_goto("1:"
ffffffff81003982: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
struct thread_info *ti;

add_random_kstack_offset();
enter_from_user_mode();
local_irq_enable();

The "nopl" there is the static_branch code that I'd expect. However, the
preample is quite different. *drum roll* Anyone else recognize %gs:0x28?

That's the stack canary. :P It seems that GCC views this as an array:

char *ptr = __builtin_alloca(offset & 0x3FF);
asm volatile("" : "=m"(*ptr));

because it's locally allocated on the stack. *face palm*

I'll go figure out a way to fix this...

-Kees

[1] 202003281505.0F481D3@keescook/">https://lore.kernel.org/lkml/202003281505.0F481D3@keescook/

--
Kees Cook