[RFC PATCH v3 4/8] x86/arch_prctl: add ARCH_DISABLE_PTI_{NOW,NEXT} to enable/disable PTI

From: Willy Tarreau
Date: Wed Jan 10 2018 - 14:29:20 EST


These two arch_prctl() options adjust the current task's flags so that
page table isolation can be disabled either for the current process
(ARCH_DISABLE_PTI_NOW) or for the process resulting from the next
execve() (ARCH_DISABLE_PTI_NEXT).
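
As a rough illustration of how userspace might issue these requests, here
is a minimal sketch. It assumes the option values from this patch (they are
not in released uapi headers, so they are defined locally), and the helper
name pti_disable_request() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Values copied from this patch; assumed, not taken from system headers. */
#ifndef ARCH_DISABLE_PTI_NOW
#define ARCH_DISABLE_PTI_NOW  0x1021
#endif
#ifndef ARCH_DISABLE_PTI_NEXT
#define ARCH_DISABLE_PTI_NEXT 0x1022
#endif

/*
 * Hypothetical helper: ask the kernel to set (on != 0) or clear (on == 0)
 * one of the two flags for the calling task.
 * Returns 0 on success or a negative errno value on failure.
 */
static int pti_disable_request(int option, unsigned long on)
{
#ifdef SYS_arch_prctl
	if (syscall(SYS_arch_prctl, option, on) == -1)
		return -errno;
	return 0;
#else
	(void)option;
	(void)on;
	return -ENOSYS;	/* no arch_prctl() syscall on this architecture */
#endif
}
```

A wrapper would call pti_disable_request(ARCH_DISABLE_PTI_NEXT, 1) right
before execve(), while a program acting on its own configuration would use
ARCH_DISABLE_PTI_NOW. On a kernel without this series the call is expected
to fail, so callers should treat a negative return as "PTI left enabled".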

Both settings depend on CONFIG_PER_PROCESS_PTI. Setting either flag fails
with -EPERM if the pti_adjust sysctl is lower than 1 or if the task lacks
CAP_SYS_RAWIO. Clearing the flags requires neither.

Setting the flags is no longer allowed once the task has created new
threads. Note that in this version the mm_users check runs before arg2 is
examined, so clearing the flags also returns -EPERM in that case.
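
The checks described above can be summarized with a small userspace model.
This is only a sketch of the decision logic in the diff below, not kernel
code: the function name and its parameters are hypothetical stand-ins for
task->mm, mm_users, capable(CAP_SYS_RAWIO) and the pti_adjust sysctl. As
posted, the mm_users test comes first and therefore applies to clearing too.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Userspace model of the permission checks for ARCH_DISABLE_PTI_{NOW,NEXT}.
 * Returns 0 if the request would be accepted, -EPERM otherwise.
 */
static int pti_flag_check(bool has_mm, int mm_users, unsigned long arg2,
			  bool cap_sys_rawio, int pti_adjust)
{
	/* Kernel threads and tasks sharing their mm are always refused. */
	if (!has_mm || mm_users > 1)
		return -EPERM;

	/* Only setting (arg2 != 0) needs the capability and the sysctl. */
	if (arg2 && (!cap_sys_rawio || pti_adjust < 1))
		return -EPERM;

	return 0;
}
```

For example, pti_flag_check(true, 1, 0, false, 0) accepts an unprivileged
clear request, while pti_flag_check(true, 1, 1, true, 0) refuses a set
request because the sysctl is below 1.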

Signed-off-by: Willy Tarreau <w@xxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>

v3:
- depend on CONFIG_PER_PROCESS_PTI
- switch back to task flags
- use one task flag for the immediate task (config-based setting) and
  one task flag for the task resulting from the next execve()
  (wrapper-based setting)
- check the pti_adjust sysctl

v2:
- use {set,clear}_thread_flag() as recommended by Peter
- use task->mm->context.pti_disable instead of task flag
- check for mm_users == 1
- check for CAP_SYS_RAWIO only when setting, not clearing
- make the code depend on CONFIG_PAGE_TABLE_ISOLATION
---
arch/x86/include/uapi/asm/prctl.h | 3 +++
arch/x86/kernel/process_64.c | 30 ++++++++++++++++++++++++++++++
2 files changed, 33 insertions(+)

diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 5a6aac9..1564f98 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -10,6 +10,9 @@
 #define ARCH_GET_CPUID 0x1011
 #define ARCH_SET_CPUID 0x1012
 
+#define ARCH_DISABLE_PTI_NOW 0x1021
+#define ARCH_DISABLE_PTI_NEXT 0x1022
+
 #define ARCH_MAP_VDSO_X32 0x2001
 #define ARCH_MAP_VDSO_32 0x2002
 #define ARCH_MAP_VDSO_64 0x2003
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index c754662..b4de8aa 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -654,7 +654,37 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2)
 		ret = put_user(base, (unsigned long __user *)arg2);
 		break;
 	}
+#ifdef CONFIG_PER_PROCESS_PTI
+	case ARCH_DISABLE_PTI_NOW:
+		if (!task->mm || atomic_read(&task->mm->mm_users) > 1)
+			return -EPERM;
+
+		if (arg2 && (!capable(CAP_SYS_RAWIO) || pti_adjust < 1))
+			return -EPERM;
+
+		if (doit) {
+			if (arg2)
+				set_thread_flag(TIF_DISABLE_PTI_NOW);
+			else
+				clear_thread_flag(TIF_DISABLE_PTI_NOW);
+		}
+		break;
 
+	case ARCH_DISABLE_PTI_NEXT:
+		if (!task->mm || atomic_read(&task->mm->mm_users) > 1)
+			return -EPERM;
+
+		if (arg2 && (!capable(CAP_SYS_RAWIO) || pti_adjust < 1))
+			return -EPERM;
+
+		if (doit) {
+			if (arg2)
+				set_thread_flag(TIF_DISABLE_PTI_NEXT);
+			else
+				clear_thread_flag(TIF_DISABLE_PTI_NEXT);
+		}
+		break;
+#endif
 #ifdef CONFIG_CHECKPOINT_RESTORE
 # ifdef CONFIG_X86_X32_ABI
 	case ARCH_MAP_VDSO_X32:
--
1.7.12.1