[patch 56/60] x86/mm/kpti: Disable native VSYSCALL

From: Thomas Gleixner
Date: Mon Dec 04 2017 - 11:57:30 EST


From: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>

The KERNEL_PAGE_TABLE_ISOLATION code attempts to "poison" the user
portion of the kernel page tables. It detects entries that it wants
to poison in two ways:

* Looking for addresses >= PAGE_OFFSET

* Looking for entries without _PAGE_USER set

But, to allow the _PAGE_USER check to work, it must never be set on
init_mm entries, and an earlier patch in this series ensured that it
will never be set.

The vsyscall page is at an address >= PAGE_OFFSET and it is also mapped
by init_mm. Because of the earlier, KERNEL_PAGE_TABLE_ISOLATION-enforced
restriction, _PAGE_USER is never set, which makes the vsyscall page
unreadable to userspace.
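
As a rough illustration of the two checks described above, the following
user-space sketch applies both to the fixed vsyscall address. It is not the
kernel's implementation: the sketch_* names are hypothetical, and the
constants are assumptions (the x86-64 4-level-paging direct-map base without
KASLR, the _PAGE_USER bit position, and the legacy vsyscall address).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; the kernel derives these itself. */
#define SKETCH_PAGE_OFFSET   0xffff880000000000ULL /* assumed direct-map base */
#define SKETCH_PAGE_USER     (1ULL << 2)           /* assumed _PAGE_USER bit */
#define SKETCH_VSYSCALL_ADDR 0xffffffffff600000ULL /* fixed vsyscall page */

/* Check 1: the virtual address lies at or above PAGE_OFFSET. */
static inline bool sketch_is_kernel_address(uint64_t vaddr)
{
	return vaddr >= SKETCH_PAGE_OFFSET;
}

/* Check 2: the page table entry does not have _PAGE_USER set. */
static inline bool sketch_lacks_user_bit(uint64_t entry)
{
	return (entry & SKETCH_PAGE_USER) == 0;
}
```

With these assumptions, sketch_is_kernel_address(SKETCH_VSYSCALL_ADDR) is
true, and any init_mm entry mapping that page has _PAGE_USER clear, so the
vsyscall page is caught by both checks.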

This makes the "NATIVE" case totally unusable since userspace cannot even
see the memory any more. Disable it whenever KERNEL_PAGE_TABLE_ISOLATION
is enabled.

Also add some help text about how KERNEL_PAGE_TABLE_ISOLATION might
affect the emulation case as well.

Signed-off-by: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Rik van Riel <riel@xxxxxxxxxx>
Cc: keescook@xxxxxxxxxx
Cc: Denys Vlasenko <dvlasenk@xxxxxxxxxx>
Cc: moritz.lipp@xxxxxxxxxxxxxx
Cc: linux-mm@xxxxxxxxx
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Brian Gerst <brgerst@xxxxxxxxx>
Cc: hughd@xxxxxxxxxx
Cc: daniel.gruss@xxxxxxxxxxxxxx
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Cc: michael.schwarz@xxxxxxxxxxxxxx
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: richard.fellner@xxxxxxxxxxxxxxxxx
Link: https://lkml.kernel.org/r/20171123003513.10CAD896@xxxxxxxxxxxxxxxxxx

---
arch/x86/Kconfig | 8 ++++++++
1 file changed, 8 insertions(+)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2249,6 +2249,9 @@ choice

config LEGACY_VSYSCALL_NATIVE
bool "Native"
+ # The VSYSCALL page comes from the kernel page tables
+ # and is not available when KERNEL_PAGE_TABLE_ISOLATION is enabled.
+ depends on !KERNEL_PAGE_TABLE_ISOLATION
help
Actual executable code is located in the fixed vsyscall
address mapping, implementing time() efficiently. Since
@@ -2266,6 +2269,11 @@ choice
exploits. This configuration is recommended when userspace
still uses the vsyscall area.

+ When KERNEL_PAGE_TABLE_ISOLATION is enabled, the vsyscall area will become
+ unreadable. This emulation option still works, but KERNEL_PAGE_TABLE_ISOLATION
+ will make it harder to do things like trace code using the
+ emulation.
+
config LEGACY_VSYSCALL_NONE
bool "None"
help