[RFC UKL 06/10] x86/fault: Skip checking kernel mode access to user address space for UKL

From: Ali Raza
Date: Mon Oct 03 2022 - 18:22:34 EST


Normally, this check ensures that a kernel task has not somehow raised a
page fault in the user part of the address space. It does so by checking
the CS value saved on the stack. UKL application code always runs with the
kernel CS value, so user_mode(regs) is never true for it and the check would
always treat its faults as unexpected kernel faults. This change applies the
check only to non-UKL tasks by also testing is_ukl_thread().
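
For illustration only, the decision made on the trylock-failure path can be
sketched as a small standalone C model. This is not kernel code: struct
fault_ctx and rejects_fault() are hypothetical names, and each field merely
stands in for the result of the real call named in its comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the inputs to the check; not a kernel structure. */
struct fault_ctx {
	bool trylock_ok;  /* stands in for mmap_read_trylock(mm) succeeding */
	bool user_mode;   /* stands in for user_mode(regs) */
	bool in_extable;  /* stands in for search_exception_tables(regs->ip) != NULL */
	bool ukl_thread;  /* stands in for is_ukl_thread() */
};

/*
 * Returns true when the fault would be treated as an unexpected kernel
 * fault. The check only applies when the trylock on mmap_lock failed.
 */
bool rejects_fault(const struct fault_ctx *c)
{
	if (c->trylock_ok)
		return false;

	/* Patched condition: a UKL thread is no longer rejected here. */
	return !c->user_mode && !c->in_extable && !c->ukl_thread;
}
```

In this model, a UKL application fault (user_mode false, not in the exception
tables, ukl_thread true) is no longer rejected, while the same fault from an
ordinary kernel thread still is.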

Cc: Jonathan Corbet <corbet@xxxxxxx>
Cc: Masahiro Yamada <masahiroy@xxxxxxxxxx>
Cc: Michal Marek <michal.lkml@xxxxxxxxxxx>
Cc: Nick Desaulniers <ndesaulniers@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: "H. Peter Anvin" <hpa@xxxxxxxxx>
Cc: Andy Lutomirski <luto@xxxxxxxxxx>
Cc: Eric Biederman <ebiederm@xxxxxxxxxxxx>
Cc: Kees Cook <keescook@xxxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Alexander Viro <viro@xxxxxxxxxxxxxxxxxx>
Cc: Arnd Bergmann <arnd@xxxxxxxx>
Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Cc: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Ben Segall <bsegall@xxxxxxxxxx>
Cc: Mel Gorman <mgorman@xxxxxxx>
Cc: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
Cc: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>

Signed-off-by: Ali Raza <aliraza@xxxxxx>
---
arch/x86/mm/fault.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index fa71a5d12e87..26de3556ca2c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1328,7 +1328,9 @@ void do_user_addr_fault(struct pt_regs *regs,
* on well-defined single instructions listed in the exception
* tables. But, an erroneous kernel fault occurring outside one of
* those areas which also holds mmap_lock might deadlock attempting
- * to validate the fault against the address space.
+ * to validate the fault against the address space. However, if we
+ * are configured as a unikernel and the faulting thread is the UKL
+ * application code we can proceed as normal.
*
* Only do the expensive exception table search when we might be at
* risk of a deadlock. This happens if we
@@ -1336,7 +1338,8 @@ void do_user_addr_fault(struct pt_regs *regs,
* 2. The access did not originate in userspace.
*/
if (unlikely(!mmap_read_trylock(mm))) {
- if (!user_mode(regs) && !search_exception_tables(regs->ip)) {
+ if (!user_mode(regs) && !search_exception_tables(regs->ip) &&
+ !is_ukl_thread()) {
/*
* Fault from code in kernel from
* which we do not expect faults.
--
2.21.3