[PATCH RESEND] x86/entry: Don't write to CR3 when restoring to kernel CR3

From: Brendan Jackman
Date: Thu Aug 17 2023 - 08:16:29 EST


From: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>

Skip resuming KERNEL pages since it is already KERNEL CR3

Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxxxxx>
Signed-off-by: Brendan Jackman <jackmanb@xxxxxxxxxx>
---

While staring at paranoid_exit I was confused about why we had this CR3
write, avoiding it seems like a free optimisation. The original commit
21e94459110252 ("x86/mm: Optimize RESTORE_CR3") says "Most NMI/paranoid
exceptions will not in fact change pagetables" but I didn't't understand
what the "most" was referring to. I then discovered this patch on the
mailing list, Andy said[1] that it looks correct so maybe now is the
time to merge it?

Note there's another patch in [1] as well, the benefit of that one is
not obvious to me though.

We've tested an equivalent patch in our internal kernel.

[1] https://lore.kernel.org/lkml/20200526043507.51977-3-laijs@xxxxxxxxxxxxxxxxx/
-- >8 --
arch/x86/entry/calling.h | 13 ++++---------
1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index f6907627172b..b2458685d56e 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -236,14 +236,13 @@ For 32-bit we have the following conventions - kernel is built with
.macro RESTORE_CR3 scratch_reg:req save_reg:req
ALTERNATIVE "jmp .Lend_\@", "", X86_FEATURE_PTI

- ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID
-
/*
- * KERNEL pages can always resume with NOFLUSH as we do
- * explicit flushes.
+ * Skip resuming KERNEL pages since it is already KERNEL CR3.
*/
bt $PTI_USER_PGTABLE_BIT, \save_reg
- jnc .Lnoflush_\@
+ jnc .Lend_\@
+
+ ALTERNATIVE "jmp .Lwrcr3_\@", "", X86_FEATURE_PCID

/*
* Check if there's a pending flush for the user ASID we're
@@ -261,10 +260,6 @@ For 32-bit we have the following conventions - kernel is built with
SET_NOFLUSH_BIT \save_reg

.Lwrcr3_\@:
- /*
- * The CR3 write could be avoided when not changing its value,
- * but would require a CR3 read *and* a scratch register.
- */
movq \save_reg, %cr3
.Lend_\@:
.endm
--
2.41.0.694.ge786442a9b-goog