[RFC PATCH 2/3] x86/mm: make sure LAM is up-to-date during context switching

From: Yosry Ahmed
Date: Thu Mar 07 2024 - 08:43:13 EST


During context switching, if we are not switching to a new mm and no TLB
flush is needed, we do not write CR3. However, it is possible that a
user thread enables LAM while a kthread is running on a different CPU
with the old LAM CR3 mask. If the kthread then context switches into any
thread of that user process, it may not write CR3 with the new LAM mask,
which would cause the user thread to run with a misconfigured CR3 that
disables LAM on the CPU.
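
For background, the LAM configuration lives in the high bits of CR3
(bits 61 and 62 select LAM_U57 and LAM_U48 on x86-64), so a CR3 value
written without the new mask silently runs the CPU with LAM disabled.
A rough sketch of how the mask is folded into CR3; this is illustrative
only, not the kernel's actual build_cr3():

	#define LAM_U57_MASK	(1ULL << 61)	/* 6-bit tags */
	#define LAM_U48_MASK	(1ULL << 62)	/* 15-bit tags */

	static unsigned long build_cr3_sketch(unsigned long pgd_pa,
					      unsigned int pcid,
					      unsigned long lam_mask)
	{
		/* page-table root | PCID | LAM mode bits */
		return pgd_pa | pcid | lam_mask;
	}

If lam_mask is the stale (zero) value cached on the CPU, tagged
pointers fault even though the mm has LAM enabled.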

Fix this by making sure we write a new CR3 if the LAM mask is not
up-to-date. No problems were observed in practice; this was found by
code inspection.
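
Distilled, the fix tracks a new need_lam_update flag and no longer
takes the early return that skipped the CR3 write (simplified from the
diff below, not the complete function):

	/* Per-CPU view of the LAM mask vs. the mm's current mask. */
	if (tlbstate_lam_cr3_mask() != mm_lam_cr3_mask(next))
		need_lam_update = true;

	...

	/* Return early only if neither a flush nor a LAM update is due. */
	if (!need_flush && !need_lam_update)
		return;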

Note that it is possible for mm->context.lam_cr3_mask to change while
switch_mm_irqs_off() is running. However, since LAM can only be enabled
by a single-threaded process on its own behalf, in that case we cannot
be switching to a user thread of that same process; we can only be
switching to another kthread using the borrowed mm or to a different
user process, which should be fine.
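
For reference, a userspace sketch of how LAM gets enabled; the
ARCH_ENABLE_TAGGED_ADDR constant and the 6 tag bits of LAM_U57 follow
<asm/prctl.h>, but double-check them against the headers in use:

	#include <stdio.h>
	#include <sys/syscall.h>
	#include <unistd.h>

	#ifndef ARCH_ENABLE_TAGGED_ADDR
	#define ARCH_ENABLE_TAGGED_ADDR 0x4002
	#endif

	int main(void)
	{
		/*
		 * Ask for 6 tag bits (LAM_U57). The kernel refuses the
		 * request if the process is multi-threaded, which is the
		 * property the paragraph above relies on.
		 */
		if (syscall(SYS_arch_prctl, ARCH_ENABLE_TAGGED_ADDR, 6L))
			perror("arch_prctl(ARCH_ENABLE_TAGGED_ADDR)");
		return 0;
	}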

Fixes: 82721d8b25d7 ("x86/mm: Handle LAM on context switch")
Signed-off-by: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
---
arch/x86/mm/tlb.c | 44 ++++++++++++++++++++++++++------------------
1 file changed, 26 insertions(+), 18 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 2975d3f89a5de..3610c23499085 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -503,11 +503,12 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 {
 	struct mm_struct *prev = this_cpu_read(cpu_tlbstate.loaded_mm);
 	u16 prev_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
+	u64 cpu_tlb_gen = this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen);
 	bool was_lazy = this_cpu_read(cpu_tlbstate_shared.is_lazy);
+	bool need_flush = false, need_lam_update = false;
 	unsigned cpu = smp_processor_id();
 	unsigned long new_lam;
 	u64 next_tlb_gen;
-	bool need_flush;
 	u16 new_asid;
 
 	/* We don't want flush_tlb_func() to run concurrently with us. */
@@ -570,32 +571,39 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 			 !cpumask_test_cpu(cpu, mm_cpumask(next))))
 			cpumask_set_cpu(cpu, mm_cpumask(next));
 
+		/*
+		 * tlbstate_lam_cr3_mask() may be outdated if a different thread
+		 * has enabled LAM while we were borrowing its mm on this CPU.
+		 * Make sure we update CR3 in case we are switching to another
+		 * thread in that process.
+		 */
+		if (tlbstate_lam_cr3_mask() != mm_lam_cr3_mask(next))
+			need_lam_update = true;
+
 		/*
 		 * If the CPU is not in lazy TLB mode, we are just switching
 		 * from one thread in a process to another thread in the same
 		 * process. No TLB flush required.
 		 */
-		if (!was_lazy)
-			return;
+		if (was_lazy) {
+			/*
+			 * Read the tlb_gen to check whether a flush is needed.
+			 * If the TLB is up to date, just use it. The barrier
+			 * synchronizes with the tlb_gen increment in the TLB
+			 * shootdown code.
+			 */
+			smp_mb();
+			next_tlb_gen = atomic64_read(&next->context.tlb_gen);
+			/* TLB went out of date while we were in lazy mode. */
+			if (cpu_tlb_gen < next_tlb_gen)
+				need_flush = true;
+		}
 
-		/*
-		 * Read the tlb_gen to check whether a flush is needed.
-		 * If the TLB is up to date, just use it.
-		 * The barrier synchronizes with the tlb_gen increment in
-		 * the TLB shootdown code.
-		 */
-		smp_mb();
-		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
-		if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
-		    next_tlb_gen)
+		if (!need_flush && !need_lam_update)
 			return;
 
-		/*
-		 * TLB contents went out of date while we were in lazy
-		 * mode. Fall through to the TLB switching code below.
-		 */
+		/* Fall through to write CR3, keeping the current ASID. */
 		new_asid = prev_asid;
-		need_flush = true;
 	} else {
 		/*
 		 * Apply process to process speculation vulnerability
--
2.44.0.278.ge034bb2e1d-goog