[tip:x86/urgent] x86: mm: Read cr2 before prefetching the mmap_lock

From: tip-bot for Ingo Molnar
Date: Tue Jun 16 2009 - 04:38:07 EST


Commit-ID: 5dfaf90f8052327c92fbe3c470a2e6634be296c0
Gitweb: http://git.kernel.org/tip/5dfaf90f8052327c92fbe3c470a2e6634be296c0
Author: Ingo Molnar <mingo@xxxxxxx>
AuthorDate: Tue, 16 Jun 2009 10:23:32 +0200
Committer: Ingo Molnar <mingo@xxxxxxx>
CommitDate: Tue, 16 Jun 2009 10:23:32 +0200

x86: mm: Read cr2 before prefetching the mmap_lock

Prefetch instructions can generate spurious faults on certain
models of older CPUs. The faults themselves cannot be prevented
and they can occur pretty much anywhere - so we handle them by
detecting certain patterns and ignoring the fault.
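
As an illustration only (this is not part of the commit), the kind of
pattern detection described above can be sketched in user-space C: look at
the bytes at the faulting instruction pointer and check whether they decode
as a prefetch. The names looks_like_prefetch() and is_legacy_prefix() are
hypothetical, and the prefix handling is a simplifying assumption; a real
decoder has to cope with more cases and must read the instruction bytes
safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true for the legacy prefix bytes this sketch
 * chooses to skip. Assumption: REX and other prefixes are ignored.
 */
bool is_legacy_prefix(unsigned char b)
{
	switch (b) {
	case 0x26: case 0x2E: case 0x36: case 0x3E:	/* segment overrides */
	case 0x64: case 0x65:				/* FS/GS overrides */
	case 0x66: case 0x67:				/* operand/address size */
	case 0xF0: case 0xF2: case 0xF3:		/* lock/rep */
		return true;
	default:
		return false;
	}
}

/*
 * Hypothetical helper: decide whether the instruction bytes look like
 * an x86 prefetch. Only the two opcode bytes are checked; the ModRM
 * byte is not inspected.
 */
bool looks_like_prefetch(const unsigned char *insn, size_t len)
{
	size_t i = 0;

	assert(insn != NULL);

	/* Skip any leading legacy prefixes. */
	while (i < len && is_legacy_prefix(insn[i]))
		i++;

	/* Need the 0x0F escape byte plus the second opcode byte. */
	if (i + 1 >= len || insn[i] != 0x0F)
		return false;

	/* 0x0F 0x0D: 3DNow! PREFETCH/PREFETCHW; 0x0F 0x18: SSE PREFETCHh. */
	return insn[i + 1] == 0x0D || insn[i + 1] == 0x18;
}
```

A fault handler built on this idea would call the check on the bytes at the
faulting IP and, if it matches, return without treating the fault as real.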

There is one small path of code where we must not take faults
though: the #PF handler execution leading up to the reading
of CR2 (the faulting address). If we take a fault there, we
overwrite the CR2 value with the address from the prefetching
instruction and possibly mishandle user-space or kernel-space
page faults.
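
For illustration only (not part of the commit), the hazard can be modelled
with a toy user-space simulation: a single "CR2" slot that every fault
overwrites, and two handler orderings. All names here (struct toy_cpu,
toy_take_fault() and the two toy_handler_*() functions) are hypothetical,
and the model assumes the nested spurious fault does nothing except latch a
new address.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: one CR2 register, overwritten by every page fault. */
struct toy_cpu {
	unsigned long cr2;
};

/* "Hardware" side: taking a fault latches the faulting address. */
void toy_take_fault(struct toy_cpu *cpu, unsigned long addr)
{
	assert(cpu != NULL);
	cpu->cr2 = addr;
}

/*
 * Prefetch first, then read CR2. If the prefetch faults spuriously,
 * the nested fault replaces CR2 before the handler reads it.
 */
unsigned long toy_handler_prefetch_first(struct toy_cpu *cpu,
					 unsigned long lock_addr,
					 int prefetch_faults)
{
	if (prefetch_faults)
		toy_take_fault(cpu, lock_addr);	/* nested spurious fault */

	return cpu->cr2;
}

/*
 * Read CR2 first, then prefetch. A later spurious fault still
 * overwrites CR2, but the handler already holds its own copy.
 */
unsigned long toy_handler_cr2_first(struct toy_cpu *cpu,
				    unsigned long lock_addr,
				    int prefetch_faults)
{
	unsigned long address = cpu->cr2;

	if (prefetch_faults)
		toy_take_fault(cpu, lock_addr);	/* nested spurious fault */

	return address;
}
```

In this model the prefetch-first handler returns the lock address instead
of the original faulting address whenever the prefetch faults, while the
CR2-first handler always returns the original address.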

It turns out that in current upstream we do exactly that:

	prefetchw(&mm->mmap_sem);

	/* Get the faulting address: */
	address = read_cr2();

This is not good.

So turn around the order: first read the cr2 then prefetch
the lock address. Reading cr2 is plenty fast (2 cycles) so
delaying the prefetch by this amount shouldn't be a big issue
performance-wise.

[ And this might explain a mystery fault.c warning that sometimes
occurs on an old AMD/Sempron based test-system I have -
which does have such prefetch problems. ]

Cc: Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: Nick Piggin <npiggin@xxxxxxx>
Cc: Pekka Enberg <penberg@xxxxxxxxxxxxxx>
Cc: Vegard Nossum <vegard.nossum@xxxxxxxxx>
Cc: Jeremy Fitzhardinge <jeremy@xxxxxxxx>
Cc: Hugh Dickins <hugh.dickins@xxxxxxxxxxxxx>
LKML-Reference: <20090616030522.GA22162@Krystal>
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>


---
arch/x86/mm/fault.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index c6acc63..0482fa6 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -951,11 +951,11 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
 	tsk = current;
 	mm = tsk->mm;

-	prefetchw(&mm->mmap_sem);
-
 	/* Get the faulting address: */
 	address = read_cr2();

+	prefetchw(&mm->mmap_sem);
+
 	if (unlikely(kmmio_fault(regs, address)))
 		return;
