[PATCH] mm: remove unintentional voluntary preemption in get_mmap_lock_carefully

From: Mateusz Guzik
Date: Sun Aug 20 2023 - 06:47:40 EST


Should the trylock succeed (and blocking thus be avoided), the routine
still wants to assert that blocking would have been legal. However, the
method used, might_sleep(), ends up calling __cond_resched(), injecting
a voluntary preemption point while the freshly acquired lock is held.
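
For reference, might_sleep() expands to roughly the following
(simplified sketch; the exact macro bodies vary between kernel versions
and preemption configs, see include/linux/kernel.h):

 #define might_sleep() \
	do { __might_sleep(__FILE__, __LINE__); might_resched(); } while (0)

 /* with CONFIG_PREEMPT_VOLUNTARY, might_resched() boils down to __cond_resched() */

i.e. the debug assertion comes bundled with a scheduling point, and it
is the latter that shows up in the profile below.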

One could hack around it by using __might_sleep() instead of mere
might_sleep() (keeping the debug check while dropping the resched), but
since threads keep going off CPU here anyway, I figured it is better to
accommodate it.

Drop the trylock fast path and take the read lock instead; it performs
the same check prior to lock acquisition.
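
After the change the routine boils down to something like this (a
sketch of the resulting function; the unchanged tail is not visible in
the hunk below):

 static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
 {
	if (regs && !user_mode(regs)) {
		unsigned long ip = instruction_pointer(regs);
		if (!search_exception_tables(ip))
			return false;
	}
	/*
	 * down_read() underneath mmap_read_lock() runs might_sleep()
	 * before taking the lock, so the check is preserved without
	 * rescheduling while the lock is held.
	 */
	mmap_read_lock(mm);
	return true;
 }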

Found by checking off-CPU time during a kernel build (like so:
"offcputime-bpfcc -Ku"); sample backtrace:
finish_task_switch.isra.0
__schedule
__cond_resched
lock_mm_and_find_vma
do_user_addr_fault
exc_page_fault
asm_exc_page_fault
- sh (4502)
10

Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>
---
mm/memory.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index 1ec1ef3418bf..f31d5243272b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5257,12 +5257,6 @@ EXPORT_SYMBOL_GPL(handle_mm_fault);

static inline bool get_mmap_lock_carefully(struct mm_struct *mm, struct pt_regs *regs)
{
- /* Even if this succeeds, make it clear we *might* have slept */
- if (likely(mmap_read_trylock(mm))) {
- might_sleep();
- return true;
- }
-
if (regs && !user_mode(regs)) {
unsigned long ip = instruction_pointer(regs);
if (!search_exception_tables(ip))
--
2.39.2