Re: [PATCH] mm: do not rely on preempt_count in print_vma_addr

From: Yang Shi
Date: Mon Nov 06 2017 - 11:17:35 EST




On 11/6/17 5:40 AM, Michal Hocko wrote:
On Mon 06-11-17 13:12:22, Michal Hocko wrote:
On Mon 06-11-17 13:00:25, Peter Zijlstra wrote:
On Mon, Nov 06, 2017 at 11:43:54AM +0100, Michal Hocko wrote:
Yes the comment is very much accurate.

Which suggests that print_vma_addr might be problematic, right?
Shouldn't we do trylock on mmap_sem instead?

Yes that's complete rubbish. trylock will get spurious failures to print
when the lock is contended.

Yes, but I guess that it is acceptable to not print the state under
that condition.

So what do you think about this? I think this is more robust than
playing tricks with the explicit preempt count checks and less tedious
than checking to make it conditional on the context. This is on top of
Linus tree and if accepted it should replace the patch discussed here.
---
From 0de6d57cbc54ee2686d1f1e4ffcc4ed490ded8aa Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko@xxxxxxxx>
Date: Mon, 6 Nov 2017 14:31:20 +0100
Subject: [PATCH] mm: do not rely on preempt_count in print_vma_addr

The preempt count check on print_vma_addr has been added by e8bff74afbdb
("x86: fix "BUG: sleeping function called from invalid context" in
print_vma_addr()") and it relied on the elevated preempt count from
preempt_conditional_sti because preempt_count check doesn't work on
non preemptive kernels by default. The code has evolved though and
d99e1bd175f4 ("x86/entry/traps: Refactor preemption and interrupt flag
handling") has replaced preempt_conditional_sti by an explicit
preempt_disable which is noop on !PREEMPT so the check in print_vma_addr
is broken.

Fix the issue by using trylock on mmap_sem rather than chacking the

s/chacking/checking

preempt count. The allocation we are relying on has to be GFP_NOWAIT
as well. There is a chance that we won't dump the vma state if the lock
is contended or memory is short, but this is an acceptable outcome and much
less fragile than the non-working preemption check or tricks around it.

Fixes: d99e1bd175f4 ("x86/entry/traps: Refactor preemption and interrupt flag handling")
Signed-off-by: Michal Hocko <mhocko@xxxxxxxx>

Acked-by: Yang Shi <yang.s@xxxxxxxxxxxxxxx>

Regards,
Yang
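
For readers following along, the trylock-and-bail pattern described in the
commit message can be sketched in plain userspace C. This is only an
illustration under stated assumptions, not the kernel code: sketch_rwsem,
sketch_read_trylock, sketch_read_unlock and sketch_print_addr are
hypothetical names, a C11 atomic counter stands in for mmap_sem, and
malloc() stands in for __get_free_page(GFP_NOWAIT).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for mm->mmap_sem: -1 = writer holds it, >= 0 = reader count. */
struct sketch_rwsem {
	atomic_int state;
};

/* Take the lock for reading without sleeping; false means "contended". */
bool sketch_read_trylock(struct sketch_rwsem *sem)
{
	int cur = atomic_load(&sem->state);

	while (cur >= 0) {
		/* On failure, cur is refreshed with the current value and we retry. */
		if (atomic_compare_exchange_weak(&sem->state, &cur, cur + 1))
			return true;
	}
	return false;	/* a writer holds the lock: give up instead of waiting */
}

void sketch_read_unlock(struct sketch_rwsem *sem)
{
	atomic_fetch_sub(&sem->state, 1);
}

/*
 * Best-effort report: returns false, printing nothing, if the lock is
 * contended or the buffer cannot be allocated.
 */
bool sketch_print_addr(struct sketch_rwsem *sem, unsigned long ip)
{
	bool printed = false;
	char *buf;

	if (!sketch_read_trylock(sem))
		return false;

	buf = malloc(64);	/* stands in for a non-sleeping allocation */
	if (buf) {
		snprintf(buf, 64, "ip %#lx", ip);
		puts(buf);
		free(buf);
		printed = true;
	}
	sketch_read_unlock(sem);
	return printed;
}
```

Calling sketch_print_addr() on an idle lock prints the address and returns
true; with the state set to -1 (writer present) it returns false right away
and leaves the lock untouched, which is the "acceptable outcome" the
changelog talks about.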

---
 mm/memory.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index a728bed16c20..1e308ac8ca0a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4457,17 +4457,15 @@ void print_vma_addr(char *prefix, unsigned long ip)
 	struct vm_area_struct *vma;
 	/*
-	 * Do not print if we are in atomic
-	 * contexts (in exception stacks, etc.):
+	 * we might be running from an atomic context so we cannot sleep
 	 */
-	if (preempt_count())
+	if (!down_read_trylock(&mm->mmap_sem))
 		return;
-	down_read(&mm->mmap_sem);
 	vma = find_vma(mm, ip);
 	if (vma && vma->vm_file) {
 		struct file *f = vma->vm_file;
-		char *buf = (char *)__get_free_page(GFP_KERNEL);
+		char *buf = (char *)__get_free_page(GFP_NOWAIT);
 		if (buf) {
 			char *p;