Re: REGRESSION: Performance regressions from switchinganon_vma->lock to mutex

From: Peter Zijlstra
Date: Fri Jun 17 2011 - 14:29:02 EST


On Fri, 2011-06-17 at 20:18 +0200, Peter Zijlstra wrote:
> On Fri, 2011-06-17 at 11:01 -0700, Linus Torvalds wrote:
>
> > So I do think that "page_referenced_anon()" should do a trylock, and
> > return "referenced" if the trylock fails. Comments?
>
> The only problem I can immediately see with that is when a single
> process' anon memory is dominant, then such an allocation will never
> succeed in freeing these pages because the one lock will pin pretty much
> all anon. Then again, there's always a few file pages to drop.
>
> That said, its rather unlikely, and iirc people were working on removing
> direct reclaim, or at least rely less on it.
>
>

something like so I guess, completely untested etc..

Also, there's a page_lock_anon_vma() user in split_huge_page(), which is
used in mm/swap_state.c:add_to_swap(), which is also in the reclaim
path, not quite sure what to do there.

Aside from the THP thing there's a user in memory-failure.c, which looks
to be broken as it is because its calling things under tasklist_lock
which isn't preemptible, but it looks like we can simply swap the
tasklist_lock vs page_lock_anon_vma.

---
kernel/Makefile | 1 +
mm/rmap.c | 39 +++++++++++++++++++++++++++++++++++++--
2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/kernel/Makefile b/kernel/Makefile
index 2d64cfc..f6d05de 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -101,6 +101,7 @@ obj-$(CONFIG_RING_BUFFER) += trace/
obj-$(CONFIG_TRACEPOINTS) += trace/
obj-$(CONFIG_SMP) += sched_cpupri.o
obj-$(CONFIG_IRQ_WORK) += irq_work.o
+obj-m += test.o

obj-$(CONFIG_PERF_EVENTS) += events/

diff --git a/mm/rmap.c b/mm/rmap.c
index 0eb463e..40cd399 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -400,6 +400,41 @@ out:
return anon_vma;
}

+struct anon_vma *page_trylock_anon_vma(struct page *page)
+{
+ struct anon_vma *anon_vma = NULL;
+ struct anon_vma *root_anon_vma;
+ unsigned long anon_mapping;
+
+ rcu_read_lock();
+ anon_mapping = (unsigned long) ACCESS_ONCE(page->mapping);
+ if ((anon_mapping & PAGE_MAPPING_FLAGS) != PAGE_MAPPING_ANON)
+ goto out;
+ if (!page_mapped(page))
+ goto out;
+
+ anon_vma = (struct anon_vma *) (anon_mapping - PAGE_MAPPING_ANON);
+ root_anon_vma = ACCESS_ONCE(anon_vma->root);
+ if (!mutex_trylock(&root_anon_vma->mutex)) {
+ anon_vma = NULL;
+ goto out;
+ }
+
+ /*
+ * If the page is still mapped, then this anon_vma is still
+ * its anon_vma, and holding the mutex ensures that it will
+ * not go away, see anon_vma_free().
+ */
+ if (!page_mapped(page)) {
+ mutex_unlock(&root_anon_vma->mutex);
+ anon_vma = NULL;
+ }
+
+out:
+ rcu_read_unlock();
+ return anon_vma;
+}
+
/*
* Similar to page_get_anon_vma() except it locks the anon_vma.
*
@@ -694,7 +729,7 @@ static int page_referenced_anon(struct page *page,
struct anon_vma_chain *avc;
int referenced = 0;

- anon_vma = page_lock_anon_vma(page);
+ anon_vma = page_trylock_anon_vma(page);
if (!anon_vma)
return referenced;

@@ -1396,7 +1431,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
struct anon_vma_chain *avc;
int ret = SWAP_AGAIN;

- anon_vma = page_lock_anon_vma(page);
+ anon_vma = page_trylock_anon_vma(page);
if (!anon_vma)
return ret;



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/