Re: FWD: [PATCH v2] vmscan: limit concurrent reclaimers in shrink_zone

From: Rik van Riel
Date: Fri Dec 18 2009 - 12:44:23 EST


On 12/18/2009 11:23 AM, Andrea Arcangeli wrote:
> On Thu, Dec 17, 2009 at 09:05:23PM +0000, Hugh Dickins wrote:
>
>> An rwlock there has been proposed on several occasions, but
>> we have resisted because that change benefits this case while
>> performing worse in more common cases (I believe: no numbers
>> to back that up).
>
> I think an rwlock for the anon_vma is a must. Any extra overhead
> on the uncontended fast path is practically zero, and on large SMP
> systems it lets rmap walks over long chains run in parallel, so it
> is very much worth it: the downside is practically zero and the
> upside may be measurable in certain corner cases. I don't think it
> will be enough by itself, but I definitely like it.

I agree, changing the anon_vma lock to an rwlock should
work a lot better than what we have today. The tradeoff
is a tiny slowdown in medium-contention cases, in exchange
for avoiding catastrophic slowdowns in the pathological cases.

With Nick Piggin's fair rwlocks, there should be no issue
at all.
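
For illustration only, a minimal sketch of what such a conversion
could look like (the struct layout and list field names follow the
current anon_vma code, the function names are made up for the
example, and this is not an actual patch):

#include <linux/spinlock.h>
#include <linux/list.h>
#include <linux/mm_types.h>

struct anon_vma {
	rwlock_t lock;			/* was: spinlock_t lock */
	struct list_head head;		/* vmas sharing this anon_vma */
};

/* Readers: rmap walkers such as page_referenced_anon() */
static void rmap_walk_example(struct anon_vma *anon_vma)
{
	read_lock(&anon_vma->lock);	/* many walkers can hold this at once */
	/* ... walk anon_vma->head ... */
	read_unlock(&anon_vma->lock);
}

/* Writers: linking/unlinking a vma stays exclusive, as today */
static void vma_link_example(struct vm_area_struct *vma,
			     struct anon_vma *anon_vma)
{
	write_lock(&anon_vma->lock);
	list_add_tail(&vma->anon_vma_node, &anon_vma->head);
	write_unlock(&anon_vma->lock);
}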

> Rik suggested to me that a newly allocated COWed page should use
> its own anon_vma. Conceptually Rik's idea is a fine one, but the
> complication then is how to chain the same vma into multiple
> anon_vmas (in practice insertion/removal will be slower, and more
> metadata will be needed for the additional anon_vmas and for vmas
> queued into more than one anon_vma). But this will only help if
> the mapcount of the page is 1; if the mapcount is 10000, no change
> to the anon_vma or prio_tree will solve this,

It's even more complex than this for anonymous pages.

Anonymous pages get COW copied in child (and parent)
processes, potentially resulting in one page, at each
offset into the anon_vma, for every process attached
to the anon_vma.

As a result, with 10000 child processes, page_referenced
can end up searching through 10000 VMAs even for pages
with a mapcount of 1!
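
To make that cost concrete, here is a paraphrased and abbreviated
version of the shape of the anon rmap walk in mm/rmap.c (some
arguments are dropped, and vma_address()/page_referenced_one() are
local to that file): the loop visits every vma on the anon_vma list,
even when the page is actually mapped in only one of them, so the
work scales with the number of attached vmas, not with the page's
mapcount.

#include <linux/mm.h>
#include <linux/rmap.h>

/* Paraphrased from page_referenced_anon(); arguments and details trimmed. */
static int page_referenced_anon_sketch(struct page *page)
{
	unsigned int mapcount = page_mapcount(page);
	struct anon_vma *anon_vma;
	struct vm_area_struct *vma;
	int referenced = 0;

	anon_vma = page_lock_anon_vma(page);
	if (!anon_vma)
		return 0;

	/* One iteration per attached vma: 10000 forks => 10000 iterations. */
	list_for_each_entry(vma, &anon_vma->head, anon_vma_node) {
		unsigned long address = vma_address(page, vma);
		if (address == -EFAULT)
			continue;	/* not mapped here, but we still paid to scan it */
		referenced += page_referenced_one(page, vma, address, &mapcount);
		if (!mapcount)
			break;
	}

	page_unlock_anon_vma(anon_vma);
	return referenced;
}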

--
All rights reversed.