Re: [PATCH] 2.3.99-pre6-3+ VM rebalancing

From: Stephen C. Tweedie (
Date: Wed Apr 26 2000 - 08:00:31 EST


On Wed, Apr 26, 2000 at 04:25:23AM -0700, David S. Miller wrote:
> > Getting the VM to respond properly in a way which doesn't freak out
> > in the mass-filescan case is non-trivial. Simple LRU over all pages
> > simply doesn't cut it.
>
> I believe this is not true at all. Clean pages will be preferred to
> toss simply because they are easier to get rid of.

As soon as you differentiate between clean and dirty pages again, you
no longer have pure LRU. We're agreeing here --- LRU on its own is not
enough; you need _some_ mechanism to give preference to evicting
clean, pure cache pages. Whether that's different queues, or the
separate swapout mechanism we have now, is a different issue --- the
one thing we cannot afford is blind LRU without any feedback on the
properties of the pages themselves.
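That feedback can be sketched as a two-pass scan in userspace C. This is illustrative only: `struct page` and `shrink_list` here are hypothetical stand-ins, not the kernel's structures, but they show how a clean-page preference layers on top of plain LRU order.

```c
#include <stddef.h>

/* Hypothetical page descriptor -- illustrative only, not the
 * kernel's struct page. */
struct page {
    int dirty;   /* page must be written back before it can be freed */
    int freed;
};

/*
 * Two-pass scan over an LRU-ordered array: the first pass frees only
 * clean pages; the second pass (reached only if the first freed too
 * few) would also have to write dirty pages back before freeing them.
 * This is the "feedback on page properties" layered on top of LRU.
 */
int shrink_list(struct page *pages, size_t n, size_t target)
{
    size_t freed = 0;
    int pass;

    for (pass = 0; pass < 2 && freed < target; pass++) {
        size_t i;
        for (i = 0; i < n && freed < target; i++) {
            struct page *p = &pages[i];
            if (p->freed)
                continue;
            if (p->dirty && pass == 0)
                continue;   /* prefer clean pages on the first pass */
            /* a real implementation would start writeback here
             * before freeing a dirty page */
            p->freed = 1;
            freed++;
        }
    }
    return (int)freed;
}
```

Given a mixed list, the clean pages go first and the dirty ones are only touched when the clean ones don't meet the target.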

> I am of the opinion that vmscan.c:swap_out() is one of our biggest
> problems, because it kills us in the case where a few processes have
> a pagecache page mapped, haven't accessed it in a long time, and
> swap_out doesn't unmap those pages in time for the LRU shrink_mmap
> code to fully toss it. This happens even though these pages are
> excellent candidates for freeing. So here is where I came to the
> conclusion that LRU needs to have the capability of tossing arbitrary
> pages from process address spaces. This is why in my experimental
> hacks I just killed swap_out() completely, and taught LRU how to
> do all of the things swap_out did. I could do this because the
> LRU scanner could go from a page to all mappings of that page, even
> for anonymous and swap pages.

Doing it isn't the problem. Doing it efficiently is, if you have
fork() and mremap() in the picture. With mremap(), you cannot assume
that the virtual address of an anonymous page is the same in all
processes which have the page mapped.

So, basically, to find all the ptes for a given page, you have to
walk every single vma in every single mm which is a fork()ed
ancestor or descendant of the mm whose address_space you indexed
the page against.

Granted, it's probably faster than the current swap_out mechanism, but
the worst case is still not much fun if you have fragmented address
spaces with a lot of vmas.

Detecting the right vma isn't hard, because the vma's vm_pgoff is
preserved over mremap(). It's the linear scan that is the danger.
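The vm_pgoff point can be shown with a minimal userspace sketch (hypothetical struct and function; the field names merely mirror the kernel's). Because vm_pgoff survives mremap(), the page's virtual address inside any given vma falls out of simple arithmetic --- it is only the walk over all the vmas that costs anything.

```c
#define PAGE_SHIFT 12

/* Hypothetical, simplified vma -- field names mirror the kernel's. */
struct vma {
    unsigned long vm_start;   /* first virtual address of the mapping */
    unsigned long vm_pgoff;   /* offset of vm_start within the
                                 address_space, in pages */
};

/*
 * Recover the virtual address of a page (identified by its index in
 * its address_space) inside one particular vma.  Since vm_pgoff is
 * preserved across mremap(), this works even when the same page sits
 * at different virtual addresses in different processes.
 */
unsigned long page_address_in_vma(unsigned long page_index,
                                  const struct vma *vma)
{
    return vma->vm_start + ((page_index - vma->vm_pgoff) << PAGE_SHIFT);
}
```

Two vmas mapping the same file range at different bases resolve the same page to different virtual addresses, which is exactly why the per-vma scan, not the address computation, is the expensive part.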



This archive was generated by hypermail 2b29 : Sun Apr 30 2000 - 21:00:11 EST