Re: [PATCH -mm] mm: more likely reclaim MADV_SEQUENTIAL mappings

From: Nick Piggin
Date: Mon Jul 21 2008 - 01:49:20 EST


On Monday 21 July 2008 11:48, Andrew Morton wrote:
> On Mon, 21 Jul 2008 09:09:26 +0900 "KOSAKI Motohiro" <kosaki.motohiro@xxxxxxxxxxxxxx> wrote:
> > Hi Johannes,
> >
> > > File pages accessed only once through sequential-read mappings between
> > > fault and scan time are perfect candidates for reclaim.
> > >
> > > This patch makes page_referenced() ignore these singular references,
> > > so the pages stay on the inactive list, where they will likely fall
> > > victim to the next reclaim phase.
> > >
> > > Already activated pages are still treated normally. If they were
> > > accessed multiple times and therefore promoted to the active list, we
> > > probably want to keep them.
> > >
> > > Benchmarks show that sequentially reading big (relative to the
> > > system's memory) MADV_SEQUENTIAL mappings causes much less kernel
> > > activity, especially less LRU moving-around, because we never
> > > activate read-once pages in the first place just to demote them
> > > again.
> > >
> > > And leaving these perfect reclaim candidates on the inactive list makes
> > > it more likely for the real working set to survive the next reclaim
> > > scan.
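
A minimal sketch of the check being described, assuming it sits in
page_referenced_one() in mm/rmap.c and keys off the VM_SEQ_READ flag
that madvise(MADV_SEQUENTIAL) sets (an illustration, not the literal
patch):

	/*
	 * Sketch: once the pte for this mapping has tested young,
	 * don't count a reference that came through a sequential-
	 * read mapping.  Read-once pages then stay on the inactive
	 * list; pages already on the active list are not touched.
	 */
	if (likely(!(vma->vm_flags & VM_SEQ_READ)))
		referenced++;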
> >
> > Looks good to me.
> > Actually, I made a similar patch half a year ago.
> >
> > In my experience:
> > - page_referenced_one() is a performance-critical spot, so you
> >   should run some benchmarks.
> > - that patch improved mmapped-copy performance by about 5%.
> >   (Of course, you should retest in current -mm; the MM code has
> >   changed widely.)
> >
> > So, I'm looking forward to your test results :)
>
> The change seems logical and I queued it for 2.6.28.
>
> But yes, testing for what-does-this-improve is good and useful, but so
> is testing for what-does-this-worsen. How do we do that in this case?

It's OK, but as always I worry about adding "cool new bells and
whistles" to make already-bad code work a bit faster: every extra
check slows the common path down. A mispredicted branch, btw, is
about as costly as an atomic operation.

It is already bad because if you are doing a big streaming copy
which you know is going to blow the cache and not be used again,
then you should be unmapping behind you as you go. If you do not
do this, then page reclaim has to do the rmap walk, the page table
walk, and then the (unbatched, likely IPI-delivered) TLB shootdown
for every page. Not to mention churning through the LRU and
chucking other things out just to find these pages.

So what you should actually do is use direct IO, or unmap the pages
and use fadvise to throw out the cache as you go.
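
As a concrete illustration of that pattern (a sketch; the window
size and error handling are arbitrary), this streams through a file
in page-aligned chunks and drops each chunk behind itself, so
reclaim never has to hunt the pages down:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define WINDOW (8UL << 20)	/* 8MB, a multiple of the page size */

int main(int argc, char **argv)
{
	struct stat st;
	off_t off;
	int fd;

	if (argc < 2 || (fd = open(argv[1], O_RDONLY)) < 0 ||
	    fstat(fd, &st) < 0) {
		perror("open/fstat");
		return 1;
	}

	for (off = 0; off < st.st_size; off += WINDOW) {
		size_t len = st.st_size - off < (off_t)WINDOW ?
				(size_t)(st.st_size - off) : WINDOW;
		char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
		volatile char sum = 0;
		size_t i;

		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}
		for (i = 0; i < len; i++)	/* the streaming read */
			sum += p[i];

		/* done with this window: unmap it and drop the cache */
		munmap(p, len);
		posix_fadvise(fd, off, len, POSIX_FADV_DONTNEED);
	}
	close(fd);
	return 0;
}

With direct IO (O_DIRECT) the page cache is bypassed entirely, which
avoids the problem instead of cleaning up after it.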

Adding code and branches to speed up an already terribly suboptimal
microbenchmark by 5% is not very good.
