Re: [PATCH 2/3] mm: filemap: only do access activations on reads

From: Andrew Morton
Date: Mon Apr 04 2016 - 17:56:07 EST


On Mon, 04 Apr 2016 17:39:47 -0400 Rik van Riel <riel@xxxxxxxxxx> wrote:

> On Mon, 2016-04-04 at 14:22 -0700, Andrew Morton wrote:
> > On Mon, 4 Apr 2016 13:13:37 -0400 Johannes Weiner <hannes@xxxxxxxxxx
> > g> wrote:
> >
> > >
> > > Andres Freund observed that his database workload is struggling with
> > > the transaction journal creating pressure on frequently read pages.
> > >
> > > Access patterns like transaction journals frequently write the same
> > > pages over and over, but in the majority of cases those pages are
> > > never read back. There are no caching benefits to be had for those
> > > pages, so activating them and having them put pressure on pages that
> > > do benefit from caching is a bad choice.
> > Read-after-write is a pretty common pattern: temporary files for
> > example.  What are the opportunities for regressions here?
> >
> > Did you consider providing userspace with a way to hint "this file is
> > probably write-then-not-read"?
>
> I suspect the opportunity for regressions is fairly small,
> considering that temporary files usually have a very short
> life span, and will likely be read-after-written before they
> get evicted from the inactive list.

The opportunity for regressions in the current code is fairly small,
but Andres found one :( If there's any possibility at all, someone will
hit it.

One possible way to move forward is to write testcases which deliberately
hit the predicted problem, to gain an understanding of how hard it is to
hit and how bad the effects are.
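
Something along these lines might be a starting point for such a testcase.
It is only a rough userspace sketch: the function name, the /tmp path
template and the 4096-byte chunk size are all made up for illustration, and
it does not generate the competing page cache load that a real test would
need between the write and the read-back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

#define CHUNK 4096	/* assumed page size, illustration only */

/*
 * Write `chunks` full chunks to an unlinked temporary file, then read
 * them back and verify the contents.  Returns the read-back time in
 * nanoseconds, or -1 on any error.
 */
long long tmpfile_readback_ns(size_t chunks)
{
	char path[] = "/tmp/rawtestXXXXXX";	/* hypothetical template */
	char wbuf[CHUNK], rbuf[CHUNK];
	struct timespec t0, t1;
	long long ns = -1;
	size_t i;
	int fd;

	assert(chunks > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);		/* file goes away when fd is closed */

	memset(wbuf, 0xab, sizeof(wbuf));
	for (i = 0; i < chunks; i++)
		if (write(fd, wbuf, CHUNK) != CHUNK)
			goto out;

	/* A real test would apply competing page cache pressure here. */

	if (lseek(fd, 0, SEEK_SET) != 0)
		goto out;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < chunks; i++)
		if (read(fd, rbuf, CHUNK) != CHUNK ||
		    memcmp(rbuf, wbuf, CHUNK) != 0)
			goto out;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	ns = (long long)(t1.tv_sec - t0.tv_sec) * 1000000000LL +
	     (t1.tv_nsec - t0.tv_nsec);
out:
	close(fd);
	return ns;
}
```

Comparing the read-back time with and without the patch, under varying
amounts of competing load, would give some idea of how bad the effect is.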

> As for hinting, I suspect it may make sense to differentiate
> between whole page and partial page writes, where partial
> page writes use FGP_ACCESSED, and whole page writes do not,
> under the assumption that if we write a partial page, there
> may be a higher chance that other parts of the page get
> accessed again for other writes (or reads).
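
If I'm reading the suggestion right, the per-page decision would look
roughly like the following.  This is a standalone model, not kernel code:
SKETCH_PAGE_SIZE and chunk_is_partial_page() are names invented here, and
the real write path would pass or omit FGP_ACCESSED based on the result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, illustration only */

/*
 * Model of one iteration of a buffered write loop: the chunk starts at
 * file position `pos` and at most `remaining` bytes are left to write.
 * Returns true if this chunk covers only part of its page, which is the
 * case where the proposal would keep marking the page accessed.
 */
bool chunk_is_partial_page(unsigned long long pos, size_t remaining)
{
	unsigned long offset = pos & (SKETCH_PAGE_SIZE - 1);
	unsigned long bytes = SKETCH_PAGE_SIZE - offset;

	assert(remaining > 0);

	/* The chunk ends at the page boundary or when the data runs out. */
	if (bytes > remaining)
		bytes = remaining;

	return bytes < SKETCH_PAGE_SIZE;
}
```

So a page-aligned write of a full page or more would skip the activation
for that page, while an unaligned head or a short tail would not.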

hm, the FGP_foo documentation is a mess. There's some placed randomly
at pagecache_get_page() and FGP_WRITE got missed altogether.

The ext4 journal would be a decent (but not very significant) candidate
for a "this is never read from" interface. I guess the fs could
manually deactivate (or even free?) the pages.
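
For comparison, the closest thing userspace has today is to sync and then
ask for the pages to be dropped with posix_fadvise(POSIX_FADV_DONTNEED).
That is not the in-kernel interface being discussed above, just an
illustration of the write-then-never-read pattern; the function name below
is made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Write `len` bytes at the current file position, flush them to stable
 * storage, then hint that the cached pages will not be needed again.
 * Returns 0 on success, -1 on a write or sync error, or the positive
 * error number returned by posix_fadvise().
 */
int journal_write_and_drop(int fd, const char *buf, size_t len)
{
	size_t done = 0;

	assert(buf != NULL);

	while (done < len) {
		ssize_t n = write(fd, buf + done, len - done);

		if (n < 0)
			return -1;
		done += (size_t)n;
	}

	/* Dirty pages cannot be dropped, so write them back first. */
	if (fdatasync(fd) != 0)
		return -1;

	/* offset 0, len 0 means "the whole file" */
	return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

This costs a sync per call, which a journal is usually doing anyway, but it
is a much blunter tool than having the fs deactivate its own pages.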