Re: [PATCH 1/9] readahead: introduce PG_readahead

From: Fengguang Wu
Date: Sat May 19 2007 - 23:09:20 EST


On Sat, May 19, 2007 at 08:25:04AM -0700, Andrew Morton wrote:
> On Sat, 19 May 2007 20:30:31 +0800 Fengguang Wu <wfg@xxxxxxxxxxxxxxxx> wrote:
>
> > On Fri, May 18, 2007 at 11:28:24PM -0700, Andrew Morton wrote:
> > > On Thu, 17 May 2007 06:47:53 +0800 Fengguang Wu <wfg@xxxxxxxxxxxxxxxx> wrote:
> > >
> > > > Introduce a new page flag: PG_readahead.
> > >
> > > Is there any way in which we can avoid adding a new page flag?
> > >
> > > We have the advantage that if the kernel very occasionally gets the wrong
> > > result for PageReadahead(page), nothing particularly bad will happen, so we
> > > can do racy things.
> > >
> > > >From a quick peek, it appears that PG_readahead is only ever set against
> > > non-uptodate pages. If true we could perhaps exploit that: say,
> > > PageReadahead(page) == PG_referenced && !PG_uptodate?
> >
> > PG_uptodate will flip to 1 before the reader touches the page :(
>
> OK.
>
> > However, it may be possible to share the same bit with PG_reclaim or PG_booked.
> > Which one would be preferred?
>
> I'd like PG_booked to go away too - I don't think we've put that under the
> microscope yet. If it remains then its scope will be "defined by the
> filesystem", so readahead shouldn't use it. PG_reclaim belongs to core VFS
> so that's much better.
>
> Let's not do anything ugly, slow or silly in there, but please do take a
> look, see if there is an opportunity here.

The reuse code would look like the attached one.
It still needs more testing, and would fail if Christoph reuses
PG_reclaim in higher order pagecache in the future.

Fengguang Wu
---

Subject: mm: share PG_readahead and PG_reclaim

Share the same page flag bit for PG_readahead and PG_reclaim.

One is used only on file reads, another is only for emergency writes.
One is used mostly for fresh/young pages, another is for old pages.

Combinations of possible interactions are:

a) clear PG_reclaim => implicit clear of PG_readahead
it will delay an asynchronous readahead into a synchronous one
it actually does _good_ for readahead:
the pages will be reclaimed soon, it's readahead thrashing!
in this case, synchronous readahead makes more sense.

b) clear PG_readahead => implicit clear of PG_reclaim
one(and only one) page will not be reclaimed in time
it can be avoided by checking PageWriteback(page) in readahead first

c) set PG_reclaim => implicit set of PG_readahead
will confuse readahead and make it restart the size rampup process
it's a trivial problem, and can mostly be avoided by checking
PageWriteback(page) first in readahead

d) set PG_readahead => implicit set of PG_reclaim
PG_readahead will never be set on already cached pages.
PG_reclaim will always be cleared on dirtying a page.
so not a problem.

In summary,
a) we get better behavior
b,d) possible interactions can be avoided
c) racy condition exists that might affect readahead, but the chance
is _really_ low, and the hurt on readahead is trivial.

Compound pages also use PG_reclaim, but for now they do not interact with
reclaim/readahead code.

Signed-off-by: Fengguang Wu <wfg@xxxxxxxxxxxxxxxx>
---
include/linux/page-flags.h | 4 +++-
mm/page-writeback.c | 1 +
mm/readahead.c | 4 ++++
3 files changed, 8 insertions(+), 1 deletion(-)

--- linux-2.6.22-rc1-mm1.orig/include/linux/page-flags.h
+++ linux-2.6.22-rc1-mm1/include/linux/page-flags.h
@@ -92,7 +92,9 @@
#define PG_lazyfree 20 /* MADV_FREE potential throwaway */

#define PG_booked 21 /* Has blocks reserved on-disk */
-#define PG_readahead 22 /* Reminder to do async read-ahead */
+
+/* PG_readahead is only used for file reads; PG_reclaim is only for writes */
+#define PG_readahead PG_reclaim /* Reminder to do async read-ahead */

/* PG_owner_priv_1 users should have descriptive aliases */
#define PG_checked PG_owner_priv_1 /* Used by some filesystems */
--- linux-2.6.22-rc1-mm1.orig/mm/page-writeback.c
+++ linux-2.6.22-rc1-mm1/mm/page-writeback.c
@@ -922,6 +922,7 @@ int clear_page_dirty_for_io(struct page

BUG_ON(!PageLocked(page));

+ ClearPageReclaim(page);
if (mapping && mapping_cap_account_dirty(mapping)) {
/*
* Yes, Virginia, this is indeed insane.
--- linux-2.6.22-rc1-mm1.orig/mm/readahead.c
+++ linux-2.6.22-rc1-mm1/mm/readahead.c
@@ -447,6 +447,10 @@ page_cache_readahead_ondemand(struct add
if (!ra->ra_pages)
return 0;

+ /* It's PG_reclaim! */
+ if (PageWriteback(page))
+ return 0;
+
if (page) {
ClearPageReadahead(page);


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/