Re: [PATCH] mm: implement WasActive page flag (for improving cleancache)

From: Andrew Morton
Date: Thu Jan 26 2012 - 20:15:55 EST


On Thu, 26 Jan 2012 16:56:34 -0800 (PST)
Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> wrote:

> > > I'll find the place to add the call to ClearPageWasActive() for v2.
> >
> > AFAICT this patch consumes our second-last page flag, or close to it.
> > We'll all be breaking out in hysterics when the final one is gone.
>
> I'd be OK with only using this on 64-bit systems, though there
> are ARM folks playing with zcache that might disagree.

64-bit only is pretty lame and will reduce the appeal of cleancache and
will increase the maintenance burden by causing different behavior on
different CPU types. Most Linux machines are 32-bit! (My cheerily
unsubstantiated assertion of the day).

> Am I
> correct in assuming that your "second-last page flag" concern
> applies only to 32-bit systems?

Sort-of. Usually a flag which is 64-bit-only causes the above issues.

> > This does appear to be a make or break thing for cleancache - if we
> > can't fix https://lkml.org/lkml/2012/1/22/61 then cleancache is pretty
> > much a dead duck.
>
> Hmmm... is that URL correct? If so, there is some subtlety in
> that thread that I am missing as I don't understand the relationship
> to cleancache at all?

err, sorry, I meant your https://lkml.org/lkml/2011/8/17/351.

> > And I'm afraid that neither I nor other MM developers are likely to
> > help you with "fix cleancache via other means" because we weren't
> > provided with any description of what the problem is within cleancache,
> > nor how it will be fixed. All we are given is the assertion "cleancache
> > needs this".
>
> The patch comment says:
>
> The patch resolves issues reported with cleancache which occur
> especially during streaming workloads on older processors,
> see https://lkml.org/lkml/2011/8/17/351
>
> I can see that may not be sufficient, so let me expand on it.
>
> First, just as page replacement worked prior to the active/inactive
> redesign at 2.6.27, cleancache works without the WasActive page flag.
> However, just as pre-2.6.27 page replacement had problems on
> streaming workloads, so does cleancache. The WasActive page flag
> is an attempt to pass the same active/inactive info gathered by
> the post-2.6.27 kernel into cleancache, with the same objectives and
> presumably the same result: improving the "quality" of pages preserved
> in memory thus reducing refaults.
>
> Is that clearer? If so, I'll do better on the description at v2.

It really didn't tell us anything, apart from referring to vague
"problems on streaming workloads", which forces everyone to go off and
do an hour or two's kernel archeology, probably in the area of
readahead.

Just describe the problem! Why is it slow? Where's the time being
spent? How does the proposed fix (which we haven't actually seen)
address the problem? If you inform us of these things then perhaps
someone will have a useful suggestion. And as a side-effect, we'll
understand cleancache better.