Re: [PATCH] splitlru: BDI_CAP_SWAP_BACKED

From: Rik van Riel
Date: Mon Jun 30 2008 - 17:18:51 EST


On Mon, 30 Jun 2008 21:55:13 +0100 (BST)
Hugh Dickins <hugh@xxxxxxxxxxx> wrote:
> On Mon, 30 Jun 2008, Rik van Riel wrote:
> >
> > Tmpfs is often in the same boat as anonymous memory.
> > Used for shared memory segments, or for files that
> > are temporary and will be gone soon.
>
> Anonymous memory, and temporary files, are often soon gone,
> okay. But I don't find that generalization compelling; and
> if they're soon gone, does it matter which lru they go on?

Temporary files are often soon gone.

Anonymous memory and shmem segments tend to stick around
for longer.

> > If swap space runs out, tmpfs pages should not be
> > scanned.
>
> That point I like. But I hope they'd go to the Unevictable
> on systems with no swap at all (of course, as with mlocking,
> that can change soon after).

If we have them on the *_ANON LRUs, we will automatically
not scan them when swap space runs out. Just like we do
not scan anonymous pages when there is no swap space left.
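
To make the point concrete, here is a rough user-space sketch of the idea. It is not the vmscan code. The struct, the helper names and the free_swap_pages parameter are all made up for illustration; only the four list names follow the split LRU naming. The assumption is simply that reclaim walks every list and skips the *_ANON ones once no swap is left, so anything placed on those lists (tmpfs included) is skipped for free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the split LRU lists; not kernel code. */
enum lru_list {
	LRU_INACTIVE_ANON,
	LRU_ACTIVE_ANON,
	LRU_INACTIVE_FILE,
	LRU_ACTIVE_FILE,
	NR_LRU_LISTS
};

/* Illustrative stand-in for per-zone LRU page counts. */
struct zone_model {
	unsigned long nr_pages[NR_LRU_LISTS];
};

/* True for the lists that hold swap-backed pages (anon, shmem, tmpfs). */
static bool lru_is_anon(enum lru_list lru)
{
	return lru == LRU_INACTIVE_ANON || lru == LRU_ACTIVE_ANON;
}

/*
 * Count the pages a reclaim pass would consider. When no swap is
 * left (free_swap_pages <= 0), the *_ANON lists are skipped
 * entirely, because those pages have nowhere to go.
 */
static unsigned long pages_to_scan(const struct zone_model *zone,
				   long free_swap_pages)
{
	unsigned long total = 0;
	int lru;

	assert(zone != NULL);

	for (lru = 0; lru < NR_LRU_LISTS; lru++) {
		if (free_swap_pages <= 0 && lru_is_anon((enum lru_list)lru))
			continue;
		total += zone->nr_pages[lru];
	}
	return total;
}
```

In this model, tmpfs pages accounted under the *_ANON entries drop out of the scan as soon as free_swap_pages reaches zero, with no tmpfs-specific check anywhere in the loop.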

> > To me, this suggests they should probably continue
> > to live on the *_ANON LRUs. Worst case we make
> > tmpfs pages in files that are not mmapped (/tmp use)
> > start out on the inactive list, so they get evicted
> > first.
>
> Tweaking in/active I'll gladly leave to you! Whatever
> proves best. What's worrying me is that we have always treated
> shmem/tmpfs pages as file pages

This is a performance problem for database systems,
where the system ends up swapping out the shared
memory segment.

> (e.g. in /proc/meminfo as Cached
> not as SwapCached), up until the point that we retire them to
> swap; but in splitlru you're sending them down another path;
> then mem cgroups seem to want them as something else again.

Having the mem cgroups consistent with the global LRU
implementation would be good, indeed. That will make
balancing a bit easier.

> Your SwapBacked may indeed turn out to be the only implementable
> distinction, but it does worry me. A more useful distinction,
> my gut tells me, would be separate LRUs for page_mapped() and
> !page_mapped(), which reflects the existing swappiness notion.
>
> But that immediately hits the difficulty we have in switching LRU
> midstream, which your SwapBacked-throughout tmpfs neatly sidesteps.
>
> I'd really like to be able to try page_mapped/!page_mapped versus
> swap-backed/file-backed, but it would need some LRU-switching
> infrastructure (which might come at a prohibitive performance
> cost, since it's the batching that poses the problem).

Agreed, we understand the page_mapped/!page_mapped distinction
quite well. On the other hand, we do not understand LRU
acrobatics (moving pages between lists on mmap/munmap) and the
consequences of that...

I suspect we'll just have to tweak the swap-backed/file-backed
code until it works right. Once Andrew comes out with a new
-mm (with all the stability fixes), I will create a kernel RPM
with the latest split LRU code for Fedora 9, so we can get some
wider testing.

I have some performance tweaks in mind already, but not enough
data yet to justify them. I will continue working on that.

--
All rights reversed.