Re: why swap at all?

From: Nick Piggin
Date: Wed May 26 2004 - 06:06:23 EST


Buddy Lumpkin wrote:
Hi Buddy,
Even for systems that don't *need* the extra memory space, swap can
actually provide performance improvements by allowing unused memory
to be replaced with often-used memory.


For example, I have 57MB swapped right now. It allows me to instantly
grep the kernel tree. If I turned swap off, each grep would probably
take 30 seconds.


Your analogy is flawed. There are many reasons why this doesn't work in the
real world.


It is not an analogy.

[snip]

I understand the basics of how Linux's memory management works.

Your grep analogy incorrectly assumes that you have a bunch of vacant memory
just waiting to store those filesystem pages, but that simply isn't the
case. Rather 57MB of anonymous memory was evicted to make room for 57MB of
anonymous or file system backed pages. Unless you have freed anonymous
memory on the system by closing applications, your physical memory pages are
still mostly occupied.


Yes, the 57MB of anonymous memory *was* evicted to make room for 57MB
of file system backed pages that grep pulled in, presumably.

I tend to use grep rather often. I'm very glad that crud from mozilla,
XFree86, nautilus, gnome-settin, x-session-ma, etc has been paged out.
It allows me to grep the kernel source instantly.

This means your grep is only going to run faster if you already read those
files recently and they are already in the pagecache. You still have the
burden of pushing pages that have not been used recently out of ram before
you can read in the new ones. And as long as you are performing a sufficient
amount of file system I/O, this is guaranteed to happen.


What would you have it do? Push out pages that have been recently used?

One way to minimize the problem where heavy filesystem I/O flushes important
pages from memory, such as pages from shared libraries and executables, only
for them to fault back in as soon as their processes become runnable, is to
implement something similar to what Sun implemented in Solaris 8, called the
cyclical page cache. The idea is that the pagecache pages against
itself and is actually considered free memory from an anonymous memory
perspective. The pagecache is free to grow all it wants, but since it is
counted as free memory, anonymous memory allocations will cause it to shrink.


"the pagecache pages against itself", what does that mean?

As these pages are evicted from the pagecache, they are placed on the
opposite side of the cachelist (linked list that stores pages that have a
vnode+offset already) than the side where pages are being overwritten. This
way frequently re-accessed pages that were placed on the cache list and were
eligible to be reclaimed, are found when the next minor fault occurs for
that vnode+offset and moved back to the opposite side of the list so that
they are not evicted.


I failed to grasp the mechanics of the cachelist and its opposite sides.
And why does one side have pages being overwritten? Sounds strange. But
I don't know Solaris.

Linux has an approximately-LRU ordered list. Newly accessed pages go in
the top and come out the bottom where they are reclaimed (or in the front
and out the back).
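
To make the "in the top, out the bottom" behaviour concrete, here is a
minimal sketch in Python. It is a strict LRU, whereas the kernel's list is
only approximately LRU ordered, and the class name, the capacity and the
page names are all made up for illustration.

```python
from collections import OrderedDict


class LruPageList:
    """Toy LRU list: accessed pages go to the top, reclaim takes the bottom."""

    def __init__(self, capacity):
        self.capacity = capacity      # hypothetical number of page frames
        self.pages = OrderedDict()    # first item = bottom, last item = top

    def access(self, page):
        """Touch a page; return the page evicted to make room, or None."""
        evicted = None
        if page in self.pages:
            self.pages.move_to_end(page)   # re-accessed: back to the top
        else:
            if len(self.pages) >= self.capacity:
                # List is full: reclaim the least recently used page.
                evicted, _ = self.pages.popitem(last=False)
            self.pages[page] = True
        return evicted


lru = LruPageList(capacity=3)
for name in ["mozilla", "nautilus", "kernel/a.c"]:
    lru.access(name)
lru.access("mozilla")                 # mozilla is used again, moves to the top
evicted = lru.access("kernel/b.c")    # full list, so the bottom page goes
print(evicted)                        # nautilus
```

Pages that keep getting touched never reach the bottom, which is why only
the long-idle ones end up being pushed out.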

Since the cache list is counted as free memory, there is no way to wake up
the LRU mechanism to scan physical memory until 1/64 of physical memory is
consumed by anonymous memory.


That assumes that file backed cache is worth zero compared to
anonymous memory, which is not the case.

In Linux, we actually do the replacement in terms of mapped and
unmapped pages and bias replacement toward unmapped pages. We
will still evict long term inactive mapped pages though, which is
a good thing.
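
As a rough illustration of that bias, the Python sketch below picks a
reclaim victim by idle time but gives mapped pages a head start. It is not
the kernel's actual algorithm: the scoring rule, the mapped_bonus knob and
the page names are assumptions invented for this example.

```python
def pick_victim(pages, mapped_bonus=10):
    """Return the name of the page to reclaim (highest score wins).

    pages: list of (name, is_mapped, idle_ticks) tuples.
    mapped_bonus: hypothetical knob; a mapped page must be idle this many
    ticks longer than an unmapped page before it is chosen instead.
    """
    def score(page):
        _name, is_mapped, idle_ticks = page
        return idle_ticks - (mapped_bonus if is_mapped else 0)

    return max(pages, key=score)[0]


# A mapped page that is only slightly idler than an unmapped one is spared.
first = pick_victim([
    ("libc text (mapped)", True, 12),
    ("kernel source file (unmapped)", False, 5),
])

# A mapped page that has been idle for a long time is still evicted.
second = pick_victim([
    ("mozilla heap (mapped)", True, 500),
    ("kernel source file (unmapped)", False, 5),
])

print(first)
print(second)
```

The first call picks the unmapped file page and the second picks the
long-idle mapped page, which is the behaviour described above.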