Swapping (Was: Werner's latest patch and my News server)

Dr. Werner Fink (werner@suse.de)
Wed, 23 Jul 1997 16:02:07 +0200


>
> I just installed 2.0.31-pre2 plus Werner's July-21 version of his patches
> on our News server.
>
> The result is rather ugly:
>
>              total       used       free     shared    buffers     cached
> Mem:         95804      94052       1752      46416       4456      18180
> -/+ buffers:            71416      24388
> Swap:       104416      21400      83016
>
> As you can see, the number of buffers is much too low. The system isn't
> caching enough directory blocks and averages about two Netnews articles per
> second. Not nearly enough, I'm afraid. :-(

Hmmm ... the current code (2.0/2.1) has some design problems. On one hand you
can use your own try_to_free_page() with your state fix ... and you run
into deep trouble at a load higher than 1 due to exponentially growing swap I/O.
Any system running this becomes unusable if the load goes above 1 with
a memory-consuming job.

On the other hand, direct swapping for buffer/cache is complicated. If one does
this (e.g. applying David Miller's buffer/swapping patch _without_ the state
fix) there _must_ be a limit in try_to_free_page() that stops the
intensity/depth of freeing a page (e.g. setting `stop' to 2 for priority ==
GFP_BUFFER or something similar). The reason is very simple: process/shared
pages should have a small precedence over buffer/cache pages, or in some
cases the swap I/O grows beyond any (software) limit.
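
To make the idea of such a limit concrete, here is a minimal user-space
sketch in C. It is not the kernel's try_to_free_page(); it only models a
loop that walks from a gentle scan priority down to an aggressive one and
stops early for buffer requests. The names free_one_page, req_class,
reclaim_fn, toy_ctx and toy_stage, and the starting priority of 6, are
assumptions made for illustration. Only `stop' and GFP_BUFFER come from the
text above.

```c
#include <stdbool.h>

/* Hypothetical request classes; REQ_BUFFER stands in for GFP_BUFFER. */
enum req_class { REQ_NORMAL, REQ_BUFFER };

/* Hypothetical reclaim stage: returns true if it freed a page at this
 * scan priority (a lower value means a more aggressive scan). */
typedef bool (*reclaim_fn)(int scan_prio, void *ctx);

/*
 * Walk the stages from the gentlest scan priority (assumed to be 6) toward
 * the most aggressive one (0), but never go below `stop`. Requests made on
 * behalf of the buffer cache get stop = 2, so they give up before the
 * deepest, most swap-intensive passes are tried.
 */
static bool free_one_page(enum req_class cls,
                          const reclaim_fn *stages, int nstages, void *ctx)
{
    int stop = (cls == REQ_BUFFER) ? 2 : 0;

    for (int prio = 6; prio >= stop; prio--) {
        for (int s = 0; s < nstages; s++) {
            if (stages[s](prio, ctx))
                return true;
        }
    }
    return false;
}

/* Toy stage for experiments: frees a page only once the scan is at least
 * as aggressive as succeeds_at, and counts how often it was called. */
struct toy_ctx { int succeeds_at; int calls; };

static bool toy_stage(int scan_prio, void *ctx)
{
    struct toy_ctx *t = ctx;
    t->calls++;
    return scan_prio <= t->succeeds_at;
}
```

With a toy stage that only succeeds at priority 1, a REQ_BUFFER request
gives up after priority 2 without freeing anything, while a REQ_NORMAL
request keeps going and succeeds. That is the small precedence for
process/shared pages described above.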

Note: If one combines your state fix in try_to_free_page() with David's
buffer/swapping patch, any system becomes unusable simply by running
and using it. This is the conclusion from the early patch attempts
I've done.

kswapd: What I'm missing is a fair and global usage score system or something
similar for _virtual_ pages or clusters of virtual pages. The current
age-on-demand system only ages the physical pages that are reached before
try_to_free_page() frees a page. This is really a fast solution for a mostly
idle system ... but unfair, slow, and risky under high stress. With a usage
score system, or in other words, a system which counts page usages between
one or more kswapd wakeups and a try_to_free_page() which directly swaps out
the pages with the lowest usage count in comparison, the swap I/O would be
minimised to the necessary amount. An order of importance for cache/buffer,
(dentry,) shared, and process pages is clearly useful too.
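
A rough user-space sketch of such a usage score scheme, in C, might look
like the following. Nothing here exists in the kernel: struct vpage,
score_tick and pick_victim are hypothetical names, the halving of the old
score on each wakeup is just one assumed way to cover "one or more"
wakeups, and using the page class only as a tie-breaker is likewise an
assumption.

```c
#include <stddef.h>

/* Page classes in the order listed above; treating a lower value as
 * "swap out first" is an assumption of this sketch. */
enum page_class { PC_CACHE_BUFFER, PC_DENTRY, PC_SHARED, PC_PROCESS };

struct vpage {
    enum page_class cls;
    unsigned usage;   /* references counted since the last wakeup */
    unsigned score;   /* decayed usage history over earlier wakeups */
};

/* Run on each (hypothetical) kswapd wakeup: fold the interval's usage
 * count into a decaying score and start a new counting interval. */
static void score_tick(struct vpage *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        pages[i].score = pages[i].score / 2 + pages[i].usage;
        pages[i].usage = 0;
    }
}

/* Choose the page to swap out: the lowest score wins, and ties go to the
 * less important class. Returns n if there are no pages. */
static size_t pick_victim(const struct vpage *pages, size_t n)
{
    size_t best = n;

    for (size_t i = 0; i < n; i++) {
        if (best == n ||
            pages[i].score < pages[best].score ||
            (pages[i].score == pages[best].score &&
             pages[i].cls < pages[best].cls))
            best = i;
    }
    return best;
}
```

The point of the sketch is that the victim is chosen by a global comparison
over all scored pages, rather than by whichever physical page the scan
happens to reach first.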

>
> What to do? I'm playing around with various tuning parameters in
> /proc/sys/vm/*, but no luck so far. Any help appreciated.
>
> NB: Of the RAM in this machine, admittedly, half (RSS output of ps) is
> taken up by squid and half of the rest is taken by INN. On a whim, I
> SIGSTOPped the cache server. Five minutes later, squid still had a constant
> RSS of 42 MBytes and the first annoyed users were calling in, so I had to
> continue it. :-/ vmstat, too, shows almost no swap activity.
>

... it's trivial, but if two processes hold more than half of the physical
memory most of the time and the system's performance needs RAM ...
it's a `Binsenweisheit' (a truism).

Werner