Re: A good reason to use vfork()

Theodore Y. Ts'o (tytso@mit.edu)
Tue, 2 Nov 1999 16:18:38 -0500


From: Chuck Phillips <cdp@peakpeak.com>
Date: Tue, 2 Nov 1999 06:12:50 -0700 (MST)

Scenario 1: In the case of a server that absolutely depends on a copy of
itself to handle a particular event without, for example, leaving some
critical data file (etc.) in an inconsistent state, the best behavior is to
simply have the "fork()" fail -- *particularly* with copy-on-write
semantics. Even if the child process *never* calls malloc() or requires
additional stack space, it could die in mid-expression by simply assigning
to an *existing* variable and triggering a copy of one page too many. The
child process (which may be in a critical section) is now itself in an
inconsistent state, and it may be far too late or far too difficult for the
parent to fix the problem. For that matter, fixing the problem might
require more additional VM than available. If fork() fails in this case,
the chances of the server being able to maintain consistent state (at least
dying gracefully) are much improved. At least some (most?) commercial
UNIXen behave this way (by default, at least) for precisely this reason.
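
(An illustrative sketch, not part of the original mail: the whole point of
the strict-accounting behaviour is that the failure surfaces at the fork()
call itself, where the parent can still handle it cleanly, instead of as a
child dying mid-expression later on.)

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void)
    {
        pid_t pid = fork();

        if (pid < 0) {
            /* Strict accounting couldn't reserve swap for a full copy:
             * report it and back off while all state is still consistent. */
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* Child: with the reservation already made, assigning to an
             * existing variable can't fault for lack of swap. */
            printf("child %ld doing the critical work\n", (long) getpid());
            _exit(0);
        }
        waitpid(pid, NULL, 0);   /* Parent: reap the child and carry on. */
        return 0;
    }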

Digital Unix does this, by default, although you can turn on "swap
overcommit" mode as well. The problem with the default is at least in
theory, every single text page could get modified if a debugger attached
to the process and set breakpoints everywhere. Therefore, Digital Unix
doesn't allow a process to fork() unless it can commit swap space for
every single text and data page for the process. If you run an ftp
server, with hundreds of ftp processes, you need gobs and gobs of swap
space (most of which is never used) because even though the text pages
of the ftp daemon are shared in memory, each ftp daemon needs to have
enough swap space for all of its text pages. This results in an
*extremely* inefficient usage of swap space, and if you don't figure
this out, you'll be scratching your head wondering why fork() calls are
failing after the 20th or 30th ftp daemon, even though there's plenty of
swap space.....
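
(Since the subject line is about vfork(): for the common case where the
child does nothing but exec, the vfork() idiom avoids duplicating the
parent's address space, so there is no copy to reserve swap for in the
first place. A hedged sketch; after vfork() the child must do nothing
except exec or _exit.)

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    int main(void)
    {
        pid_t pid = vfork();

        if (pid < 0) {
            perror("vfork");
            return 1;
        }
        if (pid == 0) {
            /* Child: borrows the parent's address space; only exec
             * or _exit is safe here. */
            execlp("ls", "ls", "-l", (char *) NULL);
            _exit(127);          /* exec failed */
        }
        waitpid(pid, NULL, 0);   /* Parent resumes after the exec/_exit. */
        return 0;
    }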

The problem is that if you're going to *guarantee* that there's enough
swap space for a process, you end up needing to reserve huge amounts of
VM space that might never get used. On the other hand, if you don't
guarantee this, then it's possible for a process to run the system out
of memory just by dirtying a previously shared copy-on-write page, and
the system has to do something and most of the choices (kill the
process, lock up, etc.) aren't particularly desirable. There are no
real easy answers here....
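
(Again just an illustrative sketch: under overcommit the fork() below is
nearly free, and the trouble only shows up when the child starts dirtying
pages that used to be shared copy-on-write, long after the point where the
error could have been reported cleanly.)

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define BIG (64 * 1024 * 1024)   /* 64 MB, arbitrary for the example */

    int main(void)
    {
        char *buf;
        pid_t pid;

        buf = malloc(BIG);
        if (buf == NULL)
            return 1;
        memset(buf, 0, BIG);         /* parent now owns 64 MB of dirty pages */

        pid = fork();                /* with overcommit, this "costs" nothing */
        if (pid < 0) {
            perror("fork");
            return 1;
        }
        if (pid == 0) {
            /* Child: each page written here needs a private copy, and
             * this is where an out-of-memory condition can surface. */
            memset(buf, 1, BIG);
            _exit(0);
        }
        waitpid(pid, NULL, 0);
        free(buf);
        return 0;
    }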

What I normally tell people is that you should never let the system get
anywhere near this regime; if the system is swapping, then by definition
you don't have enough memory, and you should buy more. Memory is cheap,
and it will make your system go *much* faster. Swap space, then, is a
buffer between the ideal state and the point where the system has to take
drastic (unfriendly) measures. You should definitely configure a system
to have swap, but it should be set up so that it never needs to use it.
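
(A small Linux-specific sketch of the rule of thumb above, assuming
sysinfo(2) and its totalswap/freeswap fields: swap that is configured but
sitting idle is the healthy state; swap that is actually in use means it's
time to buy memory.)

    #include <stdio.h>
    #include <sys/sysinfo.h>

    int main(void)
    {
        struct sysinfo si;

        if (sysinfo(&si) != 0) {
            perror("sysinfo");
            return 1;
        }
        /* Values are in kernel memory units; exact scaling varies by kernel. */
        printf("swap total: %lu\n", si.totalswap);
        printf("swap free:  %lu\n", si.freeswap);
        if (si.freeswap < si.totalswap)
            printf("swap is in use -- probably time for more RAM\n");
        return 0;
    }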

- Ted
