Re: Some questions about linux kernel.

From: Horst von Brand (vonbrand@sleipnir.valparaiso.cl)
Date: Tue Mar 14 2000 - 22:02:09 EST


Michael Bacarella <mbac@nyct.net> said:

[...]

> > Is not vfork() solution to this particular problem?

Not at all. vfork was a solution for the fork()+exec() case befor COW came
around: The child used the parent's memory space until exec(2), to avoid
copying it. COW (as today in Linux) gives you the same benefits (if not
more), at less cost to the programmer.

[...]

> In the best case, it will be a change that is entirely transparent to
> userspace and works perfectly, or at least well enough that nobody minds
> ignoring the few people who scream in horror when it breaks something dear
> to them :) That's what people naturally want to aim for. It'll be
> interesting to see how this is resolved (if it's resolved).

OK, let's focus the discussion here:

The problem with the current scheme is that a process can request memory,
and get it granted; but that is no guarantee that it can _use_ the memory:
If the system runs out, you might get a SIGSEGV (or SIGKILL) just by
accessing a local variable that happens to be on a COW page which the OS
can't copy for lack of space. Very unpolite behaviour, against which no
program can sanely protect itself. The basic problem is overcommitment, the
OS is promising memory in the hope it doesn't have to keep the promise.

If the OS doesn't use overcommitment, the program will get NULL out of
malloc(3) or its ilk; it can protect against this by using normal paranoid
programming. But as the system must make sure it can fullfill all promises
in the worst case all the time, much memory will be tied up in "just in
case" reserves (many of them computed very pessimistically, i.e., your
ls(1) has requested a maximal stack space of a few Mb in case it has to
handle a huge directory listing, but you called it for a few files only),
and you'll go OOM earlier. Now, as very few programs are really programmed
in the paranoid mode, and even less can even hope to cope sanely with
required memory not being available, the result in practical terms is that
you need much more memory (or swap space) to get almost exactly what you
have in the former case: Processes run out of memory and crash.

Trouble is, that in the Unix model each time your 400Mb emacs process
fork(2)s it creates _2_ 400Mb processes (a responsible OS will have to make
sure it has the spare 400Mb, just in case the child doesn't exec(2)
anything smaller), the new process then exec(2)s ls(1), and most of the
400Mb reserve was for nothing. If you use vfork(2), you don't need 400Mb
for the child process, true. But not every use of fork(2) can be replaced
by vfork(2); in fact, most "interesting" uses can't.

And again: The OOM case is _extremely_ exceptional, it makes absolutely no
sense to "get the real culprit", "optimize the hog finding", "handle a OOM
process killing policy that is user tunable" or any such: If the system
goes OOM, it is dead. OOM process killing might help it limp along so it
can die with some dignity, and not just go down in flames. Expecting more
out of it is simple wishful thinking.

OTOH, if the OOM is caused by a malicious user, that is a very different
scenario, and no amount of clever victim selection schemes will help; a
determined attacker will find a way to circumvent the OOM killer and take
down the system anyway. Other mechanisms have to be put in place to handle
this (and they will probably be of some use in preventing the first case
too).

-- 
Horst von Brand                             vonbrand@sleipnir.valparaiso.cl
Casilla 9G, Viņa del Mar, Chile                               +56 32 672616

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Mar 15 2000 - 21:00:29 EST