Re: Some questions about linux kernel.

From: Jesse Pollard (pollard@tomcat.admin.navo.hpc.mil)
Date: Wed Mar 15 2000 - 17:48:11 EST


Horst von Brand <vonbrand@sleipnir.valparaiso.cl>:
>
> Michael Bacarella <mbac@nyct.net> said:
>
> [...]
>
> > > Isn't vfork() a solution to this particular problem?
>
> Not at all. vfork was a solution for the fork()+exec() case before COW came
> around: The child used the parent's memory space until exec(2), to avoid
> copying it. COW (as today in Linux) gives you the same benefits (if not
> more), at less cost to the programmer.

vfork was a solution in that the simple fork()+exec() case does not require
the parent's address space to be copied, or even reserved, for the child.
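
For reference, the idiom in question looks roughly like the sketch below
(error handling trimmed to a minimum; /bin/ls is just a stand-in command):

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = vfork();        /* parent is suspended until the
                                       child execs or _exits */
        if (pid == 0) {
            /* Child: borrows the parent's address space, so it may only
               exec or _exit; it must not return or modify parent data. */
            execl("/bin/ls", "ls", "-l", (char *)NULL);
            _exit(127);             /* only reached if the exec failed */
        } else if (pid > 0) {
            int status;
            waitpid(pid, &status, 0);   /* collect the child's exit status */
            return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
        }
        perror("vfork");            /* vfork itself failed */
        return 1;
    }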

>
> [...]
>
> > In the best case, it will be a change that is entirely transparent to
> > userspace and works perfectly, or at least well enough that nobody minds
> > ignoring the few people who scream in horror when it breaks something dear
> > to them :) That's what people naturally want to aim for. It'll be
> > interesting to see how this is resolved (if it's resolved).
>
> OK, let's focus the discussion here:
>
> The problem with the current scheme is that a process can request memory,
> and get it granted; but that is no guarantee that it can _use_ the memory:
> If the system runs out, you might get a SIGSEGV (or SIGKILL) just by
> accessing a local variable that happens to be on a COW page which the OS
> can't copy for lack of space. Very impolite behaviour, against which no
> program can sanely protect itself. The basic problem is overcommitment, the
> OS is promising memory in the hope it doesn't have to keep the promise.
>
> If the OS doesn't use overcommitment, the program will get NULL out of
> malloc(3) or its ilk; it can protect against this by using normal paranoid
> programming. But as the system must make sure it can fulfill all promises
> in the worst case all the time, much memory will be tied up in "just in
> case" reserves (many of them computed very pessimistically, i.e., your
> ls(1) has requested a maximal stack space of a few Mb in case it has to
> handle a huge directory listing, but you called it for a few files only),
> and you'll go OOM earlier. Now, as very few programs are really programmed
> in the paranoid mode, and even fewer can hope to cope sanely with
> required memory not being available, the result in practical terms is that
> you need much more memory (or swap space) to get almost exactly what you
> have in the former case: Processes run out of memory and crash.

That is less objectionable with user resource limits available.
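
To make the contrast concrete, here is a minimal sketch of the paranoid
check described above (the 64 MB figure is arbitrary); the point is that
the NULL test is the only failure a program can handle, and with
overcommit the real failure can arrive later, as a signal on first touch:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        size_t len = 64UL * 1024 * 1024;    /* arbitrary 64 MB request */
        char *buf = malloc(len);

        if (buf == NULL) {          /* the only failure we can handle sanely */
            fprintf(stderr, "out of memory\n");
            return 1;
        }

        /* With overcommit the request may be granted anyway, and the
           process can still be killed right here, on first touch, if the
           kernel cannot find pages to back the allocation. */
        memset(buf, 0, len);

        free(buf);
        return 0;
    }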

>
> Trouble is, that in the Unix model each time your 400Mb emacs process
> fork(2)s it creates _2_ 400Mb processes (a responsible OS will have to make
> sure it has the spare 400Mb, just in case the child doesn't exec(2)
> anything smaller), the new process then exec(2)s ls(1), and most of the
> 400Mb reserve was for nothing. If you use vfork(2), you don't need 400Mb
> for the child process, true. But not every use of fork(2) can be replaced
> by vfork(2); in fact, most "interesting" uses can't.

This was the only case vfork was meant for. The "interesting" cases are not
relevant - they don't use vfork. I think an equivalent to vfork should be
available for those cases where fork is followed by exec. It would make
computing maximum swap sizes much easier. The simple calculation
(# of concurrent users) * (# of processes per user) * (memory per process)
+ (amount guaranteed for the system) is one such rule. If resource limits
are available, then a user may run one very large process or a group of
smaller processes, and the system never runs out of memory. If the system
plus all concurrent users use up everything, then your system needs to be
shut down.
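
As a back-of-the-envelope illustration of that rule (all numbers made up
for the example):

    #include <stdio.h>

    int main(void)
    {
        unsigned long users     = 20;   /* concurrent users            */
        unsigned long procs     = 16;   /* processes allowed per user  */
        unsigned long mem_mb    = 8;    /* MB allowed per process      */
        unsigned long system_mb = 256;  /* guaranteed for the system   */

        unsigned long worst_case = users * procs * mem_mb + system_mb;

        printf("memory + swap needed: %lu MB\n", worst_case);  /* 2816 MB */
        return 0;
    }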

> And again: The OOM case is _extremely_ exceptional, it makes absolutely no
> sense to "get the real culprit", "optimize the hog finding", "handle an OOM
> process killing policy that is user tunable" or any such: If the system
> goes OOM, it is dead. OOM process killing might help it limp along so it
> can die with some dignity, and not just go down in flames. Expecting more
> out of it is simple wishful thinking.

It is exceptional on single-user systems, but not so exceptional on servers.
>
> OTOH, if the OOM is caused by a malicious user, that is a very different
> scenario, and no amount of clever victim selection schemes will help; a
> determined attacker will find a way to circumvent the OOM killer and take
> down the system anyway. Other mechanisms have to be put in place to handle
> this (and they will probably be of some use in preventing the first case
> too).

This is the case where user limits prevent the user from killing the system.
A malicious user will only hurt himself. The other users will be unaffected,
other than the normal reduction of CPU time and memory resources allotted
to the user.

It IS still possible for a single user to use up all of the user resources,
but only by logging in as many times as the concurrent-user limit permits.
Even that won't cause an OOM, since the amount guaranteed for the system
still allows the system to identify the offending user, and administration
can direct that the user's processes be terminated.
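
A minimal sketch of how such a memory cap can be imposed in practice,
assuming the login mechanism sets the limit before handing the user a
shell; the 32 MB figure and the choice of RLIMIT_AS are only examples
(RLIMIT_DATA is the traditional alternative), and rlimits are per-process
rather than per-user, so this only shows the mechanism, not the full
accounting:

    #include <stdio.h>
    #include <sys/resource.h>
    #include <unistd.h>

    int main(void)
    {
        struct rlimit rl;

        /* Cap the address space of this process and its children at 32 MB. */
        rl.rlim_cur = 32UL * 1024 * 1024;
        rl.rlim_max = 32UL * 1024 * 1024;
        if (setrlimit(RLIMIT_AS, &rl) != 0) {
            perror("setrlimit");
            return 1;
        }

        /* From here on, allocations that would push a process past the cap
           fail with ENOMEM (malloc returns NULL) instead of dragging the
           whole machine toward an OOM state. */
        execl("/bin/sh", "sh", (char *)NULL);
        perror("execl");
        return 1;
    }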

I have run systems that allow overcommit and systems that do not. On those
that allow it, management has to be told that the machine rebooted because
someone used too much memory.

On single-user workstations it generally isn't a problem. On large multi-user
systems, management tends to get upset over unexplained crashes and random
program aborts.
-------------------------------------------------------------------------
Jesse I Pollard, II
Email: pollard@navo.hpc.mil

Any opinions expressed are solely my own.
