Re: Some questions about linux kernel.

From: James Sutherland (jas88@cam.ac.uk)
Date: Wed Mar 15 2000 - 03:55:36 EST


On Tue, 14 Mar 2000, Alex Belits wrote:

> On Mon, 13 Mar 2000, Jason Gunthorpe wrote:
>
> > I'd think there are only two ways you can accidentally OOM a box - a
> > single process goes nuts or something like a fork bomb. I'd think the
> > simplest solution is to look for the process using the most memory or
> > the user with the most processes+memory (some weird weight factor here)
> > and blow it away. If this means johnnie loses their shell, their emacs
> > and whatever else because someone fork bombed a box then too bad.
>
> On all "workstation" boxes the largest process is most likely the X server.
> On all workstations used by graphic artists, the two largest processes are
> always the X server and the graphics editor. Since killing the X server
> means that all its clients will lose their display and be killed in a random
> state (very likely without any reasonable means to preserve the data in
> them), this kind of policy makes absolutely no sense.

If the X server is the largest process, then yes, it would be the first to
be blown away. OTOH, the X server and graphics package should largely be in
RAM, rather than paged out - provided you have enough swap, the malloc()
bomb will just fill up swap space (thereby becoming larger than X and the
gfx package) and then be the first up against the wall come the OOM.

Also, both X and the gfx package will have used much more CPU time than
the malloc()-bomb, so it will usually be killed in preference to them.
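
To make the size-versus-CPU-time trade-off concrete, here is a rough sketch
in Python. The Proc record, the badness() weighting (size divided by the
square root of CPU seconds) and the sample numbers are all my own
illustrative assumptions, not the actual kernel code:

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Proc:
    pid: int
    name: str
    vm_kb: int          # total size (RAM + swap) in KiB -- made-up figures
    cpu_seconds: float  # CPU time consumed so far


def badness(p: Proc) -> float:
    # Hypothetical weighting: a large footprint raises the score,
    # accumulated CPU time lowers it. The +1 avoids dividing by zero.
    return p.vm_kb / sqrt(p.cpu_seconds + 1.0)


def pick_victim(procs):
    # The process with the highest score is killed first.
    return max(procs, key=badness)


procs = [
    Proc(101, "X", 80_000, 3600.0),
    Proc(102, "gfx-editor", 120_000, 1800.0),
    Proc(103, "malloc-bomb", 400_000, 2.0),
]
print(pick_victim(procs).name)
```

With these numbers the bomb scores far above X and the editor, both because
it is bigger and because it has barely used any CPU.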

> > If this means apache gets blown away because a CGI went insane then too
> > bad.
>
> Killing Oracle, or any other server that depends on some process staying
> alive and keeping valuable, complex, hard-to-recover data on disk and
> in memory, is in some cases not any better than just blowing up the box.

It may not be any better, but it is certainly no worse - particularly if
we can SIGTERM the process first. In the case of the Oracle server, it
should be able to use this to sync the database, close connections and
exit gracefully.
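
As a userspace illustration of "SIGTERM first", something like the
following would do. It is only a sketch: terminate_gracefully(), the grace
period and the polling interval are hypothetical names and values of mine,
and a real OOM handler inside the kernel would look quite different.

```python
import os
import signal
import time


def terminate_gracefully(pid, grace_seconds=5.0, poll=0.1,
                         kill=os.kill, sleep=time.sleep):
    """Send SIGTERM, wait up to grace_seconds, then fall back to SIGKILL.

    Returns "gone", "SIGTERM" or "SIGKILL" depending on what was needed.
    kill and sleep are injectable so the logic can be exercised safely.
    """
    def alive():
        try:
            kill(pid, 0)  # signal 0 only checks that the pid exists
            return True
        except ProcessLookupError:
            return False

    try:
        kill(pid, signal.SIGTERM)  # give the process a chance to clean up
    except ProcessLookupError:
        return "gone"

    waited = 0.0
    while waited < grace_seconds:
        if not alive():
            return "SIGTERM"
        sleep(poll)
        waited += poll

    if not alive():
        return "SIGTERM"
    try:
        kill(pid, signal.SIGKILL)  # grace period over: no more Mr Nice Guy
    except ProcessLookupError:
        return "SIGTERM"
    return "SIGKILL"
```

Note that the existence check is crude: an unreaped zombie child still
counts as "alive" here, and a process owned by someone else raises
PermissionError rather than being handled.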

Later, I would like to add priorities (e.g. try to kill x, y and z first,
then kill all the non-root processes (or the non-dbuser processes, or
whatever), then kill the rest.)
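
The tiers could be ordered roughly as below. Again this is only a sketch
under my own assumptions: kill_order(), the kill_first set, the
protected_uids set and the sample process table are hypothetical, and
nothing like this interface exists yet.

```python
from dataclasses import dataclass


@dataclass
class Proc:
    pid: int
    name: str
    uid: int
    vm_kb: int  # size in KiB -- made-up figures


def kill_order(procs, kill_first=(), protected_uids=(0,)):
    """Return procs in the order they would be killed.

    Tier 0: names the admin listed to go first.
    Tier 1: everything not owned by a protected uid (root, dbuser, ...).
    Tier 2: the rest.
    Within a tier, the largest process goes first.
    """
    def tier(p):
        if p.name in kill_first:
            return 0
        if p.uid not in protected_uids:
            return 1
        return 2

    return sorted(procs, key=lambda p: (tier(p), -p.vm_kb))


procs = [
    Proc(1, "oracle", 501, 900_000),
    Proc(2, "netscape", 1000, 60_000),
    Proc(3, "sshd", 0, 4_000),
    Proc(4, "cgi-runaway", 33, 300_000),
]
order = kill_order(procs, kill_first={"netscape"}, protected_uids={0, 501})
print([p.name for p in order])
```

Here uid 501 stands in for the database user, so oracle and sshd are only
touched once everything else has gone.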

No strategy will ever be perfect, but this system is pretty good, IMO.

James.
