Re: Some questions about linux kernel.

From: James Sutherland (jas88@cam.ac.uk)
Date: Tue Mar 14 2000 - 04:54:54 EST


On Mon, 13 Mar 2000, Jason Gunthorpe wrote:

>
> On Mon, 13 Mar 2000, James Sutherland wrote:
>
> > I want to emphasise this point: the OOM handler should **NEVER** be
> > invoked. If it is ever needed, there is something BADLY wrong with the
> > system - either we do something drastic (process slaughtering) or the
> > machine goes down completely. Worrying about which user processes are
> > killed to avoid an uncontrolled failure is a case of rearranging the
> > deckchairs on the Titanic.
>
> Just as someone who has been bitten by this, I agree with James. The
> number one priority is keeping the machine up - I don't care what you
> kill, just keep it running and preferably allow remote logins still...
>
> I'd think there are only two ways you can accidentally OOM a box - a single
> process goes nutz or something like a fork bomb. I'd think the simplest
> solution is to look for the process using the most memory or the user with
> the most processes+memory (some weird weight factor here) and blow it
> away. If this means johnnie loses their shell, their emacs and whatever
> else because someone fork bombed a box then too bad. If this means apache
> gets blown away because a CGI went insane then too bad. When the
> alternative is having to drive to the site and hit the reset button (which
> had to be done, twice :P) it is a minor trouble.
>
> In my case it turned out a WML process got into an infinite RAM-using
> loop; on 2.2.6-ac1 the machine just crashed - no kernel log messages,
> nothing, just dead. Unacceptable :> In this particular case the wml
> process got up to some 400meg of ram usage before things got really out of
> hand. Lots of swap! These days there is a ulimit protecting these wml
> processes, but it was really hit and miss to find out that it was indeed
> wml that was causing this.. (Heck, it was a lucky guess it was even OOM)
>
> IMHO a completely separate topic is how to prevent a malicious user from
> abusing this to nuke a machine.. That sounds like a very difficult problem
> indeed.

In the simplest case (while(1){malloc(4096);}) the malicious code will
typically be the first one blown away, at least on most multi-user boxen
(where there are lots of users, each using a few MB, then some DOSer grabs
the last 50MB in one process).

On a big system with few users (running huge simulations, say) then the
one big process might get blown away first. On the other hand, the
malicious users shouldn't have logins on those boxen anyway :-)
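
To make the "blow away the biggest process" rule concrete, here is a
minimal sketch in plain C. It is not kernel code: struct proc_info and
pick_biggest() are hypothetical names made up for illustration, and the
memory figures are assumed to come from some snapshot of the process table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of one process; not a real kernel structure. */
struct proc_info {
    int pid;
    int uid;
    unsigned long mem_kb;   /* memory currently charged to the process */
};

/*
 * Return the index of the process using the most memory, or -1 if the
 * table is empty. Ties go to the earliest entry.
 */
int pick_biggest(const struct proc_info *procs, size_t n)
{
    assert(procs != NULL || n == 0);

    int victim = -1;
    unsigned long worst = 0;

    for (size_t i = 0; i < n; i++) {
        if (victim < 0 || procs[i].mem_kb > worst) {
            victim = (int)i;
            worst = procs[i].mem_kb;
        }
    }
    return victim;
}
```

On a multi-user box with many small processes, the one that grabbed the
last 50MB is the entry this picks. On a box running one huge simulation,
the same rule picks the simulation instead, which is the second case above.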

OK, a more intelligent malicious user could write code that would find the
biggest process currently running, then allocate slightly less memory than
it, then fork and repeat. Fixing this would need per-user resource
limits...
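
As a rough illustration of why per-user accounting helps here, the sketch
below sums memory by uid and picks the user with the largest total. This is
only the accounting half, not a resource limit as such, and not kernel
code: struct proc_info and pick_biggest_user() are hypothetical names, and
a plain sum stands in for whatever weighting would really be used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical snapshot of one process; not a real kernel structure. */
struct proc_info {
    int pid;
    int uid;                /* assumed non-negative in this sketch */
    unsigned long mem_kb;   /* memory currently charged to the process */
};

/*
 * Return the uid whose processes use the most memory in total, or -1 if
 * the table is empty. Quadratic in n, which is fine for a sketch.
 */
int pick_biggest_user(const struct proc_info *procs, size_t n)
{
    assert(procs != NULL || n == 0);

    int victim_uid = -1;
    unsigned long worst = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long total = 0;

        /* Add up everything owned by the same user as process i. */
        for (size_t j = 0; j < n; j++) {
            if (procs[j].uid == procs[i].uid)
                total += procs[j].mem_kb;
        }
        if (victim_uid < 0 || total > worst) {
            victim_uid = procs[i].uid;
            worst = total;
        }
    }
    return victim_uid;
}
```

With one legitimate 100000 KB process and an attacker running three
99000 KB processes, a per-process rule would hit the legitimate process,
while the per-user total singles out the attacker.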

James.
