Re: Overcommitable memory??

From: James Sutherland (jas88@cam.ac.uk)
Date: Mon Mar 13 2000 - 16:26:50 EST


On Mon, 13 Mar 2000, Michael Bacarella wrote:

>
> > > To make a functioning machine with absolutely NO overcommit would probably
> > > require gigabytes of swap which would never be used, just to back the
> > > stack pages and COW pages of all the processes. Plus, there's no real
> > > gain to be had.
> >
> > Sure there is. Apps would be told that the system is out of memory
> > instead of just getting a SIGKILL'ed out of the blue sky. Apps getting NULL
> > from malloc() can react appropriately, such as saving your files to disk,
> > trying again a little later or just exiting if that is acceptable for what
> > the app was doing. Apps getting SIGKILL will take your unsaved work with
> > them in the fall.
>
> The way I see it, apps that have successfully allocated memory in the past
> wouldn't start dying since there's no malloc() to fail, wheras new apps
> that want to bring down the system will start getting failed
> malloc()/mmap()'s

Except many apps DO allocate memory during run time for buffers, workspace
etc. That's why, at present, lots of processes start dropping dead once
malloc() starts failing.

> There's no reason to tell an application that it has X megs of memory all
> to itself to play with, and then KILL one of it's brothers if the kernel
> finds itself short.

We don't "tell" any application we have X Mb of RAM. It takes the RAM it
needs, if available - if not, it fails, and (typically) dies.

> Although I do admit to the problems: It'd be impossible to log into a
> system being "attacked" like that since any new login attempts would fail.

In fact, under these circumstances, EVERYTHING would fail.

> But on the other hand, malloc() DOES return EAGAIN. Some applications
> would think to retry malloc() in a few seconds, which may have hopes of
> succeeding.

Perhaps some will do this; it appears, though, that most just drop dead.

The best approach, IMO, is just to kill the process which appears to be
hogging all the memory. Now, at a later date, we'll be able to use a more
refined algorithm which will use the RATE of memory allocation, rather
than the absolute level, and also move this into a userspace daemon.

Then, we'll be able to define priorities: You may want to try killing all
Netscape browser processes first, then any batch jobs. OR you could make
it checkpoint certain processes to disk, as someone suggested. Or you
could stick with the current behaviour, and watch crond and friends
expire...

James.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Mar 15 2000 - 21:00:26 EST