Re: Out Of Memory in v. 2.1

Rik van Riel (H.H.vanRiel@phys.uu.nl)
Tue, 6 Oct 1998 09:12:31 +0200 (CEST)


On Tue, 6 Oct 1998, Andrea Arcangeli wrote:
> On Mon, 5 Oct 1998, Rik van Riel wrote:
> >to wait for the 2nd patch to get something actually working :)
>
> You should try it before you speak. There's no second patch because the
> first one has worked fine so far (I have also had some nice reports from
> people). And if I know there's something wrong in one of my works, _I_
> declare it to the public _myself_ ASAP, BTW.

It worked for you, yes. That doesn't mean that it'll work fine
in all situations. Since the original poster claimed to be
somewhat of a newbie, I didn't think it was a Good Idea(tm) to
let him patch his kernel immediately with a patch that's just
made its first appearance on linux-kernel...

> And my patch fixes _bugs_ in the current MM design of Linux 2.1 and
> has nothing to do with your OOM killer.

It also introduces a new one: having kswapd stop when
try_to_free_page() fails. When try_to_free_page() fails,
that could also mean that all pages have been touched
since we ran last -- that just means we'll have to run
again; stopping will make things worse.
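
To illustrate the point, here is a minimal user-space sketch, not kernel
code. The names toy_page, toy_try_to_free_page() and toy_free_with_retry()
are hypothetical, and the scan is a simple second-chance pass that I am
assuming for illustration; it is not the actual 2.1 implementation. It
shows how a scan can fail only because every page was touched since the
last run, and how a second run then succeeds:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy page, NOT the kernel's struct page. */
struct toy_page {
    bool referenced; /* touched since the last scan */
    bool freed;      /* already reclaimed */
};

/*
 * One scan over the pages. A referenced page gets its bit cleared
 * (a second chance) and is skipped; the first unreferenced page is
 * freed. Returns true if a page was freed during this scan.
 */
static bool toy_try_to_free_page(struct toy_page *pages, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pages[i].freed)
            continue;
        if (pages[i].referenced) {
            pages[i].referenced = false;
            continue;
        }
        pages[i].freed = true;
        return true;
    }
    return false;
}

/*
 * Retry the scan instead of giving up on the first failure.
 * Returns the number of passes needed, or 0 if every pass failed.
 */
static int toy_free_with_retry(struct toy_page *pages, size_t n,
                               int max_passes)
{
    for (int pass = 1; pass <= max_passes; pass++) {
        if (toy_try_to_free_page(pages, n))
            return pass;
    }
    return 0;
}
```

With every page marked referenced, the first scan frees nothing and the
second one does, so treating the first failure as "out of memory" would
be premature in this toy model.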

> If you want to use an OOM killer to choose which process to kill
> when the system is OOM you first need to know _when_ the system is
> OOM. Right now the kernel is not able to detect when the system is
> OOM after a page fault because __get_free_pages() never returns NULL
> _and_ because kswapd runs forever. My patch fixes all such bugs. If
> you want to use your OOM killer you simply have to replace the
> force_sig() in the swapin and anonymous page fault with something
> like your kill_the_best_process() (aka OOM_killer()).

Your patch doesn't really fix the bug. It makes sure that
kswapd has complete failures more often, and this can be
considered a bad thing instead of something nice.

> (Maybe you are running your OOM killer in an ugly RT (yes, it must be RT
> to work around the kswapd starvation _without_ my patch applied)
> kernel daemon just to waste many good CPU cycles? Or you are killing
> processes too early when really there's still some memory (but yes,
> who cares, because you will never run your OOM killer on an 8 MB
> 386, right?)?)

I agree that my OOM patch isn't very good yet. Tests at many
different people's places have confirmed that it picks the
'right' process almost every time, but it's not very smart
about detecting OOM yet. (The one problem is the legitimate
case where RAM and swap are full but we have a lot of swap cache.)

> And btw I don't like an OOM killer that chooses which process to kill
> when the system is OOM. The only justice I can think of is to kill
> _the_ current process: the process that has requested memory via a
> page fault if __get_free_pages() returned NULL (and if you are using
> a kernel daemon you can't know which process that is, btw).

It is not always right to kill the current process. It could
be X or your two-week-old simulation that just needs a few
extra kilobytes to format its output -- bang, there go two
weeks of calculations...
And all just because a newly started Netscape allocated all
but the last few kilobytes of memory. In that case, it might
be _far_ better to kill Netscape or something else that's both
new and large.
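
A rough user-space sketch of that idea follows. The struct toy_task, the
helper names and the scoring formula (size divided by age in minutes) are
assumptions made up for this example; this is not the heuristic from my
patch, only one way to prefer a victim that is both new and large:

```c
#include <stddef.h>

/* Hypothetical process record, NOT the kernel's task_struct. */
struct toy_task {
    const char *name;
    unsigned long size_kb;     /* memory in use */
    unsigned long age_seconds; /* time since the process started */
};

/*
 * Illustrative score: large processes score high, and the score
 * shrinks as the process gets older. The "+ 1" avoids dividing by
 * zero for a process younger than one minute.
 */
static unsigned long toy_badness(const struct toy_task *t)
{
    return t->size_kb / (t->age_seconds / 60 + 1);
}

/* Returns the index of the highest-scoring task, or -1 if n == 0. */
static int toy_pick_victim(const struct toy_task *tasks, size_t n)
{
    int victim = -1;
    unsigned long worst = 0;

    for (size_t i = 0; i < n; i++) {
        unsigned long score = toy_badness(&tasks[i]);

        if (victim < 0 || score > worst) {
            worst = score;
            victim = (int)i;
        }
    }
    return victim;
}
```

Under this toy scoring, a 30 MB Netscape started a minute ago scores far
higher than a 50 MB simulation that has been running for two weeks, so
the simulation survives.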

> So my thought is that we/_I_ don't need an OOM killer.
> And instead _you_ need my patch to have your OOM killer working.

Killing the current process equals killing a random process;
most people don't like this, especially not when it is the X
server. This means that quite a lot of people do need an OOM
killer.

With your patch the system will run out of memory more often,
which is a very bad thing.

> Please Rik, apply my patch to 2.1.124 _now_ and find a way to kill the
> machine or find a way where Linux is not able to handle a heavy swapout
> and segfaults even if there's some swap available. Really really please
> (you are also allowed to use swapoff -a to do that ;-). I can't.

Killing the machine is easy with your patch: if it happens
to be X that needs to swap something in when the memory is
gone, X will be killed and the machine will be dead (at least
the console will).

My patch never kills the X server or any other IOPL process.

Rik.
+-------------------------------------------------------------------+
| Linux memory management tour guide. H.H.vanRiel@phys.uu.nl |
| Scouting Vries cubscout leader. http://www.phys.uu.nl/~riel/ |
+-------------------------------------------------------------------+
