Re: oom killer gone nuts

From: Andries Brouwer
Date: Thu Jan 20 2005 - 15:53:46 EST


On Thu, Jan 20, 2005 at 06:15:44PM +0100, Andrea Arcangeli wrote:

> > Yes, the fact that the oom-killer exists is a serious problem.
> > People work on trying to tune it, instead of just removing it.
>
> I'm working on fixing it, not just tuning it. The bugs in mainline
> aren't about the selection algorithm (which is normally what people
> calls oom killer). The bugs in mainline are about being able to kill a
> task reliably, regardless of which task we pick, and every linux kernel
> out there has always killed some task when it was oom. So the bugs are
> just obvious regressions of 2.6 if compared to 2.4.

Yes, earlier one lost a job once in a great while, these days it is
once in a while - the frequency has gone up.

But let me stress that I also consider the earlier situation
unacceptable. It is really bad to lose a few weeks of computation.

You talk about "when it is oom", as if it would be unavoidable,
an act of nature. But it can be avoided, and should be avoided,
unless the sysadmin explicitly says that oom is OK for him.

(Compare allowing oom with overclocking - there is a trade-off
between speed and reliability. It must be possible to choose
for reliability. Indeed, reliability must be the default.)

Andries

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/