Re: OOM killer firing on 2.6.18 and later during LTP runs

From: Andrew Morton
Date: Sun Nov 26 2006 - 02:27:57 EST


On Sat, 25 Nov 2006 22:00:45 -0500
Dave Jones <davej@xxxxxxxxxx> wrote:

> On Sat, Nov 25, 2006 at 01:28:28PM -0800, Andrew Morton wrote:
> > On Sat, 25 Nov 2006 13:03:45 -0800
> > "Martin J. Bligh" <mbligh@xxxxxxxxxx> wrote:
> >
> > > On 2.6.18-rc7 and later during LTP:
> > > http://test.kernel.org/abat/48393/debug/console.log
> >
> > The traces are a bit confusing, but I don't actually see anything wrong
> > there. The machine has used up all swap, has used up all memory and has
> > correctly gone and killed things. After that, there's free memory again.
>
> We covered this a month or two back. For RHEL5, we've ended up
> reintroducing the oom killer prevention logic that we had up until
> circa 2.6.10. It seemed that there exist circumstances where
> given a little more time, some memory hogging apps will run to completion
> allowing other allocators to succeed instead of being killed.

I _think_ what you're describing here is a false-positive oom-killing? But
Martin appears to be hitting a genuine oom.

But it does appear that some changes are needed, because lots of things got
oom-killed.

I think. Maybe not - there's no timestamping in those logs and it is of
course possible that we're seeing unrelated ooms which happened a long time
apart.

> For reference, here's the patch that Larry Woodman came up with
> for RHEL5.

gulp.