Re: [PATCH] Remove OOM killer from try_to_free_pages / all_unreclaimable braindamage

From: Marcelo Tosatti
Date: Sat Nov 06 2004 - 08:19:27 EST


On Sat, Nov 06, 2004 at 02:20:18AM +0100, Andrea Arcangeli wrote:
> On Fri, Nov 05, 2004 at 03:32:50PM -0800, Jesse Barnes wrote:
> > On Friday, November 05, 2004 12:01 pm, Marcelo Tosatti wrote:
> > > In my opinion the correct approach is to trigger the OOM killer
> > > when kswapd is unable to free pages. Once that is done, the number
> > > of tasks inside page reclaim is irrelevant.
> >
> > That makes sense.

Hi Andrea,

> I don't like it, kswapd may fail balancing because there's a GFP_DMA
> allocation that eat the last dma page, but we should not kill tasks if
> we fail to balance in kswapd, we should kill tasks only when no fail
> path exists (i.e. only during page faults, everything else in the kernel
> has a fail path and it should never trigger oom).

The OOM killer is only going to get triggered if kswapd is not able
to make _any_ progress in all zones. So it won't "fail balancing because there's
a GFP_DMA allocation that eat the last dma page".

As long as it frees _one_ page during all passes from DEF_PRIORITY down to
priority=0, it won't kill any task. See?

I don't get your point.
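
Roughly, the rule described above can be modeled like this. It is a
userspace sketch only, not the actual patch: scan_fn, kswapd_should_oom
and the two sample scanners are made-up names for illustration, and the
callback stands in for "scan all zones at this priority".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEF_PRIORITY 12	/* starting scan priority, as in the 2.6 kernel */

/* Hypothetical callback: pages freed across all zones at one priority. */
typedef unsigned long (*scan_fn)(int priority, void *ctx);

/*
 * Walk every priority from DEF_PRIORITY down to 0 and add up what was
 * freed. Only a total of zero (no progress in any zone on any pass)
 * counts as a reason to invoke the OOM killer.
 */
static bool kswapd_should_oom(scan_fn scan_all_zones, void *ctx)
{
	unsigned long total_freed = 0;
	int priority;

	assert(scan_all_zones != NULL);

	for (priority = DEF_PRIORITY; priority >= 0; priority--)
		total_freed += scan_all_zones(priority, ctx);

	return total_freed == 0;
}

/* Sample scanner: never frees anything. */
static unsigned long scan_nothing(int priority, void *ctx)
{
	(void)priority;
	(void)ctx;
	return 0;
}

/* Sample scanner: frees a single page, and only on the last pass. */
static unsigned long scan_one_at_zero(int priority, void *ctx)
{
	(void)ctx;
	return priority == 0 ? 1 : 0;
}
```

With scan_nothing the function reports that a kill is warranted; with
scan_one_at_zero a single freed page on the final pass is enough to
avoid it.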

> If you move it in kswapd there's no way to prevent oom-killing from a
> syscall allocation (I guess even right now it would go wrong in this
> sense, but at least right now it's more fixable).

I don't understand what you mean. "prevent oom-killing from a syscall allocation"?

> I want to move the oom
> kill outside the alloc_page paths. The oom killing is all about the page
> faults not having a fail path, and in turn the oom killing should be
> moved in the page fault code, not in the allocator. Everything else
> should keep returning -ENOMEM to the caller.

Isn't OOM killing all about the reclaiming efforts not being able to make progress?

> So to me moving the oom killer into kswapd looks a regression.

To me having tasks trigger the OOM kill is fundamentally broken
because it doesn't take into account kswapd's page freeing
efforts which are in progress at that very moment.

That makes a lot of sense to me - I would love to be proved
wrong.

See, it's completely screwed right now. The code inside out_of_memory()
which only triggers OOM if it has happened several times during the
past few seconds is horrible and shows how bad it is.
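
For readers who have not looked at it, the kind of "several times in
the past few seconds" throttle being criticized can be modeled roughly
as below. This is a simplified userspace sketch, not the kernel's
out_of_memory(): the struct, the function name and both thresholds are
assumptions picked for illustration.

```c
#include <stdbool.h>

/* Hypothetical thresholds, chosen for illustration only. */
#define OOM_WINDOW_SECS	5	/* a quiet gap longer than this resets the count */
#define OOM_MIN_FAILS	10	/* failures needed before a kill is allowed */

struct oom_throttle {
	long last_fail;		/* time of the previous failure, in seconds */
	int count;		/* failures seen since the last reset */
};

/*
 * Record one allocation failure at time `now` (seconds).
 * Returns true only when enough failures have piled up without a long
 * quiet gap in between, i.e. "it happened several times recently".
 */
static bool oom_throttle_hit(struct oom_throttle *t, long now)
{
	/* A long gap since the previous failure means we start over. */
	if (t->count == 0 || now - t->last_fail > OOM_WINDOW_SECS)
		t->count = 0;

	t->last_fail = now;

	if (++t->count < OOM_MIN_FAILS)
		return false;

	/* Threshold reached: allow the kill and reset for the next round. */
	t->count = 0;
	return true;
}
```

The decision here depends only on how often callers fail, and says
nothing about whether reclaim is still making progress, which is the
objection raised above.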