Re: [PATCH] mm,oom: Re-enable OOM killer using timers.

From: David Rientjes
Date: Wed Jan 20 2016 - 18:49:32 EST


On Wed, 20 Jan 2016, Tetsuo Handa wrote:

> > > My goal is to ask the OOM killer not to toss the OOM killer's duty away.
> > > What is important for me is that the OOM killer takes next action when
> > > current action did not solve the OOM situation.
> > >
> >
> > What is the "next action" when there are no more processes on your system,
>
> Just call panic(), as with select_bad_process() from out_of_memory() returned
> NULL.
>

That is in no way a possible solution for a system-wide oom condition. We
could have megabytes of memory available in memory reserves and a simple
allocation succeeding could fix the livelock quite easily (and can be
demonstrated with my testcase). A panic is never better than allowing an
allocation to succeed through the use of available memory reserves.

For the memcg case, we wouldn't panic() when there are no more killable
processes, and this livelock problem can easily be exhibited in memcg
hierarchy oom conditions as well (and much more easily, since it's in
isolation and doesn't get interfered with by external processes freeing
memory elsewhere on the system). So, again, your approach offers no solution to
this case and you presumably suggest that we should leave the hierarchy
livelocked forever. Again, not a possible solution.

> If we can agree on combining both approaches, I'm OK with it. That will keep
> the OOM reaper simple, for the OOM reaper will not need to clear TIF_MEMDIE
> flag which is unfriendly for wait_event() in oom_killer_disable(), and the
> OOM reaper will not need to care about situations where TIF_MEMDIE flag is
> set when it is not safe to reap.
>

Please, allow us to review and get the oom reaper merged first and then
evaluate the problem afterwards.