[RFC] panic_on_oom_timeout

From: Michal Hocko
Date: Tue Jun 09 2015 - 13:03:26 EST


Hi,
during the last iteration of the timeout based oom killer discussion
(http://marc.info/?l=linux-mm&m=143351457601723) I proposed
introducing panic_on_oom_timeout as an extension to panic_on_oom,
rather than an oom timeout which would allow the OOM killer to select
another oom victim and keep doing that until the OOM is resolved or the
system panics because there are no potential oom victims left.

My main rationale for going the panic_on_oom_timeout way is that this
approach will lead to much more predictable behavior because the system
will get back to a usable state after a given amount of time + reboot
time.
On the other hand, if the other approach was chosen then there is no
guarantee that another victim would be in any better situation than the
original one. In fact there might be many tasks blocked on a single lock
(e.g. i_mutex) and the oom killer doesn't have any way to find out which
task to kill in order to make progress. The result would be a period
of N*timeout during which the system is basically unusable, and N is
unknown to the admin.
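
To put rough numbers on the comparison, here is a minimal sketch. The
helper names and the figures used with them are made up purely for
illustration; they are not part of the patch or of any kernel interface.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helpers, for illustration only.
 *
 * Timeout-per-victim approach: in the worst case each of the
 * n_victims selected in turn burns a full timeout before the OOM
 * killer moves on, so the system is unusable for n_victims * timeout.
 */
static unsigned long multi_victim_worst_case(unsigned long timeout_secs,
					     unsigned long n_victims)
{
	/* Guard the multiplication against overflow. */
	assert(n_victims == 0 || timeout_secs <= ULONG_MAX / n_victims);
	return n_victims * timeout_secs;
}

/*
 * panic_on_oom_timeout approach: a single timeout, then panic and
 * reboot, so the bound is timeout + reboot time regardless of how
 * many tasks are stuck.
 */
static unsigned long panic_timeout_worst_case(unsigned long timeout_secs,
					      unsigned long reboot_secs)
{
	assert(timeout_secs <= ULONG_MAX - reboot_secs);
	return timeout_secs + reboot_secs;
}
```

With an assumed 30s timeout, four stuck victims give 4*30 = 120s of
unusable system, and the admin cannot know up front that N is four. With
the same timeout and an assumed 60s reboot, the panic approach is bounded
by 30+60 = 90s no matter how many tasks are blocked.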

I think that it is more appropriate to shut such a system down when such
a corner case is hit rather than struggle for a basically unbounded
amount of time.

Thoughts? An RFC implementing this is below. It is quite trivial and
I've tried to test it a bit. I will add the missing pieces if this looks
like a way to go.

There are obviously places in the oom killer and the page allocator path
which could be improved and this patch doesn't try to put them aside. It
is just providing a reasonable last resort for when things go really
wrong.
---