Re: [PATCH 1/4] oom: Do not panic when OOM killer is sysrq triggered

From: Michal Hocko
Date: Thu Jul 09 2015 - 04:23:53 EST


On Wed 08-07-15 16:36:14, David Rientjes wrote:
> On Wed, 8 Jul 2015, Michal Hocko wrote:
>
> > From: Michal Hocko <mhocko@xxxxxxx>
> >
> > OOM killer might be triggered explicitly via sysrq+f. This is supposed
> > to kill a task no matter what e.g. a task is selected even though there
> > is an OOM victim on the way to exit. This is a big hammer for an admin
> > to help to resolve a memory short condition when the system is not able
> > to cope with it on its own in a reasonable time frame (e.g. when the
> > system is thrashing or the OOM killer cannot make sufficient progress).
> >
> > E.g. it doesn't make any sense to obey panic_on_oom setting because
> > a) administrator could have used other sysrqs to achieve the
> > panic/reboot and b) the policy would break an existing usecase to
> > kill a memory hog which would be recoverable unlike the panic which
> > might be configured for the real OOM condition.
> >
> > It also doesn't make much sense to panic the system when there is no
> > OOM killable task because administrator might choose to do additional
> > steps before rebooting/panicking the system.
> >
> > While we are there also add a comment explaining why
> > sysctl_oom_kill_allocating_task doesn't apply to sysrq triggered OOM
> > killer even though there is no explicit check and we subtly rely
> > on current->mm being NULL for the context from which it is triggered.
> >
> > Also be more explicit about sysrq+f behavior in the documentation.
> >
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
>
> Nack, this is already handled by patch 2 in my series. I understand that

I guess you mean patch#3

> the titles were wrong for patches 2 and 3, but it doesn't mean we need to
> add hacks around the code before organizing this into struct oom_control

It is much easier to backport _fixes_ into older kernels (and yes, I do
care about that) if they do not depend on other cleanups, so I do not
understand your point here. Besides that, the cleanup doesn't really
change the actual fix much, because one way or another you still have
to add a simple condition to rule out a heuristic/configuration which
doesn't apply to the sysrq+f path.
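
To illustrate the kind of condition meant here, below is a minimal
user-space sketch. It is a model of the policy described in the changelog,
not the actual mm/oom_kill.c code: the enum, the struct, and oom_decide()
are hypothetical names invented for this example, and the real kernel
checks are spread over several functions.

```c
#include <stdbool.h>

/* Hypothetical outcome of one OOM killer invocation in this model. */
enum oom_outcome {
	OOM_KILL_TASK,		/* select a victim and kill it */
	OOM_PANIC,		/* panic the system */
	OOM_REPORT_ONLY,	/* kill nothing, only inform the admin */
};

/* Hypothetical inputs; none of these names come from the kernel sources. */
struct oom_request {
	bool sysrq_triggered;		/* invoked explicitly via sysrq+f */
	bool panic_on_oom;		/* models the panic_on_oom setting */
	bool have_killable_task;	/* an eligible victim exists */
};

static enum oom_outcome oom_decide(const struct oom_request *req)
{
	/*
	 * sysrq+f is an explicit admin request: never panic, neither
	 * because of panic_on_oom nor because no task is killable.
	 */
	if (req->sysrq_triggered)
		return req->have_killable_task ? OOM_KILL_TASK
					       : OOM_REPORT_ONLY;

	/* A real OOM condition keeps the configured panic policy. */
	if (req->panic_on_oom || !req->have_killable_task)
		return OOM_PANIC;

	return OOM_KILL_TASK;
}
```

The point of the sketch is that the sysrq+f special case is a single early
branch, so it does not need any surrounding cleanup to be expressed.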

So I am really lost in your argument here.

> or completely pointless comments and printks that will fill the kernel
> log.

Could you explain what is so pointless about a comment which clarifies
a fact that is not obvious from the current function?

Also, could you explain why the admin shouldn't be informed when
sysrq+f didn't kill anything because no eligible task was found?
--
Michal Hocko
SUSE Labs