Re: [patch -mm 4/9 v2] oom: remove compulsory panic_on_oom mode

From: David Rientjes
Date: Tue Feb 16 2010 - 02:53:49 EST


On Tue, 16 Feb 2010, Nick Piggin wrote:

> > Because it is inconsistent at the user's expense: it has never panicked
> > the machine for memory controller ooms, so why are cpuset or mempolicy
> > constrained oom conditions any different?
>
> Well memory controller was added later, wasn't it? So if you think
> that's a bug then a fix to panic on memory controller ooms might
> be in order.
>

But what about the existing memcg users who set panic_on_oom == 2 and
don't expect the memory controller to be influenced by that?
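
To make the inconsistency concrete, here is a rough model of when the
machine panics, as I understand the behaviour discussed in this thread
together with the documented sysctl modes (0 = always kill, 1 = panic
unless the oom is cpuset or mempolicy constrained, 2 = panic even then).
The Constraint enum and the panics() helper are hypothetical names for
illustration only; this is not the kernel's code.

```python
from enum import Enum


class Constraint(Enum):
    """Hypothetical labels for what constrained the failing allocation."""
    NONE = "none"                # system-wide oom, no constraint
    CPUSET = "cpuset"            # oom within a cpuset's nodes
    MEMORY_POLICY = "mempolicy"  # oom within a mempolicy's nodes
    MEMCG = "memcg"              # oom inside a memory controller group


def panics(panic_on_oom: int, constraint: Constraint) -> bool:
    """Return True if this model says the machine panics instead of killing."""
    # Per the thread: memory controller ooms have never panicked the
    # machine, whatever the sysctl is set to.
    if constraint is Constraint.MEMCG:
        return False
    # Mode 2: panic even for cpuset and mempolicy constrained ooms.
    if panic_on_oom == 2:
        return True
    # Mode 1: panic only when the oom is not constrained.
    if panic_on_oom == 1:
        return constraint is Constraint.NONE
    # Mode 0: always invoke the oom killer.
    return False


if __name__ == "__main__":
    for mode in (0, 1, 2):
        for constraint in Constraint:
            print(mode, constraint.value, panics(mode, constraint))
```

Under this model, panic_on_oom == 2 panics for a mempolicy that any
unprivileged user can create, but not for a memcg that an admin set up.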

> > It also panics the machine even
> > on VM_FAULT_OOM which is ridiculous,
>
> Why?
>

Because the oom killer was never called for VM_FAULT_OOM before; we simply
sent a SIGKILL to current, i.e. the original panic_on_oom semantics were
not even enforced.

> > the tunable is certainly not being
> > used how it was documented
>
> Why not? The documentation seems to match the implementation.
>

It was meant to panic the machine anytime it was out of memory, regardless
of the constraint, but that obviously doesn't match the memory controller
case. Just because cpusets and mempolicies decide to use the oom killer
as a mechanism for enforcing a user-defined policy does not mean that we
want to panic for them: mempolicies, for example, are user created and do
not require any special capability. Does it seem reasonable that an oom
condition on those mempolicy nodes should panic the machine when killing
the offender is possible (and perhaps even encouraged, if the user sets a
high /proc/pid/oom_score_adj)? In other words, is an admin setting
panic_on_oom == 2 really expecting that no application will use
set_mempolicy() or do an mbind()? This is a very error-prone interface
that needs to be dealt with on a case-by-case basis, and the perfect way to
do that is by setting the affected tasks to OOM_DISABLE; that
interface, unlike panic_on_oom == 2, is very well understood by those with
CAP_SYS_RESOURCE.
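
For illustration, a minimal sketch of exempting one task from the oom
killer from userspace. It assumes the /proc/<pid>/oom_score_adj file
mentioned above, where -1000 is the minimum value and disables oom killing
for the task; kernels without that file use /proc/<pid>/oom_adj with
OOM_DISABLE (-17) instead. The helper names oom_score_adj_path() and
write_adj() are made up for this example, and lowering the value is
expected to fail without CAP_SYS_RESOURCE.

```python
import os

# Assumed minimum for oom_score_adj; a task at this value is never selected.
OOM_SCORE_ADJ_MIN = -1000


def oom_score_adj_path(pid: int) -> str:
    """Build the procfs path for a task's oom_score_adj file."""
    return f"/proc/{pid}/oom_score_adj"


def write_adj(path: str, value: int) -> bool:
    """Write the adjustment to path; return False if the write is refused."""
    try:
        with open(path, "w") as f:
            f.write(f"{value}\n")
        return True
    except OSError:
        # Covers a missing file (older kernel, no such pid) and
        # permission errors (no CAP_SYS_RESOURCE).
        return False


if __name__ == "__main__":
    pid = os.getpid()
    if write_adj(oom_score_adj_path(pid), OOM_SCORE_ADJ_MIN):
        print(f"pid {pid} is now exempt from the oom killer")
    else:
        print(f"could not change the oom adjustment for pid {pid}")
```

The point is that this is a per-task decision made by someone who already
holds the capability, rather than a global switch.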