Re: [PATCH 2/4] oom: Do not invoke oom notifiers on sysrq+f

From: Michal Hocko
Date: Thu Jul 09 2015 - 04:55:20 EST


On Wed 08-07-15 16:37:49, David Rientjes wrote:
> On Wed, 8 Jul 2015, Michal Hocko wrote:
>
> > From: Michal Hocko <mhocko@xxxxxxx>
> >
> > A github user rfjakob has reported the following issue via IRC.
> > <rfjakob> Manually triggering the OOM killer does not work anymore in 4.0.5
> > <rfjakob> This is what it looks like: https://gist.github.com/rfjakob/346b7dc611fc3cdf4011
> > <rfjakob> Basically, what happens is that the GPU driver frees some memory, that satisfies the OOM killer
> > <rfjakob> But the memory is allocated immediately again, and in the, no processes are killed no matter how often you trigger the oom killer
> > <rfjakob> "in the end"
> >
> > Quoting from the github:
> > "
> > [19291.202062] sysrq: SysRq : Manual OOM execution
> > [19291.208335] Purging GPU memory, 74399744 bytes freed, 8728576 bytes still pinned.
> > [19291.390767] sysrq: SysRq : Manual OOM execution
> > [19291.396792] Purging GPU memory, 74452992 bytes freed, 8728576 bytes still pinned.
> > [19291.560349] sysrq: SysRq : Manual OOM execution
> > [19291.566018] Purging GPU memory, 75489280 bytes freed, 8728576 bytes still pinned.
> > [19291.729944] sysrq: SysRq : Manual OOM execution
> > [19291.735686] Purging GPU memory, 74399744 bytes freed, 8728576 bytes still pinned.
> > [19291.918637] sysrq: SysRq : Manual OOM execution
> > [19291.924299] Purging GPU memory, 74403840 bytes freed, 8728576 bytes still pinned.
> > "
> >
> > The issue is that sysrq+f (force_kill) gets confused by the regular OOM
> > heuristic, which tries to avoid invoking the OOM killer if any of the
> > oom notifiers can release some memory. The heuristic doesn't make much
> > sense for the sysrq+f path because this one is used by the administrator
> > to kill a memory hog.
> >
> > Reported-by: Jakob Unterwurzacher <jakobunt@xxxxxxxxx>
> > Signed-off-by: Michal Hocko <mhocko@xxxxxxx>
>
> Nack, the oom notify list has no place in the oom killer, it should be
> called in the page allocator before calling out_of_memory().

I cannot say I like the oom notifier interface. Quite the contrary, it is
just a crude hack. It lives outside of the shrinker interface, which is
what reclaim uses, and it acts as a last attempt before OOM
(e.g. i915_gem_shrinker_init registers both "shrinkers"). So I am not
sure it belongs outside of the oom killer proper.

Besides that, out_of_memory already contains shortcuts to prevent killing
a task. Why is this any different? I mean, why shouldn't callers of
out_of_memory check whether the task is killed or exiting before
calling out_of_memory?
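
A rough userspace model of the two shortcuts discussed above may make the
comparison easier to follow. This is not the mm/oom_kill.c code: every name
in it (model_state, model_run_notifiers, model_out_of_memory, model_demo) is
made up for illustration, and it assumes that the sysrq+f path passes
force_kill == true and that only the notifier check is skipped in that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; none of these names are kernel symbols. */
struct model_state {
	unsigned long notifier_frees;	/* bytes the notifiers would hand back */
	bool current_is_exiting;	/* caller is already killed or exiting */
	int notifier_calls;		/* how often the notifiers were run */
	int kills;			/* how many victims were selected */
};

/* Stand-in for running the oom notifier chain (e.g. a GPU purge). */
static unsigned long model_run_notifiers(struct model_state *s)
{
	s->notifier_calls++;
	return s->notifier_frees;
}

/*
 * Returns true when a victim was killed, false when one of the
 * shortcuts decided that no kill is needed.
 */
static bool model_out_of_memory(struct model_state *s, bool force_kill)
{
	/* Shortcut 1: assumed to be skipped when the admin forces a kill. */
	if (!force_kill && model_run_notifiers(s) > 0)
		return false;

	/* Shortcut 2: the caller is going away and will free its memory. */
	if (s->current_is_exiting)
		return false;

	s->kills++;
	return true;
}

/* Replays the reported situation: notifiers always free something. */
void model_demo(void)
{
	struct model_state s = { .notifier_frees = 74399744UL };

	/* Allocator path: the notifiers freed memory, nobody is killed. */
	assert(!model_out_of_memory(&s, false));
	/* Forced path: the notifiers are not consulted, a victim is killed. */
	assert(model_out_of_memory(&s, true));
	assert(s.notifier_calls == 1 && s.kills == 1);
}
```

In this model both shortcuts sit in the same function, so moving only the
notifier one out to the callers would split two checks that serve the same
purpose.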

--
Michal Hocko
SUSE Labs