Re: can't oom-kill zap the victim's memory?

From: Oleg Nesterov
Date: Sun Sep 20 2015 - 10:59:36 EST


On 09/20, Tetsuo Handa wrote:
>
> Oleg Nesterov wrote:
> > On 09/17, Kyle Walker wrote:
> > >
> > > Currently, the oom killer will attempt to kill a process that is in
> > > TASK_UNINTERRUPTIBLE state. For tasks in this state for an exceptional
> > > period of time, such as processes writing to a frozen filesystem during
> > > a lengthy backup operation, this can result in a deadlock condition as
> > > related processes' memory access will stall within the page fault
> > > handler.
> >
> > And there are other potential reasons for deadlock.
> >
> > Stupid idea. Can't we help the memory hog to free its memory? This is
> > orthogonal to other improvements we can do.
>
> So, we are trying to release memory without waiting for arriving at
> exit_mm() from do_exit(), right? If it works, it will be a simple and
> small change that will be easy to backport.
>
> The idea is that since fatal_signal_pending() tasks no longer return to
> user space, we can release memory allocated for use by user space, right?

Yes.

> Then, I think that this approach can be applied to not only OOM-kill case
> but also regular kill(pid, SIGKILL) case (i.e. kick from signal_wake_up(1)
> or somewhere?).

I don't think so... but we might want to do this if (say) we are not going
to kill someone else because fatal_signal_pending(current).

> A dedicated kernel thread (not limited to OOM-kill purpose)
> scans for fatal_signal_pending() tasks and releases that task's memory.

Perhaps a dedicated kernel thread makes sense (see other emails),
but I don't think it should scan the killed threads. oom-kill should
kick it.
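
To illustrate "kick, don't scan", here is a minimal userspace sketch. It is
not kernel code and not a proposed patch: struct task, oom_kill(),
reaper_run() and the reap_queue list are all hypothetical names made up for
this example, and "freeing" memory is modelled as zeroing a page counter.
The only point is that the reaper touches the victims handed to it and never
walks the full task list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct task {
	int pid;
	bool fatal_signal_pending;	/* set once the task has been killed */
	size_t user_pages;		/* memory only user space would use */
	struct task *next_to_reap;	/* link in the reaper's queue */
};

static struct task *reap_queue;	/* victims handed over by oom_kill() */

/* OOM killer side: mark the victim and kick the reaper with it. */
static void oom_kill(struct task *victim)
{
	assert(victim != NULL);

	if (victim->fatal_signal_pending)
		return;			/* already killed and queued */

	victim->fatal_signal_pending = true;
	victim->next_to_reap = reap_queue;
	reap_queue = victim;
}

/* Reaper side: handle only queued victims, never scan all tasks. */
static size_t reaper_run(void)
{
	size_t freed = 0;

	while (reap_queue) {
		struct task *t = reap_queue;

		reap_queue = t->next_to_reap;
		t->next_to_reap = NULL;

		/* The victim will not return to user space. */
		if (t->fatal_signal_pending) {
			freed += t->user_pages;
			t->user_pages = 0;
		}
	}
	return freed;
}
```

A task that was never passed to oom_kill() keeps its pages, and a second
reaper_run() with nothing queued frees nothing.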

Anyway, let me repeat: there are a lot of details we might want to
discuss. But the initial changes should be as simple as possible, imo.

Oleg.
