Re: [PATCH v3] kthread: Make kthread_create() killable.

From: David Rientjes
Date: Thu Sep 26 2013 - 14:53:56 EST


On Thu, 26 Sep 2013, Tetsuo Handa wrote:

> > wait_for_completion() is scary if that completion requires memory that
> > cannot be allocated because the caller is killed but uninterruptible.
>
> I don't think these lines are specific to wait_for_completion() users.
>
> Currently the OOM killer is disabled throughout from "the moment the OOM killer
> chose a process to kill" to "the moment the task_struct of the chosen process
> becomes unreachable". Any blocking functions which wait in TASK_UNINTERRUPTIBLE
> (e.g. mutex_lock()) can disable the OOM killer if the current thread is chosen
> by the OOM killer. Therefore, any users of blocking functions which wait in
> TASK_UNINTERRUPTIBLE are considered scary if they assume that the current
> thread will not be chosen by the OOM killer.
>

Yeah, that's always been true.

> But it seems to me that re-enabling the OOM killer at some point is more
> realizable than purging all such users.
>
> To re-enable the OOM killer at some point, the OOM killer needs to choose more
> processes if the to-be-killed process cannot be terminated within an adequate
> period.
>
> For example, add "unsigned long memdie_stamp;" to "struct task_struct" and do
> "p->memdie_stamp = jiffies + 5 * HZ;" before "set_tsk_thread_flag(p, TIF_MEMDIE);"
> and do
>
> 	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
> 		if (unlikely(frozen(task)))
> 			__thaw_task(task);
> +		/* Choose more processes if the chosen process cannot die. */
> +		if (time_after(jiffies, task->memdie_stamp) &&
> +		    task->state == TASK_UNINTERRUPTIBLE)
> +			return OOM_SCAN_CONTINUE;
> 		if (!force_kill)
> 			return OOM_SCAN_ABORT;
> 	}
>
> in oom_scan_process_thread().
>
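
For readers unfamiliar with the jiffies idiom in the quoted proposal, here is a
small userspace sketch of the deadline check it relies on. It is only a model:
MODEL_HZ, model_time_after(), model_memdie_stamp() and model_victim_overdue()
are made-up names for illustration, and the HZ value of 1000 is an assumption.
The comparison uses the same signed-difference arithmetic as the kernel's
time_after(), which is what keeps the check correct when the counter wraps.

```c
#include <stdbool.h>
#include <limits.h>

/* Userspace stand-in for the kernel's HZ; 1000 is an assumed value. */
#define MODEL_HZ 1000UL

/*
 * Same arithmetic as the kernel's time_after(a, b), without the typecheck():
 * true if a is later than b, even if the counter wrapped in between.
 */
bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical helper: deadline five seconds after "now", as in the proposal. */
unsigned long model_memdie_stamp(unsigned long now)
{
	return now + 5 * MODEL_HZ;
}

/* Hypothetical helper: true once the chosen victim has outlived its deadline. */
bool model_victim_overdue(unsigned long now, unsigned long stamp)
{
	return model_time_after(now, stamp);
}
```

A plain "now > stamp" comparison would misfire when the stamp wraps past
ULONG_MAX; the signed difference does not, as long as the two values are less
than half the counter range apart.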

There may not be any eligible processes left, and then the machine panics.
These time-based delays have also caused complete depletion of memory
reserves when more than one process is chosen and each consumes a
non-negligible amount of memory, which then causes livelock. We used to
have a jiffies-based rekill in 2.6.18 internally, and we could finally
remove it once the mm->mmap_sem issues were fixed (mostly by checking for
fatal_signal_pending() and aborting when necessary).
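
The check-and-abort pattern can be modelled in userspace roughly as follows.
This is a sketch under stated assumptions, not the kernel code: struct
model_task, struct model_lock and model_lock_killable() are hypothetical
names, the fatal_signal field stands in for fatal_signal_pending(), and the
polling loop stands in for a real sleeping wait. In the kernel the same idea
is provided by helpers such as mutex_lock_killable() and
wait_for_completion_killable().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the part of task_struct we need. */
struct model_task {
	bool fatal_signal;	/* models fatal_signal_pending(task) */
};

/* Hypothetical stand-in for a contended lock such as mm->mmap_sem. */
struct model_lock {
	int busy_polls;		/* polls remaining until the lock is free */
};

/*
 * Try to take the lock, but return -EINTR as soon as the caller has been
 * killed, so that a dying task can exit and free its memory instead of
 * blocking. max_polls bounds the loop so this model always terminates;
 * running out of polls returns -EAGAIN.
 */
int model_lock_killable(struct model_lock *lock,
			const struct model_task *task, int max_polls)
{
	assert(lock != NULL && task != NULL);

	for (int i = 0; i < max_polls; i++) {
		if (task->fatal_signal)
			return -EINTR;	/* abort: let the victim die */
		if (lock->busy_polls == 0)
			return 0;	/* lock acquired */
		lock->busy_polls--;
	}
	return -EAGAIN;
}
```

The point of the model is that the killed task never stays parked behind the
lock: once it is marked for death it unwinds, which is what makes a time-based
rekill unnecessary.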

> [PATCH v3] kthread: Make kthread_create() killable.
>
> Any user process callers of wait_for_completion() except global init process
> might be chosen by the OOM killer while waiting for completion() call by some
> other process which does memory allocation.
>
> When such users are chosen by the OOM killer when they are waiting for
> completion() in TASK_UNINTERRUPTIBLE, the system will be kept stressed
> due to memory starvation because the OOM killer cannot kill such users.
>
> kthread_create() is one of such users and this patch fixes the problem for
> kthreadd by making kthread_create() killable.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Cc: Oleg Nesterov <oleg@xxxxxxxxxx>
> Acked-by: David Rientjes <rientjes@xxxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

Absolutely, thanks.