Re: [PATCH] mm: skip zombie in OOM-killer

From: David Rientjes
Date: Mon Mar 07 2011 - 15:43:21 EST


On Mon, 7 Mar 2011, Andrew Vagin wrote:

> > Andrey is patching the case where an eligible TIF_MEMDIE process is found
> > but it has already detached its ->mm.  In combination with the patch
> > posted to linux-mm, oom: prevent unnecessary oom kills or kernel panics,
> > which makes select_bad_process() iterate over all threads, it is an
> > effective solution.
>
> You are probably referring to the first version of my patch.
> This version is incorrect because of
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=dd8e8f405ca386c7ce7cbb996ccd985d283b0e03
>
> but my first patch is correct, and it has a simple reproducer (I
> attached it). If you run it, your kernel hangs: the parent doesn't
> wait for its children, so one child (a zombie) keeps the TIF_MEMDIE
> flag and the OOM killer kills nobody.
>

The second version of your patch works fine in combination with the
pending "oom: prevent unnecessary oom kills or kernel panics" patch from
linux-mm (included below). Try your test case with both this patch and
the second version of your patch.

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -292,11 +292,11 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		unsigned long totalpages, struct mem_cgroup *mem,
 		const nodemask_t *nodemask)
 {
-	struct task_struct *p;
+	struct task_struct *g, *p;
 	struct task_struct *chosen = NULL;
 	*ppoints = 0;

-	for_each_process(p) {
+	do_each_thread(g, p) {
 		unsigned int points;

 		if (oom_unkillable_task(p, mem, nodemask))
@@ -324,7 +324,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		 * the process of exiting and releasing its resources.
 		 * Otherwise we could get an easy OOM deadlock.
 		 */
-		if (thread_group_empty(p) && (p->flags & PF_EXITING) && p->mm) {
+		if ((p->flags & PF_EXITING) && p->mm) {
 			if (p != current)
 				return ERR_PTR(-1UL);

@@ -337,7 +337,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 			chosen = p;
 			*ppoints = points;
 		}
-	}
+	} while_each_thread(g, p);

 	return chosen;
 }
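
To illustrate why the two changes depend on each other, here is a rough
userspace sketch. It is a toy model, not kernel code: struct sim_task,
sim_select() and the SIM_* constants are hypothetical names invented for
this example, and the "skip tasks without an mm" check is only an
assumption about what the second version of the zombie patch does. It
models one thread group whose leader is a zombie with a detached mm while
a sibling thread still holds the memory.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's definitions. */
#define SIM_PF_EXITING 0x1u
#define SIM_NONE (-1)	/* nothing eligible was found */
#define SIM_WAIT (-2)	/* an exiting task still holds memory: wait for it */

struct sim_task {
	unsigned int flags;	/* SIM_PF_EXITING or 0 */
	int has_mm;		/* models p->mm != NULL */
	int is_leader;		/* models thread group leadership */
	unsigned int points;	/* made-up badness score */
};

/*
 * Walk either group leaders only (like for_each_process) or every
 * thread (like do_each_thread), skip tasks that have detached their
 * mm, and return the index of the highest-scoring remaining task.
 */
static int sim_select(const struct sim_task *t, size_t n, int all_threads)
{
	int chosen = SIM_NONE;
	unsigned int best = 0;
	size_t i;

	assert(t != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!all_threads && !t[i].is_leader)
			continue;	/* leader-only walk never sees this thread */
		if (!t[i].has_mm)
			continue;	/* assumed zombie check: nothing left to free */
		if (t[i].flags & SIM_PF_EXITING)
			return SIM_WAIT;	/* exiting but still holding its mm */
		if (t[i].points > best) {
			best = t[i].points;
			chosen = (int)i;
		}
	}
	return chosen;
}

/* One group: the leader is a zombie, a sibling thread still owns the mm. */
static const struct sim_task sim_zombie_leader[] = {
	{ SIM_PF_EXITING, 0, 1, 0 },	/* leader, mm already detached */
	{ 0,              1, 0, 50 },	/* sibling thread, still has an mm */
};
```

With this data, sim_select(sim_zombie_leader, 2, 0) returns SIM_NONE,
because the leader-only walk sees just the mm-less leader, while
sim_select(sim_zombie_leader, 2, 1) returns index 1, the sibling thread.
In this model, skipping zombies only finds a victim when the walk covers
every thread.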