Re: More waitpid issues with CLONE_DETACHED/CLONE_THREAD
From: Linus Torvalds
Date: Sat Jan 31 2004 - 23:43:41 EST
On Sat, 31 Jan 2004, Daniel Jacobowitz wrote:
>
> This may be related to the python bug reported today...
Indeed.
Having a "waitpid(x, .., WNOHANG)" return 0 is a very interesting
condition. That condition basically guarantees that:
- the kernel did find the child
- but the kernel decided that the child cannot be reaped right then.
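
As a rough userspace illustration of the three outcomes being distinguished here, the sketch below classifies a single non-blocking waitpid() call. The helper names poll_child() and demo_still_running() are made up for this example; only waitpid(), fork(), kill() and pause() are real interfaces.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: classify one waitpid(pid, .., WNOHANG) call.
 * Returns  1 if the child was reaped,
 *          0 if the kernel found the child but could not reap it now,
 *         -1 if the call failed (for example ECHILD: no such child). */
int poll_child(pid_t pid, int *status)
{
    pid_t r = waitpid(pid, status, WNOHANG);

    if (r == pid)
        return 1;
    if (r == 0)
        return 0;
    return -1;
}

/* Hypothetical demo: a child that is still running yields 0, and a
 * child that has already been reaped yields -1. Returns 1 if both
 * observations hold. */
int demo_still_running(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        pause();                 /* child: sleep until killed */
        _exit(0);
    }

    int first = poll_child(pid, &status);   /* child alive */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);                /* blocking reap */
    int second = poll_child(pid, &status);  /* nothing left to find */

    return first == 0 && second == -1;
}
```

The simple case shown is a child that has not exited yet. The interesting case in this thread is the same return value of 0 for a child that is already a zombie.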
If you see the process as a Zombie in a "ps" listing, then we know that
the child still running isn't the reason why it couldn't be reaped. Can
you verify that /proc/<pid>/status shows it as "Z (zombie)"?
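
One way to script that check, as a minimal sketch: read the "State:" line from /proc/<pid>/status. The names proc_state() and demo_zombie_state() are invented for this example, and the demo relies on waitid() with WNOWAIT (a later addition than the kernel under discussion) purely to wait for the exit without reaping.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter from the "State:" line
 * of /proc/<pid>/status (e.g. 'R', 'S', 'Z'), or 0 on failure. */
char proc_state(pid_t pid)
{
    char path[64], line[256];
    char state = 0;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return 0;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "State: %c", &state) == 1)
            break;
    }
    fclose(f);
    return state;
}

/* Hypothetical demo: let a child exit, leave it unreaped, and look at
 * its state before finally reaping it. */
int demo_zombie_state(void)
{
    siginfo_t info;
    char st;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);

    /* Wait for the exit but leave the child in a waitable state. */
    if (waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) != 0)
        return -1;
    st = proc_state(pid);
    waitpid(pid, NULL, 0);       /* now actually reap it */
    return st;
}
```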
In fact, if we see it as "Z (zombie)", we know even more: it means that
wait_task_zombie() was never called, because that would have started out
with changing the process state to "X (dead)".
And that in turn implies that "eligible_child()" would have returned 2.
Which is a normal occurrence: it happens when a thread group leader still
has threads attached to it. At that point it may be a Zombie, but we can't
reap it yet. The threads have to go away before the thing can be reaped.
Can you verify that that process doesn't have any sub-threads? (Again,
that should be easily visible in /proc/<pid>/task/).
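
A small sketch of that check, assuming /proc is mounted: count the numeric entries under /proc/<pid>/task/. The function name count_tasks() is invented here. A count above 1 would suggest that threads other than the leader are still attached.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: number of task entries listed under
 * /proc/<pid>/task, or -1 if the directory cannot be opened. */
int count_tasks(pid_t pid)
{
    char path[64];
    struct dirent *e;
    DIR *d;
    int n = 0;

    snprintf(path, sizeof path, "/proc/%ld/task", (long)pid);
    d = opendir(path);
    if (!d)
        return -1;
    while ((e = readdir(d)) != NULL) {
        /* Skip "." and ".."; task entries are named by numeric TID. */
        if (isdigit((unsigned char)e->d_name[0]))
            n++;
    }
    closedir(d);
    return n;
}
```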
Another alternative is that the process is a zombie, but it is being
traced. When that happens, it shows up on the "ptrace_children" list, and
we'll see it in wait4(), but we won't be able to reap it.
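
From userspace, one hedged way to look for a tracer is the "TracerPid:" field of /proc/<pid>/status, which is 0 when nothing is tracing the process. This is only a sketch: tracer_pid() is a made-up name, and whether the field reflects the exact ptrace_children situation described above is an assumption, not something this thread confirms.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the TracerPid value from
 * /proc/<pid>/status, or -1 if it cannot be read. */
long tracer_pid(pid_t pid)
{
    char path[64], line[256];
    long tracer = -1;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%ld/status", (long)pid);
    f = fopen(path, "r");
    if (!f)
        return -1;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "TracerPid: %ld", &tracer) == 1)
            break;
    }
    fclose(f);
    return tracer;
}
```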
Roland, Ingo - have you followed the discussion on linux-kernel? Something
strange does seem to be going on..
Linus