Re: [PATCH 09/11] job control: reorganize wait_task_stopped()

From: Oleg Nesterov
Date: Fri May 13 2011 - 13:23:14 EST


Hi,

On 05/13, Tejun Heo wrote:
>
> On Thu, May 12, 2011 at 08:33:26PM +0200, Oleg Nesterov wrote:
>
> > At first glance, do_wait() does
> >
> > wait_task_stopped();
> > wait_task_continued();
> >
> > and the state can be changed CONTINUED -> STOPPED in between, right?
> > Or something else?
>
> Yeah and exit transitions too.

I am not sure... but this probably depends on the definition.

We have already checked ->exit_state != EXIT_ZOMBIE, and we are holding
tasklist_lock. The child can't exit. I mean, it can't change its ->exit_state.

However, SIGKILL can clear SIGNAL_STOP_STOPPED, and we can "miss" it.
But this looks correct: the child is no longer stopped, but it is still
not dead. So I think in this case wait(WNOHANG | WEXITED | WSTOPPED)
can fail, and this is not a bug.

OTOH, perhaps SIGKILL should set SIGNAL_STOP_CONTINUED in this case?
And keep it if it was already set.

> There simply is no synchronization
> there. We can probably solve it without acquiring siglock by adding
> "clear this before making state transitions" flag followed by a mb()

Perhaps even simpler, if the ->EXIT transition is fine.

Oleg.
