Re: [RFC PATCH(experimental) 2/2] Fix freezer-kthread_stop race

From: Oleg Nesterov
Date: Mon Apr 23 2007 - 14:40:34 EST


On 04/23, Gautham R Shenoy wrote:
>
> On Sat, Apr 21, 2007 at 01:12:09AM +0400, Oleg Nesterov wrote:
> > On 04/19, Gautham R Shenoy wrote:
> > >
> > > @@ -63,12 +74,16 @@ void refrigerator(void)
> > > recalc_sigpending(); /* We sent fake signal, clean it up */
> > > spin_unlock_irq(&current->sighand->siglock);
> > >
> > > + task_lock(current);
> > > for (;;) {
> > > set_current_state(TASK_UNINTERRUPTIBLE);
> > > if (!frozen(current))
> > > break;
> > > + task_unlock(current);
> > > schedule();
> > > + task_lock(current);
> > > }
> > > + task_unlock(current);
> > > pr_debug("%s left refrigerator\n", current->comm);
> > > current->state = save;
> >
> > Just curious, why this change?
>
> This can race with hold_freezer_for_task() calling thaw_process. Earlier
> thaw_process(p) was called only after the process 'p' was frozen.
> Now with hold_freezer_for_task(), we can as well call thaw_process(p)
> when 'p' is in the freezing stage. Hence the task_lock.

hold_freezer_for_task()->thaw_process(p) will wake up the task. Or the
caller of refrigerator will notice "!frozen()". Note that refrigerator()
sets PF_FROZEN under task_lock().

In fact we have the same issue when thaw_tasks()->thaw_process(p) happens
when the freezing fails. In that case 'p' may be not frozen.

> > Also, you are planning to add different freezing states (FE_HOTPLUG_CPU,
> > FE_SUSPEND, etc). In that case each of them needs a separate .count, because
> > it should be negative when try_to_freeze_tasks() returns. Now consider
> > the case when we are doing freeze_processes(FE_A | FE_B) ...
>
> So can't we in that case find out the weight of the freeze_event variable and
> subtract that weight from the count (if the count is <=0 ) ?

Probably yes... but if we are speaking about kthrad_stop() only, this could
be afaics solved in more simple way, as Rafael suggests.

Oleg.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/