Re: [RFC PATCH(experimental) 2/2] Fix freezer-kthread_stop race

From: Gautham R Shenoy
Date: Mon Apr 23 2007 - 16:36:39 EST


On Mon, Apr 23, 2007 at 10:39:56PM +0400, Oleg Nesterov wrote:
> On 04/23, Gautham R Shenoy wrote:
> >
> > On Sat, Apr 21, 2007 at 01:12:09AM +0400, Oleg Nesterov wrote:
> > > On 04/19, Gautham R Shenoy wrote:
> > > >
> > > > @@ -63,12 +74,16 @@ void refrigerator(void)
> > > >  	recalc_sigpending(); /* We sent fake signal, clean it up */
> > > >  	spin_unlock_irq(&current->sighand->siglock);
> > > >
> > > > +	task_lock(current);
> > > >  	for (;;) {
> > > >  		set_current_state(TASK_UNINTERRUPTIBLE);
> > > >  		if (!frozen(current))
> > > >  			break;
> > > > +		task_unlock(current);
> > > >  		schedule();
> > > > +		task_lock(current);
> > > >  	}
> > > > +	task_unlock(current);
> > > >  	pr_debug("%s left refrigerator\n", current->comm);
> > > >  	current->state = save;
> > >
> > > Just curious, why this change?
> >
> > This can race with hold_freezer_for_task() calling thaw_process(). Earlier,
> > thaw_process(p) was called only after the process 'p' was frozen.
> > Now, with hold_freezer_for_task(), thaw_process(p) can also be called
> > while 'p' is still in the freezing stage. Hence the task_lock().
>
> hold_freezer_for_task()->thaw_process(p) will wake up the task. Or the
> caller of refrigerator will notice "!frozen()". Note that refrigerator()
> sets PF_FROZEN under task_lock().
>
> In fact we have the same issue when thaw_tasks()->thaw_process(p) happens
> after the freezing fails. In that case 'p' may not be frozen.

Yes. I guess I was being too cautious :)
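
To convince myself, here is a small user-space model of the unlocked wait
loop. It is only a sketch of my reading of Oleg's argument, not kernel code:
struct task, model_thaw(), inject() and model_refrigerator_exits() are
made-up names, the model starts after PF_FROZEN has already been set, and
it assumes thaw_process() clears the flag and then wakes the task. The thaw
is injected at every possible point in the loop, and a "lost wakeup" would
show up as a false return.

```c
#include <stdbool.h>

enum task_state { TASK_RUNNING, TASK_UNINTERRUPTIBLE };

struct task {
	enum task_state state;
	bool frozen;		/* stands in for PF_FROZEN */
};

/* Assumed model of thaw_process(): clear the flag, then wake the task. */
static void model_thaw(struct task *t)
{
	t->frozen = false;
	t->state = TASK_RUNNING;
}

/* Fire the thaw once, when the interleaving point counter hits thaw_at. */
static void inject(struct task *t, int *point, int thaw_at, bool *thawed)
{
	if (!*thawed && (*point)++ == thaw_at) {
		model_thaw(t);
		*thawed = true;
	}
}

/*
 * Run the unlocked wait loop with the thaw injected at point thaw_at
 * (0 = before the first set_current_state()).
 * Returns true if the loop exits, false on a lost wakeup.
 */
bool model_refrigerator_exits(int thaw_at)
{
	struct task t = { TASK_RUNNING, true };
	bool thawed = false;
	int point = 0;

	for (int iter = 0; iter < 3; iter++) {
		inject(&t, &point, thaw_at, &thawed);
		t.state = TASK_UNINTERRUPTIBLE;	/* set_current_state() */

		inject(&t, &point, thaw_at, &thawed);
		if (!t.frozen)			/* !frozen(current) */
			return true;

		inject(&t, &point, thaw_at, &thawed);
		/* schedule(): a task that was already woken does not sleep. */
		if (t.state != TASK_RUNNING) {
			if (thawed)
				return false;	/* nobody left to wake us */
			model_thaw(&t);		/* thaw finds us asleep */
			thawed = true;
		}
	}
	return false;
}
```

In this model every injection point ends with the loop exiting: either the
!frozen() check sees the cleared flag, or the wakeup turns schedule() into a
no-op and the next pass sees it. That matches the point that the extra
task_lock() around the loop buys nothing.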

>
> > > Also, you are planning to add different freezing states (FE_HOTPLUG_CPU,
> > > FE_SUSPEND, etc). In that case each of them needs a separate .count, because
> > > it should be negative when try_to_freeze_tasks() returns. Now consider
> > > the case when we are doing freeze_processes(FE_A | FE_B) ...
> >
> > So can't we in that case find out the weight of the freeze_event variable and
> > subtract that weight from the count (if the count is <= 0)?
>
> Probably yes... but if we are speaking about kthread_stop() only, this could
> be afaics solved in a simpler way, as Rafael suggests.

I agree on this one. kthread_stop appears to be the only valid place
where such a correction is required. So Rafael's approach makes sense
here.
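
For the record, the weight-based correction I had in mind would look
roughly like the sketch below. This is user-space C for illustration only:
the bit values of FE_HOTPLUG_CPU and FE_SUSPEND are assumptions (the
encoding has not been decided), and event_weight() and
adjust_freezer_count() are hypothetical names. In the kernel the weight
would come from hweight32().

```c
#include <stdint.h>

/* Hypothetical bit values; the encoding is not fixed anywhere yet. */
#define FE_HOTPLUG_CPU	(1u << 0)
#define FE_SUSPEND	(1u << 1)

/* Number of set bits in the mask, i.e. its weight. */
static unsigned int event_weight(uint32_t mask)
{
	unsigned int n = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Subtract one per requested freeze event, but only if count <= 0. */
int adjust_freezer_count(int count, uint32_t freeze_event)
{
	if (count <= 0)
		count -= (int)event_weight(freeze_event);
	return count;
}
```

With freeze_processes(FE_A | FE_B) the weight is 2, so a count of 0 would
drop to -2, while a positive count is left alone.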

>
> Oleg.
>
Thanks and Regards
gautham.

--
Gautham R Shenoy
Linux Technology Center
IBM India.
"Freedom comes with a price tag of responsibility, which is still a bargain,
because Freedom is priceless!"