Re: [RFC PATCH(Experimental) 2/4] Revert changes to workqueue.c

From: Srivatsa Vaddagiri
Date: Fri Feb 16 2007 - 11:48:05 EST


On Fri, Feb 16, 2007 at 06:33:21PM +0300, Oleg Nesterov wrote:
> I take my words back. It is not "ugly" any longer because with this change
> we don't do kthread_stop()->wakeup_process() while cwq->thread may sleep in
> work->func(). Still I don't see (ok, I am biased and probably wrong, please
> correct me) why kthread_stop+wait_to_die is better than cwq_should_stop(),
> see below.

I just like using existing code (kthread_stop) as much as possible and not add
new code (->should_stop). Also the 'while (cwq->thread != NULL)' loop in
cleanup_workqueue_thread is not neat IMHO, compared to kthread_stop+wait_to_die.

Pls compare cleanup_workqueue_thread() in 2.6.20-mm1 and what is
proposed in the patch :-

2.6.20-mm1 (cwq->should_stop)
=============================

static void cleanup_workqueue_thread(struct cpu_workqueue_struct *cwq, int cpu)
{
struct wq_barrier barr;
int alive = 0;

spin_lock_irq(&cwq->lock);
if (cwq->thread != NULL) {
insert_wq_barrier(cwq, &barr, 1);
cwq->should_stop = 1;
alive = 1;
}
spin_unlock_irq(&cwq->lock);

if (alive) {
wait_for_completion(&barr.done);

while (unlikely(cwq->thread != NULL))
cpu_relax();
/*
* Wait until cwq->thread unlocks cwq->lock,
* it won't touch *cwq after that.
*/
smp_rmb();
spin_unlock_wait(&cwq->lock);
}
}



Patch (based on kthread_should_stop)
====================================

static void cleanup_workqueue_thread(struct workqueue_struct *wq, int cpu)
{
struct cpu_workqueue_struct *cwq = per_cpu_ptr(wq->cpu_wq, cpu);

if (cwq->thread != NULL) {
kthread_stop(cwq->thread);
cwq->thread = NULL;
}
}


The version using kthread_should_stop() is more simple IMO.


> > I feel it is nice if the cleanup is synchronous i.e when cpu_down() is
> > complete, all the dead cpu's worker threads would have terminated.
> > Otherwise we expose races between CPU_UP_PREPARE/kthread_create and the
> > (old) thread exiting.
>
> Please look at 2.6.20-mm1, cleanup is synchronous. Probably we misunderstood
> each other looking at different code.

Ok ..I hadnt looked at 2.6.20-mm1 (it wasnt out when we posted the
patch). Neverthless I think most of our intended changes would apply for
2.6.20-mm1 also. We will post a new version (breaking down workqueue changes
as you want) against 2.6.20-mm1.

> > How abt retaining the break above but setting cwq->thread = NULL in
> > create_workqueue_thread in failure case?
>
> Perhaps do it, but why? The failure should be rare, and it is a bit
> dangerous to have workqueue_struct which was not properly initialized.
> Suppose we change CPU_UP_PREPARE so it is called before freeze_processes()
> stage, then we have a problem.

Ok ..no problem. Will not add the 'break' there!

> Srivatsa, don't get we wrong. I can't judge about using freezer for cpu hotplug,
> but yes, we can improve workqueue.c in this case! But this changes should be
> small and understandable. When cpu hotplug is converted, we don't need _any_
> changes in workqueue.c, it should work (except s/CPU_DEAD/CPU_DEAD_KILL_THREADS
> if you insist).

Note with the change proposed in refrigerator, we can avoid
CPU_DEAD_KILL_THREADS and do all cleanup in CPU_DEAD itself.

> No more changes are required, cwq_should_stop() just works
> because it is more flexible than kthread_should_stop().

What is more flexible abt cwq_should_stop()?


--
Regards,
vatsa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/