Re: [RFC PATCH v2 1/2] workqueue: Unbind workers before sending them to exit()

From: Tejun Heo
Date: Thu Jul 28 2022 - 12:35:48 EST


On Thu, Jul 28, 2022 at 11:54:19AM +0100, Valentin Schneider wrote:
> On 28/07/22 01:13, Lai Jiangshan wrote:
> > Quick review before going to sleep.
> >
>
> Thanks!
>
> > On Wed, Jul 27, 2022 at 7:54 PM Valentin Schneider <vschneid@xxxxxxxxxx> wrote:
> >> @@ -1806,8 +1806,10 @@ static void worker_enter_idle(struct worker *worker)
> >>  	/* idle_list is LIFO */
> >>  	list_add(&worker->entry, &pool->idle_list);
> >>
> >> -	if (too_many_workers(pool) && !timer_pending(&pool->idle_timer))
> >> -		mod_timer(&pool->idle_timer, jiffies + IDLE_WORKER_TIMEOUT);
> >> +	if (too_many_workers(pool) && !delayed_work_pending(&pool->idle_reaper_work))
> >> +		mod_delayed_work(system_unbound_wq,
> >> +				 &pool->idle_reaper_work,
> >> +				 IDLE_WORKER_TIMEOUT);
> >
> > system_unbound_wq doesn't have a rescuer.
> >
> > A new workqueue with a rescuer needs to be created and used for
> > this purpose.
> >
>
> Right, I think it makes sense for those work items to be attached to a
> WQ_MEM_RECLAIM workqueue. Should I add that as a workqueue-internal
> thing?

I don't understand why this would need MEM_RECLAIM when it isn't sitting in
the memory reclaim path. Nothing on the mm side can wait on this.

> > Since WORKER_DIE is set, the worker can possibly be freed now
> > if there is another source to wake it up.
> >
>
> My understanding for having reap_worker() be "safe" to use outside of
> raw_spin_lock_irq(pool->lock) is that pool->idle_list is never accessed
> outside of the pool->lock, and wake_up_worker() only wakes a worker that
> is in that list. So with destroy_worker() detaching the worker from
> pool->idle_list under pool->lock, I'm not aware of a codepath other than
> reap_worker() that could wake it up.

There actually are spurious wakeups. We can't depend on there being no
wakeups other than ours.
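
To illustrate the point, here is a minimal userspace sketch using pthreads.
It is an analogy only, not workqueue code: struct fake_worker,
fake_worker_fn() and run_wakeup_demo() are hypothetical names made up for
this example, and the "die" flag merely stands in for WORKER_DIE. The
sleeper re-checks its exit condition under the lock after every wakeup, so
a wakeup that did not come from the reaper is harmless.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pool worker; not the kernel's struct worker. */
struct fake_worker {
	pthread_mutex_t lock;
	pthread_cond_t wake;		/* the worker sleeps here */
	pthread_cond_t progress;	/* the worker reports state changes here */
	bool sleeping;
	bool die;			/* plays the role of WORKER_DIE */
	int ignored_wakeups;		/* wakeups seen while die was still false */
	bool exited_after_die;
};

static void *fake_worker_fn(void *arg)
{
	struct fake_worker *w = arg;

	pthread_mutex_lock(&w->lock);
	/* Never treat "I was woken" as "I was told to exit": re-check die. */
	while (!w->die) {
		w->sleeping = true;
		pthread_cond_broadcast(&w->progress);
		pthread_cond_wait(&w->wake, &w->lock);
		w->sleeping = false;
		if (!w->die) {
			w->ignored_wakeups++;
			pthread_cond_broadcast(&w->progress);
		}
	}
	w->exited_after_die = w->die;
	pthread_mutex_unlock(&w->lock);
	return NULL;
}

/* Wake the worker once without setting die, then once for real. */
static int run_wakeup_demo(struct fake_worker *w)
{
	pthread_t tid;

	pthread_mutex_init(&w->lock, NULL);
	pthread_cond_init(&w->wake, NULL);
	pthread_cond_init(&w->progress, NULL);
	w->sleeping = false;
	w->die = false;
	w->ignored_wakeups = 0;
	w->exited_after_die = false;

	if (pthread_create(&tid, NULL, fake_worker_fn, w))
		return -1;

	pthread_mutex_lock(&w->lock);
	while (!w->sleeping)
		pthread_cond_wait(&w->progress, &w->lock);

	/* A wakeup from "someone else": die is still false. */
	pthread_cond_signal(&w->wake);
	while (w->ignored_wakeups == 0)
		pthread_cond_wait(&w->progress, &w->lock);

	/* The real request to exit. */
	w->die = true;
	pthread_cond_signal(&w->wake);
	pthread_mutex_unlock(&w->lock);

	pthread_join(tid, NULL);
	return 0;
}
```

After run_wakeup_demo() returns 0, ignored_wakeups is at least 1 and the
worker has exited only once die was set. Any scheme that frees the worker
on the assumption that only one particular waker can ever wake it lacks
this kind of re-check.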

Thanks.

--
tejun