Re: [PATCH 9/9] workqueue: Implement system-wide nr_active enforcement for unbound workqueues

From: Tejun Heo
Date: Thu Jan 25 2024 - 11:18:24 EST


Hello, Lai.

On Wed, Jan 24, 2024 at 10:54:26AM +0800, Lai Jiangshan wrote:
> If an active item for a PWQ is canceled, the PWQ will lose a hard-earned
> nr_active and have to wait on the round-robbin queue for another nr_active.
>
> It seems it is unfair for this PWQ. If a user's program pattern is
> queuing-checking-cancelling items, it can cause delays for other users
> sharing the same workqueue.

pwqs get RR-queued iff there's congestion in the workqueue, and when an
inactive work item gets activated, it gets dispatched for execution right
away. Note that for unbound workqueues, need_more_worker() is always true
when there are active work items on the worklist.
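
To illustrate the need_more_worker() point, here's a simplified userspace
model, not the actual kernel code. struct toy_pool, its fields and
toy_need_more_worker() are made up for illustration, and the sketch assumes
that an unbound pool never counts its workers as running.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a worker pool; not the kernel's struct worker_pool. */
struct toy_pool {
	size_t nr_queued;	/* active work items sitting on the worklist */
	int nr_running;		/* workers counted as running; assumed to
				 * stay 0 for unbound pools */
};

/* Modeled condition: work is pending and no worker is counted as running. */
static bool toy_need_more_worker(const struct toy_pool *pool)
{
	return pool->nr_queued > 0 && pool->nr_running == 0;
}
```

With nr_running pinned at 0, the condition reduces to "the worklist is not
empty", so an activated work item asks for a worker right away.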

i.e. the only time there's a meaningful time window between a work item
getting activated and starting execution is when more workers need to be
created and the system is under significant memory pressure. Note that the
former is always a temporary condition as we retain workers for a while once
they're created.

IOW, there's no meaningful time window in which a work item that was
activated after waiting on the node_nr_active pending list can then get
canceled. It's a possible but fringe scenario which won't happen with any
meaningful frequency, and even when it does happen the impact isn't much to
worry about. I don't think it's a good idea to add complications for such
cases.

Thanks.

--
tejun