[PATCHSET wq/for-6.9,6.10] workqueue: Implement disable/enable_work()

From: Tejun Heo
Date: Fri Feb 16 2024 - 13:06:23 EST


Hello,

4cb1ef64609f ("workqueue: Implement BH workqueues to eventually replace
tasklets") implemented workqueues that execute work items in the BH context
with the goal of eventually replacing tasklet.

While the existing workqueue API covers the basic queueing and canceling
operations, tasklet also has tasklet_disable*() which blocks the execution
of the tasklet until it's re-enabled with tasklet_enable(). The interface if
fairly widely used and workqueue currently doesn't have a counterpart.

This patchset implements disable/enable_work() and the delayed_work
counterparts to address the gap. The ability to block future executions is
something which some users asked for in the past, and, while not essential
as the caller can and often has to shutdown the queuer anyway, it's a nice
convenience to have. Also, timer grew a similar feature recently with
timer_shutdown().

- tasklet_disable() keeps disable depth so that a tasklet can be disabled
and re-enabled by multiple parties at the same time. Workqueue is updated
to carry 16bit disable count in work->data while the work item is not
queued. When non-zero, attempts to queue the work item fail.

- The cancel_work_sync() path used WORK_OFFQ_CANCELING to synchronize
multiple cancel_sync attempts. This added a completely separate wait
mechanism which wasn't very congruent with the rest of workqueue. This was
because the canceling state was carried in a way which couldn't
accommodate multiple concurrent uses. This mechanism is replaced by
disable - cancel_sync now simply disables the work item, flushes and
re-enables it.

- There is a wart in how tasklet_disable/enable() works. When a tasklet is
disabled, if the tasklet is queued, it keeps the softirq constantly raised
until the tasklet is re-enabled and executed. This makes disabling
unnecessarily expensive and brittle. The only busy looping workqueue's
implementation does is on the party that's trying to cancel_sync or
disable_sync to wait for the completion of the currently executing
instance, which should be safe as long as it's from process and BH
contexts.

- A disabled tasklet remembers whether it was queued while disabled and
starts executing when re-enabled. It turns out doing this with work items
is challenging as there are a lot more to remember and the synchronization
becomes more complicated too. Looking at the use cases and given the
continuity from how cancel_work_sync() works, it seems better to just
ignore queueings which happen while a work item is disabled and require
the users to explicitly queue the work item after re-enabling as
necessary. Most users should be able to re-enqueue unconditionally after
enabling.

This patchset is on top of wq/for-6.9 and contains the following 17 patches:

0001-workqueue-Cosmetic-changes.patch
0002-workqueue-Use-rcu_read_lock_any_held-instead-of-rcu_.patch
0003-workqueue-Rename-__cancel_work_timer-to-__cancel_tim.patch
0004-workqueue-Reorganize-flush-and-cancel-_sync-function.patch
0005-workqueue-Use-variable-name-irq_flags-for-saving-loc.patch
0006-workqueue-Introduce-work_cancel_flags.patch
0007-workqueue-Clean-up-enum-work_bits-and-related-consta.patch
0008-workqueue-Factor-out-work_grab_pending-from-__cancel.patch
0009-workqueue-Remove-clear_work_data.patch
0010-workqueue-Make-flags-handling-consistent-across-set_.patch
0011-workqueue-Preserve-OFFQ-bits-in-cancel-_sync-paths.patch
0012-workqueue-Implement-disable-enable-for-delayed-work-.patch
0013-workqueue-Remove-WORK_OFFQ_CANCELING.patch
0014-workqueue-Remember-whether-a-work-item-was-on-a-BH-w.patch
0015-workqueue-Update-how-start_flush_work-is-called.patch
0016-workqueue-Allow-cancel_work_sync-and-disable_work-fr.patch
0017-r8152-Convert-from-tasklet-to-BH-workqueue.patch

0001-0010 are cleanup and prep patches with the only functional change being
the use of rcu_read_lock_any_held() instead of rcu_read_lock() in 0002. I'll
apply them to wq/for-6.9 unless there are objections. I thought about making
these a separate patch series but the cleanups make more sense as a part of
this series.

For the rest of the series, given how many invasive workqueue changes are
already queued for v6.9 and the subtle nature of these patches, I think it'd
be best to defer them to the one after that so that we can use v6.9 as an
intermediate verification point.

0011-0012 implement disable_work() and enable_work(). At this stage, all
disable[_sync]_work and enable_work operations might_sleep(). disable_work()
and enable_work() due to CANCELING synchronization described above.
disable_work_sync() also needs to wait for the in-flight work item to finish
which requires blocking.

0013 replaces CANCELING with internal use of disble/enable. This removes one
ugliness from workqueue code and allows disable_work() and enable_work() to
be used from atomic contexts.

0014-0016 implement busy-wait for BH work items when they're being canceled
thus allowing cancel_work_sync() and disable_work_sync() to be called from
atomic contexts for them. This makes workqueue interface a superset of
tasklet and also makes BH workqueues easier to live with.

0017 converts drivers/net/r8152.c from tasklet to BH workqueue as a
demonstration. It seems to work fine.

The patchset is on top of wq/for-6.9 fd0a68a2337b ("workqueue, irq_work:
Build fix for !CONFIG_IRQ_WORK") and also available in the following git
branch:

git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git disable_work-v1

diffstat follows. Thanks.

drivers/net/usb/r8152.c | 44 ++--
include/linux/workqueue.h | 69 ++++---
kernel/workqueue.c | 623 ++++++++++++++++++++++++++++++++++++++++----------------------------
3 files changed, 441 insertions(+), 295 deletions(-)

--
tejun