Re: work item still be scheduled to execute after destroy_workqueue?

From: richard clark
Date: Mon Dec 05 2022 - 23:36:03 EST


Hi Lai,

On Tue, Dec 6, 2022 at 12:13 PM Lai Jiangshan <jiangshanlai@xxxxxxxxx> wrote:
>
> On Mon, Dec 5, 2022 at 2:18 PM richard clark
> <richard.xnu.clark@xxxxxxxxx> wrote:
> >
> > Hi Lai and Tejun,
> >
> > Why can the work still be queued to and executed when queueing it to a
> > wq has been destroyed, for instance, the below code snippet in a
> > kernel module:
> > ---------------------->8---------------------
> >
> > struct workqueue_struct *wq0;
> > #define MAX_ACTIVE_WORKS (3)
> >
> > static void work0_func(struct work_struct *work);
> > static void work1_func(struct work_struct *work);
> > static void work2_func(struct work_struct *work);
> >
> > static DECLARE_WORK(w0, work0_func);
> > static DECLARE_WORK(w1, work1_func);
> > static DECLARE_WORK(w2, work2_func);
> >
> > /* work->func */
> > static void work0_func(struct work_struct *work)
> > {
> > pr_info("+%s begins to sleep\n", __func__);
> > /* sleep for 10s */
> > schedule_timeout_interruptible(msecs_to_jiffies(10000));
> > pr_info("+%s after sleep, begin to queue another work\n", __func__);
> > queue_work_on(1, wq0, &w1);
> > }
> >
> > /* work->func */
> > static void work1_func(struct work_struct *work)
> > {
> > pr_info("+%s scheduled\n", __func__);
> > }
> >
> > /* work->func */
> > static void work2_func(struct work_struct *work)
> > {
> > pr_info("+%s scheduled\n", __func__);
> > }
> >
> > static int destroy_init(void)
> > {
> > wq0 = alloc_workqueue("percpu_wq0", 0, MAX_ACTIVE_WORKS);
> > if (!wq0) {
> > pr_err("alloc_workqueue failed\n");
> > return -1;
> > }
> > queue_work_on(1, wq0, &w0);
> > pr_info("Begin to destroy wq0...\n");
> > destroy_workqueue(wq0);
> > pr_info("queue w2 to the wq0 after destroyed...\n");
> > queue_work_on(1, wq0, &w2);
>
> Hello, Richard.
>
> Nice spot.
>
> It is illegal to use a destroyed structure in the view of any API.
>
> A destroyed workqueue might be directly freed or kept for a while,
> which is up to the code of workqueue.c
>
> Before e2dca7adff8f(workqueue: make the workqueues list RCU walkable),
> the workqueue is directly totally freed when destroyed.
> After the said commit, the workqueue is held for an RCU grace before
> totally freed. And it is a per-cpu workqueue, and the base ref is
> never dropped on per-cpu pwqs, which means it is referencable and
> able to be queued items during the period by accident.
>
> Albeit it is illegal to use a destroyed workqueue, it is definitely bad
> for workqueue code not to complain noisily about the behavior, so I am
> going to set __WQ_DRAINING permanently for the destroyed workqueue, so
> the illegal usage of the destroyed workqueue can result WARN().
>
A WARN is definitely reasonable and has its benefits. Can I try to
submit the patch and you're nice to review as maintainer?

Thanks,
Richard
>
> Thank you for the report.
> Lai
>
> >
> > return 0;
> > }
> >
> > The output on my x86_64 box is:
> >
> > [344702.734480] +destroy_init+
> > [344702.734499] Begin to destroy wq0...
> > [344702.734516] +work0_func begins to sleep
> > [344712.791607] +work0_func after sleep, begin to queue another work
> > [344712.791620] +work1_func scheduled
> > [344712.791649] queue w2 to the wq0 after destroyed...
> > [344712.791663] +work2_func scheduled <------------- work 2 still be scheduled?