Re: [PATCH] workqueue: Don't double assign worker->sleeping

From: Lai Jiangshan
Date: Tue Mar 31 2020 - 23:44:32 EST


On Wed, Apr 1, 2020 at 11:22 AM Lai Jiangshan <jiangshanlai@xxxxxxxxx> wrote:
>
> Hello
>
> If I don't miss all the issues you have listed, it is a good and straightforward
> fix, but I have concern that cmpxchg_local() might have performance impact
> on some non-x86 arch.
>
> The two issues as you have listed:
> 1) WARN_ON_ONCE() on valid condition when interrupted(async-page-faulted)
> 2) wq_worker_running() can be interrupted(async-page-faulted in virtual machine)
> and nr_running would be decreased twice.

would be *increased* twice

I just saw the V2 patch, this issue is not listed, but need to be fixed too.

>
> For fixing issue one, we can just remove WARN_ON_ONCE() as this patch.
> For fixing issue two, ->sleeping in wq_worker_running() can be checked&modified
> under irq-disabled. (we can't use preempt-disabled context here)
>
> thanks,
> Lai
>
> On Sat, Mar 28, 2020 at 1:53 AM Sebastian Andrzej Siewior
> <bigeasy@xxxxxxxxxxxxx> wrote:
> >
> > The kernel test robot triggered a warning with the following race:
> > task-ctx interrupt-ctx
> > worker
> > -> process_one_work()
> > -> work_item()
> > -> schedule();
> > -> sched_submit_work()
> > -> wq_worker_sleeping()
> > -> ->sleeping = 1
> > atomic_dec_and_test(nr_running)
> > __schedule(); *interrupt*
> > async_page_fault()
> > -> local_irq_enable();
> > -> schedule();
> > -> sched_submit_work()
> > -> wq_worker_sleeping()
> > -> if (WARN_ON(->sleeping)) return
> > -> __schedule()
> > -> sched_update_worker()
> > -> wq_worker_running()
> > -> atomic_inc(nr_running);
> > -> ->sleeping = 0;
> >
> > -> sched_update_worker()
> > -> wq_worker_running()
> > if (!->sleeping) return
> >
> > In this context the warning is pointless everything is fine.
> >
> > However, if the interrupt occurs in wq_worker_sleeping() between reading and
> > setting `sleeping' i.e.
> >
> > | if (WARN_ON_ONCE(worker->sleeping))
> > | return;
> > *interrupt*
> > | worker->sleeping = 1;
> >
> > then pool->nr_running will be decremented twice in wq_worker_sleeping()
> > but it will be incremented only once in wq_worker_running().
> >
> > Replace the assignment of `sleeping' with a cmpxchg_local() to ensure
> > that there is no double assignment of the variable. The variable is only
> > accessed from the local CPU. Remove the WARN statement because this
> > condition can be valid.
> >
> > An alternative would be to move `->sleeping' to `->flags' as a new bit
> > but this would require to acquire the pool->lock in wq_worker_running().
> >
> > Fixes: 6d25be5782e48 ("sched/core, workqueues: Distangle worker accounting from rq lock")
> > Link: 20200327074308.GY11705@shao2-debian">https://lkml.kernel.org/r/20200327074308.GY11705@shao2-debian
> > Reported-by: kernel test robot <lkp@xxxxxxxxx>
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> > ---
> > kernel/workqueue.c | 6 ++----
> > 1 file changed, 2 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> > index 4e01c448b4b48..dc477a2a3ce30 100644
> > --- a/kernel/workqueue.c
> > +++ b/kernel/workqueue.c
> > @@ -846,11 +846,10 @@ void wq_worker_running(struct task_struct *task)
> > {
> > struct worker *worker = kthread_data(task);
> >
> > - if (!worker->sleeping)
> > + if (cmpxchg_local(&worker->sleeping, 1, 0) == 0)
> > return;
> > if (!(worker->flags & WORKER_NOT_RUNNING))
> > atomic_inc(&worker->pool->nr_running);
> > - worker->sleeping = 0;
> > }
> >
> > /**
> > @@ -875,10 +874,9 @@ void wq_worker_sleeping(struct task_struct *task)
> >
> > pool = worker->pool;
> >
> > - if (WARN_ON_ONCE(worker->sleeping))
> > + if (cmpxchg_local(&worker->sleeping, 0, 1) == 1)
> > return;
> >
> > - worker->sleeping = 1;
> > spin_lock_irq(&pool->lock);
> >
> > /*
> > --
> > 2.26.0
> >