Re: WARN_ON_ONCE() in process_one_work()?

From: Paul E. McKenney
Date: Tue Jun 13 2017 - 18:31:13 EST


On Tue, Jun 13, 2017 at 04:58:37PM -0400, Tejun Heo wrote:
> Hello, Paul.
>
> On Fri, May 05, 2017 at 10:11:59AM -0700, Paul E. McKenney wrote:
> > Just following up... I have hit this bug a couple of times over the
> > past few days. Anything I can do to help?
>
> My apologies for dropping the ball on this. I've gone over the hotplug
> code in workqueue several times but can't really find how this
> would happen. Can you please apply the following patch and see what
> it says when the problem happens?

I have fired it up, thank you!

Last time I saw one failure in 21 hours of test runs, so I have kicked
off 42 one-hour test runs. Will see what happens tomorrow morning,
Pacific Time.

Thanx, Paul

> Thanks.
>
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index c74bf39ef764..bd2ce3cbfb41 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1691,13 +1691,20 @@ static struct worker *alloc_worker(int node)
>  static void worker_attach_to_pool(struct worker *worker,
>  				   struct worker_pool *pool)
>  {
> +	int ret;
> +
>  	mutex_lock(&pool->attach_mutex);
>  
>  	/*
>  	 * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>  	 * online CPUs. It'll be re-applied when any of the CPUs come up.
>  	 */
> -	set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +	ret = set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> +
> +	WARN(ret && !(pool->flags & POOL_DISASSOCIATED),
> +	     "set_cpus_allowed_ptr failed, ret=%d pool->cpu/flags=%d/0x%x cpumask=%*pbl online=%*pbl active=%*pbl\n",
> +	     ret, pool->cpu, pool->flags, cpumask_pr_args(pool->attrs->cpumask),
> +	     cpumask_pr_args(cpu_online_mask), cpumask_pr_args(cpu_active_mask));
>  
>  	/*
>  	 * The pool->attach_mutex ensures %POOL_DISASSOCIATED remains
> @@ -2037,8 +2044,11 @@ __acquires(&pool->lock)
>  	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
>  #endif
>  	/* ensure we're on the correct CPU */
> -	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
> -		     raw_smp_processor_id() != pool->cpu);
> +	if (WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
> +			 raw_smp_processor_id() != pool->cpu))
> +		printk_once("XXX workfn=%pf pool->cpu/flags=%d/0x%x curcpu=%d online=%*pbl active=%*pbl\n",
> +			    work->func, pool->cpu, pool->flags, raw_smp_processor_id(),
> +			    cpumask_pr_args(cpu_online_mask), cpumask_pr_args(cpu_active_mask));
>  
>  	/*
>  	 * A single work shouldn't be executed concurrently by
>