Re: WARN_ON_ONCE() in process_one_work()?

From: Tejun Heo
Date: Sat Jun 17 2017 - 07:54:05 EST


Hello,

On Fri, Jun 16, 2017 at 10:36:58AM -0700, Paul E. McKenney wrote:
> And no test failures from yesterday evening. So it looks like we get
> somewhere on the order of one failure per 138 hours of TREE07 rcutorture
> runtime with your printk() in the mix.
>
> Was the above output from your printk() output of any help?

Yeah, if my suspicion is correct, it'd require new kworker creation
racing against CPU offline, which would explain why it's so difficult
to repro. Can you please see whether the following patch resolves the
issue?

Thanks.

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 803c3bc274c4..1500217ce4b4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -980,8 +980,13 @@ struct migration_arg {
 static struct rq *__migrate_task(struct rq *rq, struct rq_flags *rf,
 				 struct task_struct *p, int dest_cpu)
 {
-	if (unlikely(!cpu_active(dest_cpu)))
-		return rq;
+	if (p->flags & PF_KTHREAD) {
+		if (unlikely(!cpu_online(dest_cpu)))
+			return rq;
+	} else {
+		if (unlikely(!cpu_active(dest_cpu)))
+			return rq;
+	}
 
 	/* Affinity changed (again). */
 	if (!cpumask_test_cpu(dest_cpu, &p->cpus_allowed))
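
For readers following along, the first check in the patched __migrate_task() can be modelled in plain userspace C. This is only a rough sketch, not kernel code: struct cpu_state, struct task and passes_first_check() are hypothetical names made up for illustration, and PF_KTHREAD is just a stand-in bit here. The assumption behind the model is that during hotplug a CPU can be online without being active, and that this is the window the suspected race falls into.

```c
#include <stdbool.h>

/* Stand-in flag bit for this model; only its presence or absence matters. */
#define PF_KTHREAD 0x00200000u

/* Hypothetical model of the two per-CPU states the patch distinguishes. */
struct cpu_state {
	bool online;	/* models cpu_online(dest_cpu) */
	bool active;	/* models cpu_active(dest_cpu) */
};

/* Hypothetical model of the one task field the check reads. */
struct task {
	unsigned int flags;
};

/*
 * Models only the first test in the patched __migrate_task():
 * kernel threads are refused when the destination is not online,
 * all other tasks are refused when it is not active.
 * Returns true if the migration would continue past that test.
 */
static bool passes_first_check(const struct task *p,
			       const struct cpu_state *dest)
{
	if (p->flags & PF_KTHREAD)
		return dest->online;
	return dest->active;
}
```

Under this model, a task with PF_KTHREAD set gets past the check on a CPU that is online but not active, while any other task is still turned away, which is the behavioural difference the patch introduces. The later cpus_allowed test is not modelled.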