failure to boot after dc6e0818bc9a "sched/cpuacct: Optimize away RCU read lock"

From: J. Bruce Fields
Date: Wed Mar 16 2022 - 13:43:31 EST


One of my test VMs has been failing to boot linux-next recently. I
finally got around to a bisect this morning, and it landed on the below.

What other information would be useful to debug this?

--b.

commit dc6e0818bc9a
Author: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
Date: Sun Feb 20 13:14:25 2022 +0800

sched/cpuacct: Optimize away RCU read lock

Since cpuacct_charge() is called from the scheduler update_curr(),
we must already have rq lock held, then the RCU read lock can
be optimized away.

And do the same thing in it's wrapper cgroup_account_cputime(),
but we can't use lockdep_assert_rq_held() there, which defined
in kernel/sched/sched.h.

Suggested-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Link: https://lore.kernel.org/r/20220220051426.5274-2-zhouchengming@xxxxxxxxxxxxx

diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
index 75c151413fda..9a109c6ac0e0 100644
--- a/include/linux/cgroup.h
+++ b/include/linux/cgroup.h
@@ -791,11 +791,9 @@ static inline void cgroup_account_cputime(struct task_struct *task,

cpuacct_charge(task, delta_exec);

- rcu_read_lock();
cgrp = task_dfl_cgroup(task);
if (cgroup_parent(cgrp))
__cgroup_account_cputime(cgrp, delta_exec);
- rcu_read_unlock();
}

static inline void cgroup_account_cputime_field(struct task_struct *task,
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 307800586ac8..f79f88456d72 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -337,12 +337,10 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime)
unsigned int cpu = task_cpu(tsk);
struct cpuacct *ca;

- rcu_read_lock();
+ lockdep_assert_rq_held(cpu_rq(cpu));

for (ca = task_ca(tsk); ca; ca = parent_ca(ca))
*per_cpu_ptr(ca->cpuusage, cpu) += cputime;
-
- rcu_read_unlock();
}

/*