Re: failure to boot after dc6e0818bc9a "sched/cpuacct: Optimize away RCU read lock"

From: J. Bruce Fields
Date: Wed Mar 16 2022 - 19:27:08 EST


On Wed, Mar 16, 2022 at 09:48:06PM +0100, Peter Zijlstra wrote:
> On Wed, Mar 16, 2022 at 01:43:24PM -0400, J. Bruce Fields wrote:
> > One of my test VMs has been failing to boot linux-next recently. I
> > finally got around to a bisect this morning, and it landed on the below.
> >
> > What other information would be useful to debug this?
>
> A more recent -next should have this commit in it:

Ah, yep, it's booting again with today's -next. Thanks.

--b.

>
>
> commit f2aa197e4794bf4c2c0c9570684f86e6fa103e8b
> Author: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> Date: Sat Mar 5 11:41:03 2022 +0800
>
> cgroup: Fix suspicious rcu_dereference_check() usage warning
>
> task_css_set_check() will use rcu_dereference_check() to check for
> rcu_read_lock_held() on the read-side, which is not true after commit
> dc6e0818bc9a ("sched/cpuacct: Optimize away RCU read lock"). That
> commit dropped the explicit rcu_read_lock() in favor of an RCU-sched
> read-side critical section. So fix the RCU warning by adding a check
> for rcu_read_lock_sched_held().
>
> Fixes: dc6e0818bc9a ("sched/cpuacct: Optimize away RCU read lock")
> Reported-by: Linux Kernel Functional Testing <lkft@xxxxxxxxxx>
> Reported-by: syzbot+16e3f2c77e7c5a0113f9@xxxxxxxxxxxxxxxxxxxxxxxxx
> Signed-off-by: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> Acked-by: Tejun Heo <tj@xxxxxxxxxx>
> Tested-by: Zhouyi Zhou <zhouzhouyi@xxxxxxxxx>
> Tested-by: Marek Szyprowski <m.szyprowski@xxxxxxxxxxx>
> Link: https://lore.kernel.org/r/20220305034103.57123-1-zhouchengming@xxxxxxxxxxxxx
>
> diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h
> index 1e356c222756..0d1ada8968d7 100644
> --- a/include/linux/cgroup.h
> +++ b/include/linux/cgroup.h
> @@ -450,6 +450,7 @@ extern struct mutex cgroup_mutex;
>  extern spinlock_t css_set_lock;
>  #define task_css_set_check(task, __c)                                  \
>         rcu_dereference_check((task)->cgroups,                          \
> +               rcu_read_lock_sched_held() ||                           \
>                 lockdep_is_held(&cgroup_mutex) ||                       \
>                 lockdep_is_held(&css_set_lock) ||                       \
>                 ((task)->flags & PF_EXITING) || (__c))
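
For readers following the quoted patch, here is a rough userspace sketch of why the warning fired and why the extra condition silences it. Nothing below is kernel code: every model_* name and count_warnings() are hypothetical stand-ins, and the sketch assumes the cpuacct path runs with preemption disabled (which is what an RCU-sched read-side critical section amounts to) but without rcu_read_lock().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins for kernel state; not kernel API. */
static int rcu_nesting;      /* models rcu_read_lock() nesting depth */
static int preempt_depth;    /* models preempt_disable() nesting depth */
static int warnings;         /* number of lockdep-style splats so far */

static void model_preempt_disable(void) { preempt_depth++; }

static void model_preempt_enable(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

static bool model_rcu_read_lock_held(void)
{
	return rcu_nesting > 0;
}

/* Assumption: "RCU-sched held" is modelled as "preemption disabled". */
static bool model_rcu_read_lock_sched_held(void)
{
	return preempt_depth > 0;
}

/*
 * Models rcu_dereference_check(p, c): the access is accepted if we are
 * inside an rcu_read_lock() section or the extra condition c is true.
 */
static void *model_deref_check(void *p, bool c)
{
	if (!model_rcu_read_lock_held() && !c) {
		warnings++;
		fprintf(stderr, "suspicious rcu_dereference_check() usage\n");
	}
	return p;
}

/*
 * Simulates the post-dc6e0818bc9a call path: preemption disabled, no
 * rcu_read_lock(). with_fix selects whether the check condition also
 * includes the RCU-sched test, as in the quoted patch.
 */
static int count_warnings(bool with_fix)
{
	int dummy = 0;
	bool c;

	warnings = 0;
	model_preempt_disable();
	c = with_fix && model_rcu_read_lock_sched_held();
	(void)model_deref_check(&dummy, c);
	model_preempt_enable();
	return warnings;
}
```

Under these assumptions, count_warnings(false) reports one warning (the old condition list) and count_warnings(true) reports none (the patched one). The real macro also accepts cgroup_mutex, css_set_lock and PF_EXITING, which the sketch leaves out.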