Re: Something is leaking RCU holds from interrupt context

From: Paul E. McKenney
Date: Sun Apr 04 2021 - 12:48:13 EST


On Sun, Apr 04, 2021 at 11:24:57AM +0100, Matthew Wilcox wrote:
> On Sat, Apr 03, 2021 at 09:15:17PM -0700, syzbot wrote:
> > HEAD commit: 2bb25b3a Merge tag 'mips-fixes_5.12_3' of git://git.kernel..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=1284cc31d00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=78ef1d159159890
> > dashboard link: https://syzkaller.appspot.com/bug?extid=dde0cc33951735441301
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+dde0cc33951735441301@xxxxxxxxxxxxxxxxxxxxxxxxx
> >
> > WARNING: suspicious RCU usage
> > 5.12.0-rc5-syzkaller #0 Not tainted
> > -----------------------------
> > kernel/sched/core.c:8294 Illegal context switch in RCU-bh read-side critical section!
> >
> > other info that might help us debug this:
> >
> >
> > rcu_scheduler_active = 2, debug_locks = 0
> > no locks held by systemd-udevd/4825.
>
> I think we have something that's taking the RCU read lock in
> (soft?) interrupt context and not releasing it properly in all
> situations. This thread doesn't have any locks recorded, but
> lock_is_held(&rcu_bh_lock_map) is true.
>
> Is there some debugging code that could find this? eg should
> lockdep_softirq_end() check that rcu_bh_lock_map is not held?
> (if it's taken in process context, then BHs can't run, so if it's
> held at softirq exit, then there's definitely a problem).

Something like the (untested) patch below?

Please note that it does not make sense to also check for
either rcu_lock_map or rcu_sched_lock_map because either of
these might be held by the interrupted code.
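
For illustration only, here is a userspace toy model of the kind of
leak this check is meant to catch. None of it is kernel code: the
model_*() helpers, leaky_handler(), run_softirq(), and the depth
counter are all made-up stand-ins for lockdep's tracking of
rcu_bh_lock_map, and the assumed bug (an early return that skips the
unlock) is only one guess at what might be going wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for lockdep's record of rcu_bh_lock_map holds. */
static int bh_read_depth;
/* Number of times the modeled softirq-exit check has fired. */
static int warnings;

static void model_rcu_read_lock_bh(void)
{
	bh_read_depth++;
}

static void model_rcu_read_unlock_bh(void)
{
	assert(bh_read_depth > 0);	/* unbalanced unlock is a bug, too */
	bh_read_depth--;
}

/* Hypothetical buggy handler: the error path skips the unlock. */
static void leaky_handler(bool error)
{
	model_rcu_read_lock_bh();
	if (error)
		return;			/* leak: section never closed */
	model_rcu_read_unlock_bh();
}

/* Models the check proposed for lockdep_softirq_end(). */
static void model_softirq_end(void)
{
	if (bh_read_depth != 0) {
		warnings++;
		fprintf(stderr, "RCU-bh still held at softirq exit (depth %d)\n",
			bh_read_depth);
	}
}

/* Runs one modeled softirq; returns how many warnings it produced. */
static int run_softirq(bool error)
{
	int before = warnings;

	leaky_handler(error);
	model_softirq_end();
	return warnings - before;
}
```

In this model, run_softirq(false) leaves the counter balanced and
stays quiet, while run_softirq(true) complains at softirq exit rather
than at some later context switch in the interrupted task, which is
the point of putting the check there.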

Thanx, Paul

------------------------------------------------------------------------

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 42f3f8c..e4ad0a6 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -504,6 +504,7 @@ static inline void lockdep_softirq_end(bool in_hardirq)
 {
 	lockdep_softirq_exit();

+	RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map), "RCU-bh held at softirq exit");
 	if (in_hardirq)
 		lockdep_hardirq_enter();
 }