Re: [PATCH v2 3/6] rcu-tasks: Initialize data to eliminate RCU-tasks/do_exit() deadlocks

From: Frederic Weisbecker
Date: Fri Feb 23 2024 - 06:41:32 EST


On Thu, Feb 22, 2024 at 12:41:55PM -0800, Paul E. McKenney wrote:
> On Thu, Feb 22, 2024 at 05:21:03PM +0100, Frederic Weisbecker wrote:
> > On Fri, Feb 16, 2024 at 05:27:38PM -0800, Boqun Feng wrote:
> > > From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>
> > >
> > > Holding a mutex across synchronize_rcu_tasks() and acquiring
> > > that same mutex in code called from do_exit() after its call to
> > > exit_tasks_rcu_start() but before its call to exit_tasks_rcu_stop()
> > > results in deadlock. This is by design, because tasks that are far
> > > enough into do_exit() are no longer present on the tasks list, making
> > > it a bit difficult for RCU Tasks to find them, let alone wait on them
> > > to do a voluntary context switch. However, such deadlocks are becoming
> > > more frequent. In addition, lockdep currently does not detect such
> > > deadlocks and they can be difficult to reproduce.
> > >
> > > In addition, if a task voluntarily context switches during that time
> > > (for example, if it blocks acquiring a mutex), then this task is in an
> > > RCU Tasks quiescent state. And with some adjustments, RCU Tasks could
> > > just as well take advantage of that fact.
> > >
> > > This commit therefore initializes the data structures that will be needed
> > > to rely on these quiescent states and to eliminate these deadlocks.
> > >
> > > Link: https://lore.kernel.org/all/20240118021842.290665-1-chenzhongjin@xxxxxxxxxx/
> > >
> > > Reported-by: Chen Zhongjin <chenzhongjin@xxxxxxxxxx>
> > > Reported-by: Yang Jihong <yangjihong1@xxxxxxxxxx>
> > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > > Tested-by: Yang Jihong <yangjihong1@xxxxxxxxxx>
> > > Tested-by: Chen Zhongjin <chenzhongjin@xxxxxxxxxx>
> > > Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>

Looks good, thanks!

Reviewed-by: Frederic Weisbecker <frederic@xxxxxxxxxx>