Re: [PATCH rcu 3/3] srcu: Explain why callbacks invocations can't run concurrently

From: Joel Fernandes
Date: Wed Dec 13 2023 - 13:35:44 EST


On Wed, Dec 13, 2023 at 12:52 PM Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>
> On Wed, Dec 13, 2023 at 09:27:09AM -0500, Joel Fernandes wrote:
> > On Tue, Dec 12, 2023 at 12:48 PM Neeraj Upadhyay (AMD)
> > <neeraj.iitr10@xxxxxxxxx> wrote:
> > >
> > > From: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > >
> > > If an SRCU barrier is queued while callbacks are running and a new
> > > callbacks invocator for the same sdp were to run concurrently, the
> > > RCU barrier might execute too early. As this requirement is non-obvious,
> > > make sure to keep a record.
> > >
> > > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > > Reviewed-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
> > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > > Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@xxxxxxxxx>
> > > ---
> > > kernel/rcu/srcutree.c | 6 ++++++
> > > 1 file changed, 6 insertions(+)
> > >
> > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > > index 2bfc8ed1eed2..0351a4e83529 100644
> > > --- a/kernel/rcu/srcutree.c
> > > +++ b/kernel/rcu/srcutree.c
> > > @@ -1715,6 +1715,11 @@ static void srcu_invoke_callbacks(struct work_struct *work)
> > > WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL));
> > > rcu_segcblist_advance(&sdp->srcu_cblist,
> > > rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> > > + /*
> > > + * Although this function is theoretically re-entrant, concurrent
> > > + * callbacks invocation is disallowed to avoid executing an SRCU barrier
> > > + * too early.
> > > + */
> >
> > Side comment:
> > I guess even without the barrier reasoning, it is best not to allow
> > concurrent CB execution anyway since it diverges from the behavior of
> > straight RCU :)
>
> Good point!
>
> But please do not forget item 12 on the list in checklist.rst. ;-)
> (Which I just updated to include the other call_rcu*() functions.)

I think this applies more to recent kernels (with the dynamic nocb
switch) than to older ones, right? I haven't kept up with the
checklist recently (which is my bad).

My understanding came from the fact that, with straight RCU, the RCU
barrier depends on callbacks queued on the same CPU executing in order;
otherwise it breaks. Hence my comment. But as you pointed out, that's
outdated knowledge.
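
To make the ordering assumption concrete, here is a toy user-space
sketch, not kernel code. All names in it (struct cb, struct cb_queue,
barrier_demo() and friends) are made up for illustration. It assumes a
single invoker draining one queue strictly in FIFO order, so a barrier
callback queued last can only run after everything queued before it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-CPU callback list; not the kernel's. */
struct cb {
	struct cb *next;
	void (*func)(struct cb *cb);
};

struct cb_queue {
	struct cb *head;
	struct cb **tail;
};

static int invoked;              /* ordinary callbacks run so far */
static int seen_by_barrier = -1; /* value of 'invoked' when the barrier ran */

static void cb_queue_init(struct cb_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
}

/* Append at the tail so invocation order matches queueing order. */
static void cb_enqueue(struct cb_queue *q, struct cb *cb,
		       void (*func)(struct cb *cb))
{
	cb->next = NULL;
	cb->func = func;
	*q->tail = cb;
	q->tail = &cb->next;
}

static void ordinary_cb(struct cb *cb)
{
	(void)cb;
	invoked++;
}

/* The "barrier" callback just records how many callbacks ran before it. */
static void barrier_cb(struct cb *cb)
{
	(void)cb;
	seen_by_barrier = invoked;
}

/* Single invoker: detach the list, then run every callback in FIFO order. */
static void cb_invoke_all(struct cb_queue *q)
{
	struct cb *cb = q->head;

	cb_queue_init(q);
	while (cb) {
		struct cb *next = cb->next;

		cb->func(cb);
		cb = next;
	}
}

/* Queue n ordinary callbacks, then the barrier; return what the barrier saw. */
static int barrier_demo(int n)
{
	struct cb cbs[16], barrier;
	struct cb_queue q;
	int i;

	assert(n >= 0 && n <= 16);
	invoked = 0;
	seen_by_barrier = -1;
	cb_queue_init(&q);
	for (i = 0; i < n; i++)
		cb_enqueue(&q, &cbs[i], ordinary_cb);
	cb_enqueue(&q, &barrier, barrier_cb);
	cb_invoke_all(&q);
	return seen_by_barrier;
}
```

With one in-order invoker, barrier_demo(n) returns n: the barrier
callback always sees all n earlier callbacks already run. A second
invoker working on the same queue concurrently would remove that
guarantee in this toy model, which is the "too early" case.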

I should just shut up and hide in shame now.

:-/

- Joel