Re: One potential issue with concurrent execution of RCU callbacks...

From: Paul E. McKenney
Date: Tue Dec 08 2020 - 19:04:26 EST


On Tue, Dec 08, 2020 at 11:04:38PM +0100, Frederic Weisbecker wrote:
> On Tue, Dec 08, 2020 at 10:24:09AM -0800, Paul E. McKenney wrote:
> > > It reduces the code scope running with BH disabled.
> > > Also narrowing down helps to understand what it actually protects.
> >
> > I thought that you would call out unnecessarily delaying other softirq
> > handlers. ;-)
> >
> > But if such delays are a problem (and they might well be), then
> > avoiding them on non-rcu_nocb CPUs would instead/also require adding
> > a check for other pending softirqs to the existing early-exit checks
> > involving time, need_resched, and idle. At which point, entering and
> > exiting BH-disabled again doesn't help, other than your point about the
> > difference in BH-disabled scopes on rcu_nocb and non-rcu_nocb CPUs.
>
> Wise observation!
>
> > Would it make sense to exit rcu_do_batch() if more than some amount
> > of time had elapsed and there was some non-RCU softirq pending?
> >
> > My guess is that the current tlimit checks in rcu_do_batch() make this
> > unnecessary.
>
> Right and nobody has complained about it so far.

If they did, my thought would be to add another early-exit check,
but under the tlimit check, so that pending non-RCU softirqs might
set a shorter time limit. For example, instead of allowing up to the
current one second in rcu_do_batch(), allow only up to 100 milliseconds
or whatever. But there are lots of choices, which is one reason to wait
until it becomes a problem.
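
To make that concrete, here is a minimal sketch of such a check, written
as standalone userspace C rather than kernel code. Every name in it (the
two limit macros, SOFTIRQ_RCU_BIT, other_softirq_pending(), and
batch_should_exit()) is hypothetical and made up for illustration, the
pending-softirq state is modeled as a plain bitmask argument, and the
two limits are just the example values above, not a proposal.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical limits in milliseconds, taken from the example values above. */
#define BATCH_LIMIT_MS       1000 /* normal time limit for one batch */
#define BATCH_LIMIT_BUSY_MS   100 /* shorter limit when other softirqs wait */

/* Hypothetical bit standing in for RCU's own softirq in the pending mask. */
#define SOFTIRQ_RCU_BIT (1u << 9)

/* True if any softirq other than RCU's is marked pending in the mask. */
static inline bool other_softirq_pending(uint32_t pending)
{
	return (pending & ~SOFTIRQ_RCU_BIT) != 0;
}

/*
 * Early-exit decision for a callback-invocation loop: pick the shorter
 * limit when a non-RCU softirq is pending, then compare elapsed time
 * against whichever limit applies.
 */
static inline bool batch_should_exit(uint64_t start_ms, uint64_t now_ms,
				     uint32_t pending)
{
	uint64_t limit = other_softirq_pending(pending) ?
			 BATCH_LIMIT_BUSY_MS : BATCH_LIMIT_MS;

	return now_ms - start_ms >= limit;
}
```

With these values, a batch that has run for 150 milliseconds keeps going
if only RCU's own softirq is pending, but stops if anything else is.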

> But I should add a comment explaining the reason for the BH-disabled
> section in my series.

That sounds like a most excellent idea, please do!

Thanx, Paul