Re: [PATCH] rcu: Make call_rcu() lazy only when CONFIG_RCU_LAZY is enabled

From: Paul E. McKenney
Date: Thu Oct 20 2022 - 14:39:13 EST


On Thu, Oct 20, 2022 at 04:42:05AM -0400, Joel Fernandes wrote:
> > On Oct 19, 2022, at 7:34 PM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > On Wed, Oct 19, 2022 at 02:25:29PM -0400, Joel Fernandes wrote:
> >>
> >>
> >>>> On Oct 19, 2022, at 1:45 PM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >>>
> >>> On Wed, Oct 19, 2022 at 08:12:30AM -0400, Joel Fernandes wrote:
> >>>>> On Oct 19, 2022, at 8:10 AM, Joel Fernandes <joel@xxxxxxxxxxxxxxxxx> wrote:
> >>>>>>> On Oct 19, 2022, at 6:34 AM, Zqiang <qiang1.zhang@xxxxxxxxx> wrote:
> >>>>>>>
> >>>>>>> Currently, call_rcu() is always lazy regardless of whether
> >>>>>>> CONFIG_RCU_LAZY is enabled. This also means that when
> >>>>>>> CONFIG_RCU_LAZY is disabled, call_rcu_flush() is lazy as well.
> >>>>>>> Therefore, this commit makes call_rcu() lazy only when
> >>>>>>> CONFIG_RCU_LAZY is enabled.
> >>>>
> >>>> First, good eyes! Thank you for spotting this!!
>
> Indeed.
>
> >>>>>>> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
> >>>>>>> ---
> >>>>>>> kernel/rcu/tree.c | 8 +++++++-
> >>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
> >>>>>>>
> >>>>>>> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> >>>>>>> index abc615808b6e..97ef602da3d5 100644
> >>>>>>> --- a/kernel/rcu/tree.c
> >>>>>>> +++ b/kernel/rcu/tree.c
> >>>>>>> @@ -2839,7 +2839,6 @@ void call_rcu_flush(struct rcu_head *head, rcu_callback_t func)
> >>>>>>> return __call_rcu_common(head, func, false);
> >>>>>>> }
> >>>>>>> EXPORT_SYMBOL_GPL(call_rcu_flush);
> >>>>>>> -#endif
> >>>>>>>
> >>>>>>> /**
> >>>>>>> * call_rcu() - Queue an RCU callback for invocation after a grace period.
> >>>>>>> @@ -2890,6 +2889,13 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func)
> >>>>>>> return __call_rcu_common(head, func, true);
> >>>>>>> }
> >>>>>>> EXPORT_SYMBOL_GPL(call_rcu);
> >>>>>>> +#else
> >>>>>>> +void call_rcu(struct rcu_head *head, rcu_callback_t func)
> >>>>>>> +{
> >>>>>>> + return __call_rcu_common(head, func, false);
> >>>>>
> >>>>> Thanks. Instead of adding a new function, you can also pass IS_ENABLED(CONFIG…) to the existing function of the same name.
> >>>
> >>> I do like this approach better -- less code, more obvious what is going on.
> >>
> >> Sounds good. Zqiang, do you mind updating your patch along these lines? That way you get the proper attribution.
>
> Acked that patch.
>
> >> More comments below:
> >>>
> >>>>> It looks like I made everyone test the patch without having to enable the config option, though ;-). Hey, I’m a glass-half-full kind of guy, why do you ask?
> >>>>>
> >>>>> Paul, I’ll take a closer look once I’m at the desk, but would you prefer to squash a diff into the existing patch, or want a new patch altogether?
> >>>>
> >>>> On the other hand, what I’d want is to nuke the config option altogether or make it default y, since we want to catch issues sooner rather than later.
> >>>
> >>> That might be what we do at some point, but one thing at a time. Let's
> >>> not penalize innocent bystanders, at least not just yet.
> >>
> >> It’s a trade-off; I thought that’s why we wanted to have the binary search stuff. If no one reports issues on linux-next, then that code won’t be put to use, at least in the near future.
> >
> > Well, not to put too fine a point on it, but we currently really are
> > exposing -next to lazy call_rcu(). ;-)
>
> This is true. I think I assumed nobody would enable a default-off config option, but I probably meant that a smaller percentage would.
>
> >>> I do very strongly encourage the ChromeOS and Android folks to test this
> >>> very severely, however.
> >>
> >> Agreed. Yes, that will happen, though I have to make a note for Android folks other than Vlad to backport these (and enable the config option) carefully! Especially on pre-5.15 kernels. Luckily I had to do this (not so trivial) exercise myself.
> >
> > And this is another situation in which the binary search stuff may prove
> > extremely useful.
>
> Agreed. Thanks. At the very least, I owe that code per-rdp splitting of the hashtable. Steven and I discussed today that the hashtable can probably go into the rcu_segcblist itself and be protected by the nocb lock.

I have to ask...

How does this fit in with CPU-hotplug and callback migration?

More to the point, what events would cause us to decide that this is
required? For example, shouldn't we give your current binary-search
code at least a few chances to save the day?

Thanx, Paul

> >>>>>> +}
> >>>>>> +EXPORT_SYMBOL_GPL(call_rcu);
> >>>>>> +#endif
> >>>>>>
> >>>>>> /* Maximum number of jiffies to wait before draining a batch. */
> >>>>>> #define KFREE_DRAIN_JIFFIES (5 * HZ)
> >>>>>> --
> >>>>>> 2.25.1
> >>>>>>
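
For anyone following along, the IS_ENABLED() suggestion quoted above might look roughly like the userspace sketch below. This is not the patch that was applied. The struct rcu_head layout, call_rcu_common_mock(), last_call_was_lazy, demo_noop_cb() and MOCK_IS_ENABLED() are hypothetical stand-ins so that the example builds outside the kernel; MOCK_IS_ENABLED() only imitates the built-in case of the kernel's IS_ENABLED(). The point is that a single call_rcu() definition can pass a compile-time constant as the lazy argument, so no #else copy of the function is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types (assumed for illustration). */
struct rcu_head {
	struct rcu_head *next;
	void (*func)(struct rcu_head *head);
};
typedef void (*rcu_callback_t)(struct rcu_head *head);

/*
 * Simplified imitation of IS_ENABLED(): evaluates to 1 when the option
 * is #defined to 1, and to 0 when it is not defined at all.
 */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_INNER(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED(val) IS_DEFINED_INNER(ARG_PLACEHOLDER_##val)
#define MOCK_IS_ENABLED(option) IS_DEFINED(option)

/* Records the laziness of the most recent enqueue, for demonstration. */
static bool last_call_was_lazy;

/* Hypothetical stand-in for __call_rcu_common(): records its arguments. */
static void call_rcu_common_mock(struct rcu_head *head, rcu_callback_t func,
				 bool lazy)
{
	assert(head != NULL && func != NULL);
	head->func = func;
	head->next = NULL;
	last_call_was_lazy = lazy;
}

/* Never lazy, whatever the configuration says. */
void call_rcu_flush(struct rcu_head *head, rcu_callback_t func)
{
	call_rcu_common_mock(head, func, false);
}

/* One definition: lazy only if CONFIG_RCU_LAZY is defined to 1. */
void call_rcu(struct rcu_head *head, rcu_callback_t func)
{
	call_rcu_common_mock(head, func, MOCK_IS_ENABLED(CONFIG_RCU_LAZY));
}

/* Trivial callback so that the functions above can be exercised. */
void demo_noop_cb(struct rcu_head *head)
{
	(void)head;
}
```

Building this with -DCONFIG_RCU_LAZY=1 makes call_rcu() pass true; building without it makes call_rcu() pass false, while call_rcu_flush() passes false either way.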