Re: [PATCH 03/11] rcu/nocb: Invoke rcu_core() at the start of deoffloading

From: Frederic Weisbecker
Date: Mon Oct 04 2021 - 08:41:46 EST


On Fri, Oct 01, 2021 at 06:50:04PM +0100, Valentin Schneider wrote:
> On 30/09/21 00:10, Frederic Weisbecker wrote:
> > On PREEMPT_RT, if rcu_core() is preempted by the de-offloading process,
> > some work, such as callbacks acceleration and invocation, may be left
> > unattended due to the volatile checks on the offloaded state.
> >
> > In the worst case this work is postponed until the next rcu_pending()
> > check that can take a jiffy to reach, which can be a problem in case
> > of callbacks flooding.
> >
> > Solve that with invoking rcu_core() early in the de-offloading process.
> > This way any work dismissed by an ongoing rcu_core() call fooled by
> > a preempting deoffloading process will be caught up by a nearby future
> > recall to rcu_core(), this time fully aware of the de-offloading state.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Cc: Valentin Schneider <valentin.schneider@xxxxxxx>
> > Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Cc: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
> > Cc: Josh Triplett <josh@xxxxxxxxxxxxxxxx>
> > Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> > Cc: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
> > Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
> > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> One comment/question below.
>
> > @@ -990,6 +990,15 @@ static long rcu_nocb_rdp_deoffload(void *arg)
> > * will refuse to put anything into the bypass.
> > */
> > WARN_ON_ONCE(!rcu_nocb_flush_bypass(rdp, NULL, jiffies));
> > + /*
> > + * Start with invoking rcu_core() early. This way if the current thread
> > + * happens to preempt an ongoing call to rcu_core() in the middle,
> > + * leaving some work dismissed because rcu_core() still thinks the rdp is
> > + * completely offloaded, we are guaranteed a nearby future instance of
> > + * rcu_core() to catch up.
> > + */
> > + rcu_segcblist_set_flags(cblist, SEGCBLIST_RCU_CORE);
> > + invoke_rcu_core();
>
> I think your approach is a bit neater, but would there have been any issue
> with keeping the setting of SEGCBLIST_RCU_CORE within
> rcu_segcblist_offload() and bundling it with an invoke_rcu_core()?

Probably not in practice.

But in theory, it may be more comfortable to read the following in order:

1) Set SEGCBLIST_RCU_CORE so subsequent invocations of rcu_core() handle
callbacks

2) Invoke rcu_core()

3) Only once we have achieved the above can we clear SEGCBLIST_OFFLOADED, which
will stop the nocb kthreads.

If we did 3) first and only then 1) and 2), there would be a risk that callbacks
get completely ignored in the window between them, when neither rcu_core() nor
the nocb kthreads handle them.

That said, you have a point in that we could do:

1) Set SEGCBLIST_RCU_CORE and clear SEGCBLIST_OFFLOADED at the _very_ same time
(arrange that with a WRITE_ONCE() I guess).

2) Invoke rcu_core()

But well... arranging for rcu_core() to take over before we even consider
starting the de-offloading process provides some unexplainable relief to the
soul. Code design sometimes relies more on faith than on logic :)
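
To make the three orderings above concrete, here is a small userspace sketch,
not kernel code. The flag names come from the patch, but the bit values, the
enum and both helpers are made up for illustration. It assumes callbacks are
handled as long as at least one of the two flags is set, and counts the
intermediate states in which neither is:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only, not the kernel's actual definitions. */
#define SEGCBLIST_RCU_CORE  (1u << 0)
#define SEGCBLIST_OFFLOADED (1u << 1)

/* Hypothetical names for the three orderings discussed in this thread. */
enum deoffload_order { CORE_FIRST, OFFLOADED_FIRST, BOTH_AT_ONCE };

/*
 * Simplifying assumption: callbacks are handled if rcu_core() is in
 * charge (RCU_CORE set) or the nocb kthreads are (OFFLOADED set).
 */
static bool callbacks_covered(unsigned int flags)
{
	return (flags & (SEGCBLIST_RCU_CORE | SEGCBLIST_OFFLOADED)) != 0;
}

/* Count the states reached during de-offloading where nobody is in charge. */
static int uncovered_states(enum deoffload_order order)
{
	unsigned int flags = SEGCBLIST_OFFLOADED; /* start fully offloaded */
	int gaps = 0;

	assert(callbacks_covered(flags));

	switch (order) {
	case CORE_FIRST:
		/* Let rcu_core() take over, then stop the nocb kthreads. */
		flags |= SEGCBLIST_RCU_CORE;
		gaps += !callbacks_covered(flags);
		flags &= ~SEGCBLIST_OFFLOADED;
		gaps += !callbacks_covered(flags);
		break;
	case OFFLOADED_FIRST:
		/* Stop the nocb kthreads first: nobody is in charge here. */
		flags &= ~SEGCBLIST_OFFLOADED;
		gaps += !callbacks_covered(flags);
		flags |= SEGCBLIST_RCU_CORE;
		gaps += !callbacks_covered(flags);
		break;
	case BOTH_AT_ONCE:
		/* Flip both bits in a single store, so no state in between. */
		flags = (flags | SEGCBLIST_RCU_CORE) & ~SEGCBLIST_OFFLOADED;
		gaps += !callbacks_covered(flags);
		break;
	}
	return gaps;
}
```

Under that assumption, only OFFLOADED_FIRST ever reports an uncovered state.
CORE_FIRST and BOTH_AT_ONCE come out the same, which matches the "probably not
in practice" above. The sketch says nothing about preemption or memory
ordering.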

Thanks.