Re: [PATCH v2] rcu/nocb: Avoid polling when myrdp->nocb_head_rdp list is empty

From: Frederic Weisbecker
Date: Fri Jun 10 2022 - 08:05:54 EST


On Tue, Jun 07, 2022 at 03:50:57PM +0800, Zqiang wrote:
> Currently, If the 'rcu_nocb_poll' bootargs is enable, all rcuog kthreads
> enter polling mode. however, due to only insert CPU's rdp which belong to
> rcu_nocb_mask to 'nocb_head_rdp' list or all CPU's rdp served by rcuog
> kthread have been de-offloaded, these cause the 'nocb_head_rdp' list
> served by rcuog kthread is empty, when the 'nocb_head_rdp' is empty,
> the rcuog kthread in polling mode not actually do anything. fix it by
> exiting polling mode when the 'nocb_head_rdp'list is empty, otherwise
> entering polling mode.
>
> Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
> ---
> v1->v2:
> Move rcu_nocb_poll flags check from rdp_offload_toggle() to
> rcu_nocb_rdp_offload/deoffload(), avoid unnecessary setting of
> rdp_gp->nocb_gp_sleep flags, because when rcu_nocb_poll is set
> the rdp_gp->nocb_gp_sleep is not used.
>
> kernel/rcu/tree_nocb.h | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index fa8e4f82e60c..2a52c9abc681 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -698,10 +698,14 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
> TPS("WakeBypassIsDeferred"));
> }
> if (rcu_nocb_poll) {
> - /* Polling, so trace if first poll in the series. */
> - if (gotcbs)
> - trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("Poll"));
> - schedule_timeout_idle(1);
> + if (list_empty(&my_rdp->nocb_head_rdp)) {
> + rcu_wait(READ_ONCE(my_rdp->nocb_toggling_rdp));

I suspect you have based your patch on upstream tree which doesn't seem
to have this one yet:

"rcu/nocb: Add/del rdp to iterate from rcuog itself"

With this patch you can't wait on changes to my_rdp->nocb_toggling_rdp because
nocb_gp_wait() now performs the list_add/list_del itself.

Please rebase your patch on top of latest rcu:dev from Paul's tree. Then
all you need to do is to change the wait side, something like this (untested):

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index fa8e4f82e60c..f36d6be4f372 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -584,6 +584,15 @@ static int nocb_gp_toggle_rdp(struct rcu_data *rdp,
return ret;
}

+static void nocb_gp_sleep(struct rdp *my_rdp, int cpu)
+{
+ trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("Sleep"));
+ swait_event_interruptible_exclusive(my_rdp->nocb_gp_wq,
+ !READ_ONCE(my_rdp->nocb_gp_sleep));
+ trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("EndSleep"));
+}
+
+
/*
* No-CBs GP kthreads come here to wait for additional callbacks to show up
* or for grace periods to end.
@@ -701,13 +710,19 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
/* Polling, so trace if first poll in the series. */
if (gotcbs)
trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("Poll"));
- schedule_timeout_idle(1);
+ if (list_empty(&my_rdp->nocb_head_rdp)) {
+ raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
+ if (!my_rdp->nocb_toggling_rdp)
+ WRITE_ONCE(my_rdp->nocb_gp_sleep, true);
+ raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
+ /* Wait for any offloading rdp */
+ rdp_gp_sleep(my_rdp, cpu);
+ } else {
+ schedule_timeout_idle(1);
+ }
} else if (!needwait_gp) {
/* Wait for callbacks to appear. */
- trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("Sleep"));
- swait_event_interruptible_exclusive(my_rdp->nocb_gp_wq,
- !READ_ONCE(my_rdp->nocb_gp_sleep));
- trace_rcu_nocb_wake(rcu_state.name, cpu, TPS("EndSleep"));
+ rdp_gp_sleep(my_rdp, cpu);
} else {
rnp = my_rdp->mynode;
trace_rcu_this_gp(rnp, my_rdp, wait_gp_seq, TPS("StartWait"));