Re: [PATCH v2] srcu: Improve comments about acceleration leak

From: Joel Fernandes
Date: Tue Dec 12 2023 - 15:11:48 EST




> On Dec 10, 2023, at 8:57 PM, Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx> wrote:
>
> The comments added in commit 1ef990c4b36b ("srcu: No need to
> advance/accelerate if no callback enqueued") are a bit confusing to me.
> The comments are describing a scenario for code that was moved and is
> no longer the way it was (snapshot after advancing). Improve the code
> comments to reflect this and also document by acceleration can never
> fail.
>
> Cc: Frederic Weisbecker <frederic@xxxxxxxxxx>
> Cc: Neeraj Upadhyay <neeraj.iitr10@xxxxxxxxx>
> Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>

Do we want to quick review and put it in Neeraj PR?

Or next merge window ok with me. Just that then I have to keep track of it ;-)

Thanks,

- Joel



> ---
> v1->v2: Fix typo in change log.
>
> kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
> 1 file changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 0351a4e83529..051e149490d1 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1234,11 +1234,20 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> if (rhp)
> rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
> /*
> - * The snapshot for acceleration must be taken _before_ the read of the
> - * current gp sequence used for advancing, otherwise advancing may fail
> - * and acceleration may then fail too.
> + * It's crucial to capture the snapshot 's' for acceleration before
> + * reading the current gp_seq that is used for advancing. This is
> + * essential because if the acceleration snapshot is taken after a
> + * failed advancement attempt, there's a risk that a grace period may
> + * conclude and a new one may start in the interim. If the snapshot is
> + * captured after this sequence of events, the acceleration snapshot 's'
> + * could be excessively advanced, leading to acceleration failure.
> + * In such a scenario, an 'acceleration leak' can occur, where new
> + * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
> + * Also note that encountering advancing failures is a normal
> + * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
> *
> - * This could happen if:
> + * To see this, consider the following events which occur if
> + * rcu_seq_snap() were to be called after advance:
> *
> * 1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
> * RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
> @@ -1264,6 +1273,13 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> if (rhp) {
> rcu_segcblist_advance(&sdp->srcu_cblist,
> rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> + /*
> + * Acceleration can never fail because the state of gp_seq used
> + * for advancing is <= the state of gp_seq used for
> + * acceleration. This means that RCU_NEXT_TAIL segment will
> + * always be able to be emptied by the acceleration into the
> + * RCU_NEXT_READY_TAIL or RCU_WAIT_TAIL segments.
> + */
> WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
> }
> if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
> --
> 2.43.0.472.g3155946c3a-goog
>