Re: [PATCH] rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()

From: Paul E. McKenney
Date: Thu Jun 10 2021 - 20:49:21 EST


On Fri, Jun 11, 2021 at 09:28:10AM +0900, Akira Yokosawa wrote:
> On Thu, 10 Jun 2021 09:57:10 -0700, Paul E. McKenney wrote:
> > On Thu, Jun 10, 2021 at 05:50:29PM +0200, Frederic Weisbecker wrote:
> >> Add some missing critical pieces of explanation to understand the need
> >> for full memory barriers throughout the whole grace period state machine,
> >> thanks to Paul's explanations.
> >>
> >> Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> >> Cc: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
> >> Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> >> Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
> >> Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> >
> > Nice!!! And not bad wording either, though I still could not resist the
> > urge to wordsmith further. Plus I combined your two examples, in order to
> > provide a trivial example use of the polling interfaces, if nothing else.
> >
> > Please let me know if I messed anything up.
>
> Hi Paul,
>
> See minor tweaks below to satisfy sphinx.
>
> >
> > Thanx, Paul
> >
> > ------------------------------------------------------------------------
> >
> > commit f21b8fbdf9a59553da825265e92cedb639b4ba3c
> > Author: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Date: Thu Jun 10 17:50:29 2021 +0200
> >
> > rcu/doc: Add a quick quiz to explain further why we need smp_mb__after_unlock_lock()
> >
> > Add some missing critical pieces of explanation to understand the need
> > for full memory barriers throughout the whole grace period state machine,
> > thanks to Paul's explanations.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > Cc: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
> > Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
> > Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> >
> > diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > index 11cdab037bff..3cd5cb4d86e5 100644
> > --- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > +++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
> > @@ -112,6 +112,35 @@ on PowerPC.
> > The ``smp_mb__after_unlock_lock()`` invocations prevent this
> > ``WARN_ON()`` from triggering.
> >
> > ++-----------------------------------------------------------------------+
> > +| **Quick Quiz**: |
> > ++-----------------------------------------------------------------------+
> > +| But the whole chain of rcu_node-structure locking guarantees that |
> > +| readers see all pre-grace-period accesses from the updater and |
> > +| also guarantees that the updater to see all post-grace-period |
> > +| accesses from the readers. So why do we need all of those calls |
> > +| to smp_mb__after_unlock_lock()? |
> > ++-----------------------------------------------------------------------+
> > +| **Answer**: |
> > ++-----------------------------------------------------------------------+
> > +| Because we must provide ordering for RCU's polling grace-period |
> > +| primitives, for example, get_state_synchronize_rcu() and |
> > +| poll_state_synchronize_rcu(). For example: |
> > +| |
> > +| CPU 0 CPU 1 |
> > +| ---- ---- |
> > +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> > +| g = get_state_synchronize_rcu() smp_mb() |
> > +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> > +| continue; |
>
> This indent causes warnings from sphinx:
>
> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:135: WARNING: Unexpected indentation.
> Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst:137: WARNING: Block quote ends without a blank line; unexpected unindent
>
> > +| r0 = READ_ONCE(Y) |
> > +| |
> > +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
> > +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
> > +| or offline) and thus won't interact directly with the RCU core |
> > +| processing at all. |
> > ++-----------------------------------------------------------------------+
> > +
> > This approach must be extended to include idle CPUs, which need
> > RCU's grace-period memory ordering guarantee to extend to any
> > RCU read-side critical sections preceding and following the current
>
> The code block in the answer can be fixed as follows:
>
> ++-----------------------------------------------------------------------+
> +| **Answer**: |
> ++-----------------------------------------------------------------------+
> +| Because we must provide ordering for RCU's polling grace-period |
> +| primitives, for example, get_state_synchronize_rcu() and |
> +| poll_state_synchronize_rcu(). For example:: |
> +| |
> +| CPU 0 CPU 1 |
> +| ---- ---- |
> +| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
> +| g = get_state_synchronize_rcu() smp_mb() |
> +| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
> +| continue; |
> +| r0 = READ_ONCE(Y) |
> +| |
> +| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
> +| happen, even if CPU 1 is in an RCU extended quiescent state (idle |
> +| or offline) and thus won't interact directly with the RCU core |
> +| processing at all. |
> ++-----------------------------------------------------------------------+
>
> Hint: Use of "::" and indented code block.

Thank you!

As in with the following patch to be merged into Frederic's original,
with attribution?

Thanx, Paul

------------------------------------------------------------------------

diff --git a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
index 3cd5cb4d86e5..bc884ebf88bb 100644
--- a/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
+++ b/Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst
@@ -125,15 +125,15 @@ The ``smp_mb__after_unlock_lock()`` invocations prevent this
+-----------------------------------------------------------------------+
| Because we must provide ordering for RCU's polling grace-period |
| primitives, for example, get_state_synchronize_rcu() and |
-| poll_state_synchronize_rcu(). For example: |
+| poll_state_synchronize_rcu(). For example:: |
| |
-| CPU 0 CPU 1 |
-| ---- ---- |
-| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
-| g = get_state_synchronize_rcu() smp_mb() |
-| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
-| continue; |
-| r0 = READ_ONCE(Y) |
+| CPU 0 CPU 1 |
+| ---- ---- |
+| WRITE_ONCE(X, 1) WRITE_ONCE(Y, 1) |
+| g = get_state_synchronize_rcu() smp_mb() |
+| while (!poll_state_synchronize_rcu(g)) r1 = READ_ONCE(X) |
+| continue; |
+| r0 = READ_ONCE(Y) |
| |
| RCU guarantees that that the outcome r0 == 0 && r1 == 0 will not |
| happen, even if CPU 1 is in an RCU extended quiescent state (idle |