Re: [PATCH RFC] srcu: Yet more detail for srcu_readers_active_idx_check() comments

From: Paul E. McKenney
Date: Fri Dec 16 2022 - 11:52:20 EST


On Fri, Dec 16, 2022 at 04:32:39PM +0000, Joel Fernandes wrote:
> On Thu, Dec 15, 2022 at 05:09:14PM -0800, Paul E. McKenney wrote:
> [...]
> > > >> 2. unlock()'s smp_mb() happened before Flip+smp_mb() , now the reader
> > > >> has no new smp_mb() that happens AFTER the flip happened. So it can
> > > >> totally sample the old idx again -- that particular reader will
> > > >> increment twice, but the next time, it will see the flipped one.
> > > >
> > > > I will let you transliterate both. ;-)
> > >
> > > I think I see what you mean now :)
> > >
> > > I believe the access I am referring to is the read of idx on one side and
> > > the write to idx on the other. However that is incomplete and I need to
> > > pair that with some of other access on both sides.
> > >
> > > So perhaps this:
> > >
> > > Writer does flip + smp_mb + read unlock counts [1]
> > >
> > > Reader does:
> > > read idx + smp_mb() + increment lock counts [2]
> > >
> > > And subsequently reader does
> > > Smp_mb() + increment unlock count. [3]
> > >
> > > So [1] races with either [2] or [2]+[3].
> > >
> > > Is that fair?
> >
> > That does look much better, thank you!
>
> Perhaps a comment with an ASCII diagram will help?
>
>
> Case 2:
> Both the reader and the updater see each other's writes too late, but because
> of memory barriers on both sides, they will eventually see each other's write
> with respect to their own. This is similar to the store-buffer problem. This
> let's a single reader contribute a maximum (unlock minus lock) imbalance of 2.
>
> The following diagram shows the subtle worst case followed by a simplified
> store-buffer explanation.
>
> READER UPDATER
> ------------- ----------
> // idx is initially 0.
> read_lock() {
> READ(idx) = 0;
> lock[0]++; --------------------------------------------,
> flip() { |
> smp_mb(); |
> smp_mb(); |
> } |
> |
> // RSCS |
> |
> read_unlock() { |
> smp_mb(); |
> idx++; // P |
> smp_mb(); |
> } |
> |
> scan_readers_idx(0) { |
> count all unlock[0]; |
> | |
> | |
> unlock[0]++; //X <--not-counted--`-----, |
> | |
> } V `------,
> // Will make sure next scan |
> // will not miss this unlock (X) |
> // if other side saw flip (P) ,--`
> // Call this MB [1] |
> // Order write(idx) with |
> // next scan's unlock. |
> smp_mb(); ,---`
> read_lock() { |
> READ(idx)=0; |
> lock[0]++; ----------------> count all lock[0]; |
> smp_mb(); | } |
> } | | V
> | `---> // Incorrect contribution to lock counting
> | // upto a maximum of 2 times.
> |
> `---> // Pairs with MB [1]. Makes sure that
> // the next read_lock()'s' idx read (Y) is ordered
> // with above write to unlock[0] (X).
> |
> rcu_read_unlock() { |
> smp_mb(); <---------------`
> unlock[0]++;
> }
>
> read_lock() {
> READ(idx) = 1; //Y
> lock[1]++;
> ...
> }
> scan_readers_idx(0) {
> count all unlock[0]; //Q
> ...
>
>
> thanks,
>
> - Joel
>
> }
>
> This makes it similar to the store buffer pattern. Using X, Y, P and Q
> annotated above, we get:
>
> READER UPDATER
> X (write) P (write)
>
> smp_mb(); smp_mb();
>
> Y (read) Q (read)

Given that this diagram is more than 50 lines long, it might go better in
a design document describing this part of RCU. Perhaps less detail or
segmented, but the same general idea as this guy:

Documentation/RCU/Design/Memory-Ordering/Tree-RCU-Memory-Ordering.rst

Thoughts?

Thanx, Paul