Re: [RFC 0/2] srcu: Remove pre-flip memory barrier

From: Joel Fernandes
Date: Tue Dec 20 2022 - 18:46:28 EST


On Tue, Dec 20, 2022 at 6:05 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Tue, Dec 20, 2022 at 07:06:57PM +0000, Joel Fernandes wrote:
> > On Tue, Dec 20, 2022 at 7:01 PM Mathieu Desnoyers
> > <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> > >
> > > On 2022-12-20 13:29, Joel Fernandes wrote:
> > > >
> > >
> > > > I do want to finish my memory barrier studies of SRCU over the holidays since I have been deep in the hole with that already. Back to the post flip memory barrier here since I think now even that might not be needed…
> > >
> > > I strongly suspect the memory barrier after flip is useless for the same
> > > reasons I mentioned explaining why the barrier before the flip is useless.
> > >
> > > However, we need to double-check that we have memory barriers at the
> > > beginning and end of synchronize_srcu, and between load of "unlock"
> > > counters and load of "lock" counters.
> > >
> > > Where is the barrier at the beginning of synchronize_srcu ?
> >
> > I believe we don't need another memory barrier at the beginning of
> > synchronize_srcu() (but this part of my SRCU study is still pending
> > ;)) . The grace period guarantee (read-side critical sections don't
> > span the GP) is already enforced by the memory barrier between
> > scanning for all unlocks, and scanning for all locks (Paired with
> > corresponding memory barriers on the read-side).
> >
> > Accordingly, before we scan all locks and match lock == unlock, there
> > is an smp_mb(). Why is that not sufficient?
>
> That's not enough, you still need a barrier between the updater's pre-GP
> accesses and the scans, so that post-GP read side sees the updater's pre-GP
> accesses:
>
>
> UPDATER READER
> ------- ------
> WRITE A WRITE srcu_read_lock
> smp_mb() //rcu_seq_snap() smp_mb()
> READ srcu_read_lock //scans READ A

But see the comments also in srcu_readers_active_idx_check()

* Needs to be a smp_mb() as the read side may
* contain a read from a variable that is written to before the
* synchronize_srcu() in the write side

So that appears to be already covered. Or is your point that the scans
are not happening on the same CPU as the pre-GP writer, as scans are
happening from workqueue ?

Perhaps that comment misled me.

Confused,

- Joel