Re: [RFC 0/2] srcu: Remove pre-flip memory barrier

From: Mathieu Desnoyers
Date: Wed Dec 21 2022 - 11:30:41 EST


On 2022-12-20 23:26, Joel Fernandes wrote:


On Dec 20, 2022, at 10:43 PM, Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:

On 2022-12-20 19:58, Frederic Weisbecker wrote:
On Wed, Dec 21, 2022 at 01:49:57AM +0100, Frederic Weisbecker wrote:
On Tue, Dec 20, 2022 at 07:15:00PM -0500, Joel Fernandes wrote:
On Tue, Dec 20, 2022 at 5:45 PM Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
Agreed about (1).

_ In (2), E pairs with the address-dependency between idx and lock_count.

But that is not the only reason. If that was the only reason for (2),
then there is an smp_mb() just before the next-scan post-flip before
the lock counts are read.

The post-flip barrier makes sure the new idx is visible on the next READER's
turn, but it doesn't protect against the fact that "READ idx then WRITE lock[idx]"
may appear unordered from the update side POV if there is no barrier between the
scan and the flip.

If you remove the smp_mb() from the litmus test I sent, things explode.
Or rather, look at it the other way, if there is no barrier between the lock
scan and the index flip (E), then the index flip can appear to be written before the
lock is read. Which means you may start activating the index before you finish
reading it (at least it appears that way from the reader's point of view).
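
For reference, the ordering being argued about can be sketched in userspace C11 atomics. This is only a simplified model, not the kernel code: struct srcu_model and the model_* helpers are hypothetical names, a single counter pair per index stands in for the per-CPU counters, and seq_cst fences stand in for smp_mb(). E and D are labelled as they are used in this thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical simplified model: one counter pair per index, not per-CPU. */
struct srcu_model {
	atomic_uint idx;         /* current index, 0 or 1 */
	atomic_ulong lock[2];    /* read-side entries counted per index */
	atomic_ulong unlock[2];  /* read-side exits counted per index */
};

/* Reader entry: READ idx, then WRITE lock[idx], then a full barrier. */
static unsigned model_read_lock(struct srcu_model *s)
{
	unsigned idx = atomic_load_explicit(&s->idx, memory_order_relaxed) & 1u;

	atomic_fetch_add_explicit(&s->lock[idx], 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	return idx;
}

/* Reader exit: full barrier, then count the exit on the same index. */
static void model_read_unlock(struct srcu_model *s, unsigned idx)
{
	atomic_thread_fence(memory_order_seq_cst);
	atomic_fetch_add_explicit(&s->unlock[idx], 1, memory_order_relaxed);
}

/* Updater scan: read unlocks first, then locks, with a barrier in between. */
static bool model_readers_quiesced(struct srcu_model *s, unsigned idx)
{
	unsigned long unlocks =
		atomic_load_explicit(&s->unlock[idx], memory_order_relaxed);

	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&s->lock[idx], memory_order_relaxed) == unlocks;
}

/* Updater flip, with the two barriers discussed in this thread. */
static void model_flip(struct srcu_model *s)
{
	atomic_thread_fence(memory_order_seq_cst);  /* E: between scan and flip */
	atomic_store_explicit(&s->idx,
		atomic_load_explicit(&s->idx, memory_order_relaxed) ^ 1u,
		memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);  /* D: post-flip */
}
```

Dropping the fence marked E in model_flip() gives the variant under discussion, where the index store is no longer ordered after the preceding scan's loads.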

Considering that pre-existing readers for an arbitrary index can appear anywhere in the grace period (because a reader can fetch the
index and be preempted for an arbitrary amount of time before incrementing the lock count), the grace period algorithm needs to deal with the fact that a newly arriving reader can appear in a given index either before or after the flip.
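
A single-threaded sketch of that window, with the reader's entry split into its two steps. The struct and helper names are hypothetical and plain integers replace the per-CPU counters, so this only illustrates the interleaving, not the memory ordering.

```c
#include <stdbool.h>

/* Hypothetical flattened state: no per-CPU counters, no atomics. */
struct srcu_counts {
	unsigned idx;             /* current index, 0 or 1 */
	unsigned long lock[2];    /* entries counted per index */
	unsigned long unlock[2];  /* exits counted per index */
};

/* Reader step 1: sample the index. The reader may be preempted right here. */
static unsigned reader_fetch_idx(const struct srcu_counts *s)
{
	return s->idx & 1u;
}

/* Reader step 2: count the entry on the index sampled in step 1. */
static void reader_count_lock(struct srcu_counts *s, unsigned idx)
{
	s->lock[idx]++;
}

/* Reader exit: count the exit on the same index. */
static void reader_count_unlock(struct srcu_counts *s, unsigned idx)
{
	s->unlock[idx]++;
}

/* Updater: flip the index that new readers will sample. */
static void updater_flip(struct srcu_counts *s)
{
	s->idx ^= 1u;
}

/* Updater: an index looks idle when entries and exits match. */
static bool idx_is_quiescent(const struct srcu_counts *s, unsigned idx)
{
	return s->lock[idx] == s->unlock[idx];
}
```

If a reader calls reader_fetch_idx(), the updater then calls updater_flip(), and only afterwards does the reader call reader_count_lock(), the entry lands on the old index after the flip. A later scan of that index has to account for it.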

I don't see how flipping the index before or after loading the unlock/lock values would break anything (except for unlikely counter overflow situations as previously discussed).

If you say unlikely, that means it can happen sometimes, which is bad enough ;-). Maybe you mean impossible.

I mean that if synchronize_srcu() is preempted long enough for 2^32 or 2^64 concurrent SRCU read-side critical sections to appear, I strongly suspect that RCU stall detection will yell loudly. And if it does not already, then we should make it so.

So I mean "impossible unless the system is already unusable", rather than just "unlikely".
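
The wrap scenario is easier to see with a deliberately tiny counter. The sketch below assumes a hypothetical 8-bit counter purely for illustration (the thread is about 2^32 or 2^64); the helper names are made up.

```c
#include <stdbool.h>
#include <stdint.h>

/* The scan's check: an index looks idle when the two sums are equal. */
static bool scan_sees_no_readers_u8(uint8_t locks, uint8_t unlocks)
{
	return locks == unlocks;
}

/* Value of an 8-bit counter after n more increments (wraps modulo 256). */
static uint8_t count_after_u8(uint8_t start, unsigned n)
{
	return (uint8_t)(start + n);
}
```

With unlocks held at 0, 255 outstanding entries still compare unequal, but 256 outstanding entries wrap the lock count back to 0 and the check wrongly reports no readers. The real counters need 2^32 or 2^64 such entries inside one scan window for the same effect.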

Thanks,

Mathieu

I would not settle for anything less than keeping the memory barrier around if it helps unlikely cases, but only D does help for the theoretical wrapping/overflow issue. E is broken and does not even help the theoretical issue IMO. And both D and E do not affect correctness IMO.

Anyway, in all likelihood I will be trying to remove E completely and clarify the docs on D in the coming weeks. I will also try to reduce the size of the counters, per our discussions.

Thanks.




Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

