Re: [GIT PULL rcu/next] RCU commits for 4.13

From: Paul E. McKenney
Date: Wed Jun 28 2017 - 20:46:31 EST


On Wed, Jun 28, 2017 at 05:05:46PM -0700, Linus Torvalds wrote:
> On Wed, Jun 28, 2017 at 4:54 PM, Paul E. McKenney
> <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> >
> > Linus, are you dead-set against defining spin_unlock_wait() to be
> > spin_lock + spin_unlock? For example, is the current x86 implementation
> > of spin_unlock_wait() really a non-negotiable hard requirement? Or
> > would you be willing to live with the spin_lock + spin_unlock semantics?
>
> So I think the "same as spin_lock + spin_unlock" semantics are kind of insane.
>
> One of the issues is that the same as "spin_lock + spin_unlock" is
> basically now architecture-dependent. Is it really the
> architecture-dependent ordering you want to define this as?
>
> So I just think it's a *bad* definition. If somebody wants something
> that is exactly equivalent to spin_lock+spin_unlock, then dammit, just
> do *THAT*. It's completely pointless to me to define
> spin_unlock_wait() in those terms.
>
> And if it's not equivalent to the *architecture* behavior of
> spin_lock+spin_unlock, then I think it should be described in terms
> that aren't about the architecture implementation (so you shouldn't
> describe it as "spin_lock+spin_unlock", you should describe it in
> terms of memory barrier semantics).
>
> And if we really have to use the spin_lock+spin_unlock semantics for
> this, then what is the advantage of spin_unlock_wait at all, if it
> doesn't fundamentally avoid some locking overhead of just taking the
> spinlock in the first place?
>
> And if we can't use a cheaper model, maybe we should just get rid of
> it entirely?
>
> Finally: if the memory barrier semantics are exactly the same, and
> it's purely about avoiding some nasty contention case, I think the
> concept is broken - contention is almost never an actual issue, and if
> it is, the problem is much deeper than spin_unlock_wait().

All good points!

I must confess that your sentence about getting rid of spin_unlock_wait()
entirely does resonate with me, especially given the repeated bouts of
"but what -exactly- is it -supposed- to do?" over the past 18 months
or so. ;-)

Just for completeness, here is a list of the definitions that have been
put forward, just in case it inspires someone to come up with something
better:

1. spin_unlock_wait() provides only acquire semantics. Code
placed after the spin_unlock_wait() will see the effects of
all previous critical sections, but there are no guarantees for
subsequent critical sections. The x86 implementation provides
this. I -think- that the ARM and PowerPC implementations could
get rid of a memory-barrier instruction and still provide this.

2. As #1 above, but a "smp_mb();spin_unlock_wait();" provides the
additional guarantee that code placed before this construct is
seen by all subsequent critical sections. The x86 implementation
provides this, as do ARM and PowerPC, but it is not clear that all
architectures do. As Alan noted, this is an extremely unnatural
definition for the current memory model.

3. [ Just for completeness, yes, this is off the table! ] The
spin_unlock_wait() has the same semantics as a spin_lock()
followed immediately by a spin_unlock().

4. spin_unlock_wait() is analogous to synchronize_rcu(), where
spin_unlock_wait()'s "read-side critical sections" are the lock's
normal critical sections. This was the first definition I heard
that made any sense to me, but it turns out to be equivalent
to #3. Thus, also off the table.
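
For concreteness, here is a rough userspace sketch of how definitions #1, #2, and #3 differ, written with C11 atomics rather than kernel primitives. Everything named toy_* is hypothetical and made up for illustration: this is a plain test-and-set lock, not any architecture's actual spinlock, and the C11 memory model is not the kernel's, so the orderings shown are only an approximation of the ones discussed above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical toy test-and-set lock; NOT the kernel's spinlock_t. */
struct toy_lock {
	atomic_bool locked;
};

static void toy_spin_lock(struct toy_lock *l)
{
	/* Acquire ordering on the exchange that takes the lock. */
	while (atomic_exchange_explicit(&l->locked, true,
					memory_order_acquire))
		; /* spin until the previous value was "unlocked" */
}

static void toy_spin_unlock(struct toy_lock *l)
{
	/* Unlocking a lock that is not held is a caller bug. */
	assert(atomic_load_explicit(&l->locked, memory_order_relaxed));
	atomic_store_explicit(&l->locked, false, memory_order_release);
}

/*
 * Definition #1 (assumed mapping): acquire-only wait.  It only reads
 * the lock word, so it pairs with the release in toy_spin_unlock() of
 * earlier critical sections, but it orders nothing against later ones.
 */
static void toy_unlock_wait_acquire(struct toy_lock *l)
{
	while (atomic_load_explicit(&l->locked, memory_order_acquire))
		; /* spin while someone holds the lock */
}

/*
 * Definition #2 (assumed mapping): a full fence, standing in for
 * smp_mb(), followed by the acquire-only wait of definition #1.
 */
static void toy_mb_then_unlock_wait(struct toy_lock *l)
{
	atomic_thread_fence(memory_order_seq_cst);
	toy_unlock_wait_acquire(l);
}

/*
 * Definition #3: literally lock then unlock.  Unlike the two variants
 * above, this writes the lock word and so contends with other lockers.
 */
static void toy_unlock_wait_lock_unlock(struct toy_lock *l)
{
	toy_spin_lock(l);
	toy_spin_unlock(l);
}
```

The visible difference is that the first two variants never store to the lock word, while the third one does, which is where any saving in locking overhead would have to come from.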

Does anyone know of any other possible definitions?

Thanx, Paul