Re: [PATCH] barriers: introduce smp_mb__release_acquire and update documentation

From: Paul E. McKenney
Date: Tue Sep 15 2015 - 13:47:56 EST


On Tue, Sep 15, 2015 at 05:13:30PM +0100, Will Deacon wrote:
> As much as we'd like to live in a world where RELEASE -> ACQUIRE is
> always cheaply ordered and can be used to construct UNLOCK -> LOCK
> definitions with similar guarantees, the grim reality is that this isn't
> even possible on x86 (thanks to Paul for bringing us crashing down to
> Earth).

"It is a service that I provide." ;-)

> This patch handles the issue by introducing a new barrier macro,
> smp_mb__release_acquire, that can be placed between a RELEASE and a
> subsequent ACQUIRE operation in order to upgrade them to a full memory
> barrier. At the moment, it doesn't have any users, so its existence
> serves mainly as a documentation aid.
>
> Documentation/memory-barriers.txt is updated to describe more clearly
> the ACQUIRE and RELEASE ordering in this area and to show an example of
> the new barrier in action.
>
> Cc: Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Signed-off-by: Will Deacon <will.deacon@xxxxxxx>

Some questions and comments below.

Thanx, Paul

> ---
>
> Following our discussion at [1], I thought I'd try to write something
> down...
>
> [1] http://lkml.kernel.org/r/20150828104854.GB16853@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
>
> Documentation/memory-barriers.txt | 23 ++++++++++++++++++++++-
> arch/powerpc/include/asm/barrier.h | 1 +
> arch/x86/include/asm/barrier.h | 2 ++
> include/asm-generic/barrier.h | 4 ++++
> 4 files changed, 29 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index 2ba8461b0631..46a85abb77c6 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -459,11 +459,18 @@ And a couple of implicit varieties:
> RELEASE on that same variable are guaranteed to be visible. In other
> words, within a given variable's critical section, all accesses of all
> previous critical sections for that variable are guaranteed to have
> - completed.
> + completed. If the RELEASE and ACQUIRE operations act on independent
> + variables, an smp_mb__release_acquire() barrier can be placed between
> + them to upgrade the sequence to a full barrier.
>
> This means that ACQUIRE acts as a minimal "acquire" operation and
> RELEASE acts as a minimal "release" operation.
>
> +A subset of the atomic operations described in atomic_ops.txt have ACQUIRE
> +and RELEASE variants in addition to fully-ordered and relaxed definitions.
> +For compound atomics performing both a load and a store, ACQUIRE semantics
> +apply only to the load and RELEASE semantics only to the store portion of
> +the operation.
>
> Memory barriers are only required where there's a possibility of interaction
> between two CPUs or between a CPU and a device. If it can be guaranteed that
> @@ -1895,6 +1902,20 @@ the RELEASE would simply complete, thereby avoiding the deadlock.
> a sleep-unlock race, but the locking primitive needs to resolve
> such races properly in any case.
>
> +If necessary, ordering can be enforced by use of an
> +smp_mb__release_acquire() barrier:
> +
> + *A = a;
> + RELEASE M
> + smp_mb__release_acquire();
> + ACQUIRE N
> + *B = b;
> +
> +in which case, the only permitted sequences are:
> +
> + STORE *A, RELEASE M, ACQUIRE N, STORE *B
> + STORE *A, ACQUIRE N, RELEASE M, STORE *B
> +
> Locks and semaphores may not provide any guarantee of ordering on UP compiled
> systems, and so cannot be counted on in such a situation to actually achieve
> anything at all - especially with respect to I/O accesses - unless combined
> diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h
> index 0eca6efc0631..919624634d0a 100644
> --- a/arch/powerpc/include/asm/barrier.h
> +++ b/arch/powerpc/include/asm/barrier.h
> @@ -87,6 +87,7 @@ do { \
> ___p1; \
> })
>
> +#define smp_mb__release_acquire() smp_mb()

If we are handling locking the same as atomic acquire and release
operations, this could also be placed between the unlock and the lock.

However, independently of the unlock/lock case, this definition and
use of smp_mb__release_acquire() does not handle full ordering of a
release by one CPU and an acquire of that same variable by another.
In that case, we need roughly the same setup as the much-maligned
smp_mb__after_unlock_lock(). So, do we care about this case? (RCU does,
though I am not 100% sure about any other subsystems.)

> #define smp_mb__before_atomic() smp_mb()
> #define smp_mb__after_atomic() smp_mb()
> #define smp_mb__before_spinlock() smp_mb()
> diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
> index 0681d2532527..1c61ad251e0e 100644
> --- a/arch/x86/include/asm/barrier.h
> +++ b/arch/x86/include/asm/barrier.h
> @@ -85,6 +85,8 @@ do { \
> ___p1; \
> })
>
> +#define smp_mb__release_acquire() smp_mb()
> +
> #endif
>
> /* Atomic operations are already serializing on x86 */
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index b42afada1280..61ae95199397 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -119,5 +119,9 @@ do { \
> ___p1; \
> })
>
> +#ifndef smp_mb__release_acquire
> +#define smp_mb__release_acquire() do { } while (0)

Doesn't this need to be barrier() in the case where one variable was
released and another was acquired?

> +#endif
> +
> #endif /* !__ASSEMBLY__ */
> #endif /* __ASM_GENERIC_BARRIER_H */
> --
> 2.1.4
>
