Re: [PATCH V2 11/19] csky: Atomic operations

From: Peter Zijlstra
Date: Fri Jul 06 2018 - 07:57:02 EST


On Fri, Jul 06, 2018 at 07:01:31PM +0800, Guo Ren wrote:
> On Thu, Jul 05, 2018 at 07:50:59PM +0200, Peter Zijlstra wrote:

> > What are the memory ordering rules for your LDEX/STEX?
> Every CPU has a local exclusive monitor.
>
> "Ldex rz, (rx, #off)" will add an entry into the local monitor, and the
> entry is composed of an address tag and an exclusive flag (initialized to 1).
> Any store (including those from other cores) will clear the exclusive flag
> to 0 in the entry indexed by the address tag.
>
> "Stex rz, (rx, #off)" has two outcomes:
> 1. Store Success: When the entry's exclusive flag is 1, it will store rz
> to the [rx + off] address and rz will be set to 1.
> 2. Store Failure: When the entry's exclusive flag is 0, nothing is stored
> and rz is just set to 0.

That's how LL/SC works. What I was asking is if they have any effect on
memory ordering. Some architectures have LL/SC imply memory ordering,
most do not.

Going by your spinlock implementation they don't imply any memory
ordering.
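
For reference, the monitor behaviour described above can be modelled in
plain C. This is only a minimal sketch of the description in this thread,
not the C-SKY hardware: the names excl_monitor, plain_store and
monitor_demo are made up, and failing stex on a tag mismatch is my
assumption. Note that nothing in it says anything about the ordering of
the surrounding loads and stores, which is the question here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one CPU's local exclusive monitor entry. */
struct excl_monitor {
	uint32_t *tag;	/* address tag recorded by the last ldex */
	int excl;	/* exclusive flag: 1 after ldex, 0 once broken */
};

/* ldex: load the word and arm the monitor for that address. */
static uint32_t ldex(struct excl_monitor *m, uint32_t *addr)
{
	m->tag = addr;
	m->excl = 1;
	return *addr;
}

/* Any ordinary store (local or remote) breaks a matching entry. */
static void plain_store(struct excl_monitor *m, uint32_t *addr, uint32_t val)
{
	*addr = val;
	if (m->tag == addr)
		m->excl = 0;
}

/*
 * stex: store only if the flag is still set; returns 1 on success and
 * 0 on failure. Failing on a tag mismatch is an assumption. A
 * successful stex is itself a store, so it clears the flag too.
 */
static int stex(struct excl_monitor *m, uint32_t *addr, uint32_t val)
{
	if (m->tag != addr || !m->excl)
		return 0;
	*addr = val;
	m->excl = 0;
	return 1;
}

/* Usage: an undisturbed ldex/stex pair succeeds, a disturbed one fails. */
void monitor_demo(void)
{
	uint32_t word = 5;
	struct excl_monitor m = { 0, 0 };

	assert(ldex(&m, &word) == 5);
	assert(stex(&m, &word, 7) == 1 && word == 7);

	(void)ldex(&m, &word);
	plain_store(&m, &word, 9);	/* intervening store breaks the flag */
	assert(stex(&m, &word, 11) == 0 && word == 9);
}
```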

> > The mandated semantics for xchg() / cmpxchg() is an effective smp_mb()
> > before _and_ after.
>
>       switch (size) {                                         \
>       case 4:                                                 \
>               smp_mb();                                       \
>               asm volatile (                                  \
>               "1:     ldex.w          %0, (%3) \n"            \
>               "       mov             %1, %2   \n"            \
>               "       stex.w          %1, (%3) \n"            \
>               "       bez             %1, 1b   \n"            \
>                       : "=&r" (__ret), "=&r" (tmp)            \
>                       : "r" (__new), "r"(__ptr)               \
>                       : "memory");                            \
>               smp_mb();                                       \
>               break;                                          \
> Hmm?
> But I couldn't understand what's wrong without the first smp_mb()?
> The first smp_mb() will make all ld/st finish before ldex.w. Is it necessary?

Yes.

	CPU0				CPU1

	r1 = READ_ONCE(x);		WRITE_ONCE(y, 1);
	r2 = xchg(&y, 2);		smp_store_release(&x, 1);

must not allow: r1==1 && r2==0
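
To see why, here is a rough sketch in plain C that walks every
interleaving of the litmus test above. It is a toy model, not a memory
model: each access is one indivisible step, and the missing leading
barrier is modelled (as an assumption) by simply letting CPU0's xchg run
before its earlier load. All names in it are made up for illustration.

```c
#include <assert.h>

/* Shared memory plus CPU0's result registers for one execution. */
struct litmus_state {
	int x, y;
	int r1, r2;
};

enum litmus_op { READ_X, XCHG_Y, WRITE_Y, RELEASE_X };

static void litmus_step(struct litmus_state *s, enum litmus_op op)
{
	switch (op) {
	case READ_X:	/* r1 = READ_ONCE(x) */
		s->r1 = s->x;
		break;
	case XCHG_Y:	/* r2 = xchg(&y, 2), as one indivisible step */
		s->r2 = s->y;
		s->y = 2;
		break;
	case WRITE_Y:	/* WRITE_ONCE(y, 1) */
		s->y = 1;
		break;
	case RELEASE_X:	/* smp_store_release(&x, 1) */
		s->x = 1;
		break;
	}
}

/*
 * Return 1 if some interleaving ends with r1==1 && r2==0.
 * cpu0_reordered != 0 models the xchg passing the earlier load.
 */
static int forbidden_reachable(int cpu0_reordered)
{
	enum litmus_op cpu0[2] = { READ_X, XCHG_Y };
	const enum litmus_op cpu1[2] = { WRITE_Y, RELEASE_X };
	unsigned int mask;

	if (cpu0_reordered) {
		cpu0[0] = XCHG_Y;
		cpu0[1] = READ_X;
	}

	/* Bit i of mask set: slot i runs CPU0's next op, else CPU1's. */
	for (mask = 0; mask < 16; mask++) {
		struct litmus_state s = { 0, 0, 0, 0 };
		int i, n0 = 0, n1 = 0, bits = 0;

		for (i = 0; i < 4; i++)
			bits += (mask >> i) & 1;
		if (bits != 2)	/* each CPU runs exactly two ops */
			continue;

		for (i = 0; i < 4; i++) {
			if ((mask >> i) & 1)
				litmus_step(&s, cpu0[n0++]);
			else
				litmus_step(&s, cpu1[n1++]);
		}
		if (s.r1 == 1 && s.r2 == 0)
			return 1;
	}
	return 0;
}

void litmus_selfcheck(void)
{
	assert(forbidden_reachable(0) == 0);	/* program order kept */
	assert(forbidden_reachable(1) == 1);	/* xchg passed the load */
}
```

With CPU0 kept in program order the outcome cannot happen: r1==1 means
the release store already ran, so y was already 1 when the xchg read it.
Once the xchg is allowed to pass the load, the forbidden outcome shows up.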

> > The above implementation suggests LDEX implies a SYNC.IS, is this
> > correct?
> No, ldex doesn't imply a sync.is.

Right, as per the spinlock emails, then your proposed primitives are
incorrect.