Re: spin_unlock optimization(i386)

Oliver Xymoron (oxymoron@waste.org)
Thu, 25 Nov 1999 08:57:32 -0600 (CST)


On Wed, 24 Nov 1999, Linus Torvalds wrote:

> HOWEVER, we have a slightly different situation here:
>
> CPU 1 CPU 2
>
> read b
> a = 0
> read a, see 0
> write b
>
> Note how both CPU's only did ONE write. That one write does not have any
> real "order", and really write ordering is not an issue AT ALL.
>
> In physics, this issue is called "causality" rather than "order", and I
> don't think the issue has really been clarified.

If I understand correctly (based on my PPro docs), reads may pass writes
but only by moving before. Speculation only goes in one direction and
writes are never speculative. Therefore, in the sequence you've described,
with the posit that CPU 2 does see the write to a, we can be sure that
every operation on CPU 1 that should have programmatically preceded that
write has occurred. So causality is maintained.

What you can't be sure of is that things after the write haven't already
occurred - speculative execution of reads past the end of a critical
section or into the beginning of another:

CPU 1 CPU2
write data,shared
write 0,lock
bts lock
jc 2
read shared

Here, shared is guaranteed to be written in program order, that is before
lock. But we have no guarantee that CPU2's read doesn't speculatively
happen before the lock grab, which means causality here is not preserved.

Which is ok, of course, since we need to force atomicity of the lock grab
anyway and the way to do that is with the LOCK prefix, which ends up
serializing memory access.

--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.." 

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/