Re: [patch] Re: spin_unlock optimization(i386)

Gerard Roudier (groudier@club-internet.fr)
Sun, 21 Nov 1999 23:21:22 +0100 (MET)


On Sun, 21 Nov 1999, Manfred Spraul wrote:

> Linus Torvalds wrote:
> >
> > Note that as I said, even on current models the optimization is illegal.
> >
> [...]
> > So what you could end up doing is equivalent to
> >
> > CPU#1 CPU#2
> >
> > b = a; /* cache miss, we'll delay this.. */
> > spin_unlock();
> >
> > spin_lock();
> > a = 1;
>
> You have reordered write _ahead_ of read, and the effects were
> externally visible.
> This cannot happen:
>
> Pentium Processor Familly Developers Manual, Vol3, Chapter 19.2 Memory
> Access Ordering:
>
> "However, to optimize performance, the Pentium processor allows memory
> reads to be reordered ahead of buffered writes in most situations.
> Internally, CPU reads (cache hits) can be reordered around buffered
> writes. Memory reordering does not occur at the pins, reads (cache miss)
> and writes appear in-order."
>
> The second CPU will never see the spin_unlock() before the "b=a" line.

You should read:

'Intel Architecture Software Developer's Manual'
'Volume 3: System programming guide'
'7.2.2'

And then, may-be, you will understand, as I did, that a P6 processor is
consistent with itself when optimizing reads ahead writes and no more.
Reading more, you may end up thinking, as I did, that 'serialization'
followed by a LOCK seem to be the only way for ensuring that some 'read'
from some location will not break consistency around some 'semaphore' in
SMP.

The LOCK prefix is doing exactly what we expect from the 'unlock' of a
spinlock on a P6, in my opinion.

Gérard.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/