Re: [patch] Re: spin_unlock optimization(i386)

Manfred (manfreds@colorfullife.com)
Sun, 21 Nov 1999 13:51:29 +0100


From: Ingo Molnar <mingo@chiara.csoma.elte.hu>
> [this very issue popped up a couple of days ago on FreeBSD mailing lists
> as well.]
>
I didn't read that; I noticed it while I was thinking about the rw
semaphore.

> as an additional optimization, instead of doing 'movl $0, %0' we rather
> want to use a 'movb $0, %0', because that has 3 bytes less instruction
> size and is still optimized in the CPU pipeline. [Although in the future
> this might be less and less the case. Anyway, right now it's very cheap.]
> This is safe because only the lowest bit is of interest to us.

There's an explicit warning in the Pentium Handbook, chapter 19.1.1
("LOCK Prefix and the LOCK# Signal"):
...
Semaphores (shared memory used for signalling between multiple processors)
should be accessed using identical address and length.
...
Btw, the semaphore/rw_lock operations change the flags register; do you know
why we don't specify "cc" as a clobber of the inline asm?
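To make both points concrete, here is a minimal sketch (hypothetical code, not the actual kernel source; the `my_spinlock_t` name is made up) of the two unlock variants under discussion, with the "memory" clobber and the "cc" clobber written out. The asm is guarded so the sketch also builds on non-x86 with a plain C fallback:

```c
/* Hypothetical sketch of the two i386 unlock variants discussed
 * above (GCC-style extended asm on x86; plain C elsewhere). */
typedef struct { volatile unsigned int lock; } my_spinlock_t;

static void unlock_movb(my_spinlock_t *s)
{
#if defined(__i386__) || defined(__x86_64__)
    /* Byte store: shorter encoding than "movl $0, mem", and only the
     * low bit of the lock word matters.  The "memory" clobber keeps
     * GCC from moving memory accesses across the unlock, but the CPU
     * sees only a plain store here.  No flags are touched, so no
     * "cc" clobber is needed for this variant. */
    __asm__ __volatile__("movb $0, %0" : "+m" (s->lock) : : "memory");
#else
    s->lock = 0;    /* fallback so the sketch compiles anywhere */
#endif
}

static void unlock_btrl(my_spinlock_t *s)
{
#if defined(__i386__) || defined(__x86_64__)
    /* The locked variant: clears bit 0 of the lock word.  btrl also
     * writes the carry flag, which is why "cc" belongs in the clobber
     * list -- the point of the question above. */
    __asm__ __volatile__("lock; btrl $0, %0"
                         : "+m" (s->lock) : : "memory", "cc");
#else
    s->lock &= ~1u;
#endif
}
```

After either call, a lock word of 1 becomes 0; the difference is only in instruction length, flags side effects, and (as discussed below) barrier strength.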

> > IA32 never reorders write operations, ie even without the "lock;" prefix
> > spin_unlock() is still a write memory barrier.
>
> yep, this should be possible. I remember having hacked in something like
> that a year ago but i saw crashes - although that might be an unrelated
> thing.
>
> > [I guess it's too late to change that for the 2.4 timeframe, [...]
>
> if this is safe from the cache-coherency point of view then we should do
> it now (patch attached). [We still want to keep it in volatile assembly so
> that both GCC and the CPU sees a barrier.]
>
With "lock;btrl" the CPU sees a read+write memory barrier; with "mov", it's
only a write memory barrier. Are you sure that this is not a problem?
Both Alpha and Sparc64 have a full (i.e. read+write) memory barrier in
spin_unlock.
Perhaps you should ask the Alpha and Sparc64 maintainers?
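In later C11-atomics terms (an analogy I'm adding for illustration, not code from any kernel), the barrier-strength difference can be sketched like this: the plain "mov" unlock corresponds to a release store, which orders earlier writes but is only a write barrier, while a locked RMW such as "lock;btrl" behaves like a full (read+write) barrier, i.e. a seq_cst exchange:

```c
/* Sketch of the two barrier strengths in C11 atomics terms
 * (an analogy for the discussion above, not kernel code). */
#include <stdatomic.h>

static atomic_uint lock_word;

/* "mov"-style unlock: earlier stores are ordered before the release,
 * but later reads may still be executed early -- a write barrier only. */
static void unlock_release(void)
{
    atomic_store_explicit(&lock_word, 0, memory_order_release);
}

/* "lock;btrl"-style unlock: a locked read-modify-write, which on x86
 * acts as a full read+write barrier, like a seq_cst exchange. */
static void unlock_full(void)
{
    atomic_exchange_explicit(&lock_word, 0, memory_order_seq_cst);
}
```

Both leave the lock word at 0; the question in the thread is whether the weaker, release-only ordering of the plain store is sufficient for spin_unlock.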

--
    Manfred

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/