[patch] Re: spin_unlock optimization(i386)

Ingo Molnar (mingo@chiara.csoma.elte.hu)
Sun, 21 Nov 1999 13:32:26 +0100 (CET)


On Sat, 20 Nov 1999, Manfred Spraul wrote:

> the current spin_unlock asm code is
> "lock; btrl $0,%0"
> it takes ~ 22 ticks on my PII/350.
>
> I think it's possible to replace that with
> "movl $0,%0"
> which would be a simple, pairable single-tick instruction.

[this very issue popped up a couple of days ago on the FreeBSD mailing
lists as well.]

> IA32 never reorders write operations, ie even without the "lock;" prefix
> spin_unlock() is still a write memory barrier.

yep, this should be possible. I remember having hacked in something like
that a year ago but i saw crashes - although that might be an unrelated
thing.

> [I guess it's too late to change that for the 2.4 timeframe, [...]

if this is safe from the cache-coherency point of view then we should do
it now (patch attached). [We still want to keep it in volatile assembly so
that both GCC and the CPU see a barrier.]
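
To illustrate the point about volatile assembly, here is a minimal
user-space sketch, not the kernel's code. The names demo_spinlock_t,
demo_spin_lock and demo_spin_unlock are made up for this example, and the
lock side uses a GCC __atomic builtin instead of the kernel's asm. The
"memory" clobber is what makes GCC treat the plain store as a compiler
barrier; on non-x86 targets the sketch falls back to a release store.

```c
/* Hypothetical stand-in for the kernel's spinlock_t (not the real type). */
typedef struct {
    volatile unsigned int lock;   /* 0 = free, 1 = held */
} demo_spinlock_t;

static inline void demo_spin_lock(demo_spinlock_t *l)
{
    /* Atomically write 1 and retry until the previous value was 0. */
    while (__atomic_exchange_n(&l->lock, 1u, __ATOMIC_ACQUIRE) != 0)
        ;
}

static inline void demo_spin_unlock(demo_spinlock_t *l)
{
#if defined(__i386__) || defined(__x86_64__)
    /*
     * Plain byte store to the low byte of the lock word, no "lock" prefix.
     * __volatile__ plus the "memory" clobber stop GCC from moving memory
     * accesses across the unlock or dropping the statement.
     */
    __asm__ __volatile__("movb $0,%0" : "+m" (l->lock) : : "memory");
#else
    /* Assumption: outside x86 a release store is the closest equivalent. */
    __atomic_store_n(&l->lock, 0u, __ATOMIC_RELEASE);
#endif
}
```

A caller would bracket its critical section with demo_spin_lock(&l) and
demo_spin_unlock(&l).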

as an additional optimization, instead of doing 'movl $0, %0' we rather
want to use 'movb $0, %0', because that instruction is 3 bytes shorter
and is still optimized in the CPU pipeline. [Although in the future
this might be less and less the case. Anyway, right now it's very cheap.]
This is safe because only the lowest bit is of interest to us.
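
The size difference can be checked by hand. The sketch below lists the
IA-32 encodings for the three instructions, assuming a simple (%eax)
memory operand picked only for illustration (the operand GCC actually
emits may need extra displacement bytes, which affect all three forms
equally). The array names and bytes_saved() are invented for this
example.

```c
#include <stddef.h>

/* Hand-assembled IA-32 encodings; (%eax) is an example operand only. */
static const unsigned char enc_lock_btrl[] = {
    0xF0, 0x0F, 0xBA, 0x30, 0x00              /* lock; btrl $0,(%eax) */
};
static const unsigned char enc_movl[] = {
    0xC7, 0x00, 0x00, 0x00, 0x00, 0x00        /* movl $0,(%eax): 32-bit immediate */
};
static const unsigned char enc_movb[] = {
    0xC6, 0x00, 0x00                          /* movb $0,(%eax): 8-bit immediate */
};

/* Bytes saved by using the byte store instead of the 32-bit store. */
static size_t bytes_saved(void)
{
    return sizeof enc_movl - sizeof enc_movb;
}
```

The saving comes entirely from the immediate shrinking from four bytes to
one.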

it's not safe to do similar tricks in the write lock case (because having
the write bit set doesn't mean exclusive ownership of all other bits), but
simple spinlocks should be fine.
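
A rough way to see the write lock problem, under an assumed layout that
is not taken from this message: bit 31 of the lock word is the write bit
and the low bits count readers, with a reader incrementing first and
backing out if it then sees the write bit. DEMO_WRITE_BIT and replay()
are hypothetical names. The sketch replays one fixed interleaving on a
single thread, so it is a model of the race and not a concurrent test.

```c
#include <stdint.h>

#define DEMO_WRITE_BIT 0x80000000u   /* hypothetical: bit 31 marks a writer */

/*
 * Replays this interleaving and returns the final lock word:
 *   1. a writer holds the lock (write bit set, no readers)
 *   2. a reader increments the count, then notices the write bit
 *   3. the writer unlocks
 *   4. the reader backs out by decrementing the count
 */
static uint32_t replay(int unlock_with_plain_store)
{
    uint32_t word = DEMO_WRITE_BIT;      /* step 1 */

    word += 1;                           /* step 2 */

    if (unlock_with_plain_store)
        word = 0;                        /* step 3: store wipes the reader's increment */
    else
        word &= ~DEMO_WRITE_BIT;         /* step 3: clears only the write bit */

    word -= 1;                           /* step 4 */
    return word;
}
```

In this model replay(0) leaves the word at 0, while replay(1) leaves it
at 0xFFFFFFFF: the plain store threw away an increment the writer did not
own, and the reader's later decrement corrupts the lock.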

> but release_kernel_lock() for i386 could be simplified, perhaps even
> unlock_kernel()]

release_kernel_lock() is basically just a case of spin_unlock(). the 'big
kernel lock' is no longer dominant in the latest 2.3 kernels. Doing
this in spin_unlock benefits both cases.

(my 8-way SMP box appears to be just fine after this change, under heavy
load. dbench numbers are visibly up, 252MB/sec instead of 242MB/sec)

i'm really happy about this - there are tons of places that are using
spin_unlock, and this effectively cuts the cost of spinlocks in half.

-- mingo

--- linux/include/asm-i386/spinlock.h.orig2 Sun Nov 21 03:50:29 1999
+++ linux/include/asm-i386/spinlock.h Sun Nov 21 03:52:10 1999
@@ -36,7 +36,7 @@
".previous"

#define spin_unlock_string \
- "lock ; btrl $0,%0"
+ "movb $0,%0"

#define spin_lock(lock) \
__asm__ __volatile__( \
