Re: [patch] Re: spin_unlock optimization(i386)

Gerard Roudier (groudier@club-internet.fr)
Sun, 21 Nov 1999 21:13:58 +0100 (MET)


On Sun, 21 Nov 1999, Jamie Lokier wrote:

> Gerard wrote:
> > Even if the simple 'mov' may ensure other processors to have a
> > consistent view of the spinlock, it does not prevent the CPU that
> > unlocks from playing with speculative execution around the 'unlock'
> > and perform speculative reads for example. Without a minimal
> > serialization this stuff does not seem safe to me, or at least not for
> > ever.
>
> If local speculative reads are a problem, perhaps this would be faster
> than "lock; btrl $0,%0":
>
> "movl $0,%0" then rmb().

I thought about speculations that involve other variables that may be
performed before the 'mov', or in parallel with the 'mov' and results used
after the 'mov'. This makes your rmb() just miss the train.

> The rmb() does a locked operation but no cross-CPU traffic on a PPro.
> Thus local reads and writes are fully serialised, and bus traffic is
> limited to the write which we believe is read correctly by other
> processors.
>
> The rmb() can be a separate asm so that GCC can place some register
> operations in between if it's that clever (which it isn't, but it might
> be one day).
>
> OTOH, I don't see how speculative reads can be a problem. You're
> speculatively reading from /outside/ a locked region so anything goes.

I wrote "/around/ the 'mov'" and it seemed obvious to me that this
also included code that is executed /inside/ the locked region.

Anyway, this code seems broken facing trivial scenario as the following:
(FIX ME)
...
READ A that is something modified inside the locked region
movl $0,%0

Let the processor reorder operations:
...
movl $0,%0
READ A

What we may want to ensure (I do :) ), at least, is that all pending
operations be completed before we go out of the locked region.
The current IA32 architectureS guarantee writes from a CPU to be observed
in the same order as _program_ order by other CPUs. But they seem very
permissive or intend to be so about reads in cached regions for
performance considerations.

May-be the offending 'unlock' using a simple 'mov' still work on P6 since
Intel is careful at allowing broken software to still work. ;)

How much for a specific kernel config option like CONFIG_ROUDIER ? :o)

Gérard.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/