BUT I think there is another flaw in Ingo's new spinlocks (my mind was
wrapped around write ordering, and I missed the obvious problem):
Ingo wrote on 30 Nov 99:
> By the time we add (%%esi) and 4(%%esi), all other CPUs' potential
> stores must have been observed, and thus we get into the slow path if
> there was any write to any of the two words.
AFAIK, this is wrong. Our write goes into a write buffer, and the write
instruction from the other CPU goes into its write buffer. Neither
write will be visible to the other CPU during the next few (dozen)
cycles, i.e. both CPUs could get the spinlock.
------
we still need the rmb() during set_current_state():
The race is caused by two CPUs that write the same value; they race because
the timespan between "write instruction retires" and "write is visible
to all CPUs" is not fixed.
This means that the following could happen:
cpu1                 memory               cpu2
                     A=0
A=1
(into write buffer)
  |                                       A=0
  |                                       (cpu2 is a snob with a short
  |                                        write buffer ;) -- my test
  |                                        program uses a rmb() to
  |                                        trigger that)
  |                  A=0  <---------------'
  |
(cpu1 is lazy)
  |
  '--------------->  A=1
As you see, although cpu2 executed its instruction after
cpu1, the value from cpu1 'wins'.
This is the basic problem with _lock_buffer():
* cpu1 sets current->state and decides that it should sleep
  ++ instruction retires, write sits in cpu1's write buffer
* cpu2 notices that it must wake up the other thread (i.e. it
  sets current->state)
  ++ instruction retires, write sits in cpu2's write buffer
* cpu2 flushes its write buffer
* cpu1 flushes its write buffer -- and overwrites cpu2's wakeup
-> lock-up.
-----------
Manfred