Re: Add memory barrier when waiting on futex

From: Peter Zijlstra
Date: Tue Nov 26 2013 - 03:50:39 EST


On Tue, Nov 26, 2013 at 01:07:25AM +0000, Ma, Xindong wrote:
> [ 1038.694701] putmetho-11202 1...1 1035007289001: futex_wait: LEON, wait ==, addr:41300384, pid:11202
> [ 1038.694716] putmetho-11202 1...1 1035007308860: futex_wait_queue_me: LEON, q->task => 11202
> [ 1038.694731] SharedPr-11272 0...1 1035007319703: futex_wake: LEON, wake xx, addr:41300384, NULL task

> From the captured log, task 11202 runs on cpu1, waits on the futex, and sets q->task to its pid in queue_me(). Then
> task 11272 gets scheduled on cpu0 and tries to wake up 11202. But the q->task set by cpu1 is not visible to cpu0
> at first; several instructions later it becomes visible to cpu0 again. So the problem may be in the cache or in
> out-of-order instruction execution. After adding a memory barrier, the issue no longer reproduces.

So that suggests the spinlock implementation doesn't actually serialize
properly; why would you place an extra memory barrier at the site that
shows the symptom instead of trying to fix the spinlock implementation?

That's FAIL 1.

Secondly, the current x86 spinlocks are correct and work for all known
chips. This leads me to believe your chip is broken; esp. since you
haven't specified what kind of chip you're running on (and are somewhat
avoiding the issue).

That's FAIL 2.

Having this information and not enriching the initial changelog with it
to more fully explain your reasoning,

that's FAIL 3.

Peter A, can you please figure out wth these guys are doing?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/