Re: [PATCH V3] powerpc: Implement {cmp}xchg for u8 and u16

From: Pan Xinhui
Date: Tue Apr 26 2016 - 07:35:27 EST



On 2016å04æ25æ 23:37, Peter Zijlstra wrote:
> On Mon, Apr 25, 2016 at 06:10:51PM +0800, Pan Xinhui wrote:
>>> So I'm not actually _that_ familiar with the PPC LL/SC implementation;
>>> but there are things a CPU can do to optimize these loops.
>>>
>>> For example, a CPU might choose to not release the exclusive hold of the
>>> line for a number of cycles, except when it passes SC or an interrupt
>>> happens. This way there's a smaller chance the SC fails and inhibits
>>> forward progress.
>
>> I am not sure if there is such hardware optimization.
>
> So I think the hardware must do _something_, otherwise competing cores
> doing load-exlusive could life-lock a system, each one endlessly
> breaking the exclusive ownership of the other and the store-conditional
> always failing.
>
Seems there is no such optimization.

We haver observed SC fails almost all the time in a contention tests, then got stuck in the loop. :(
one thread modify val with LL/SC, and other threads just modify val without any respect to LL/SC.

So in the end, I choose to rewrite this patch in asm. :)

> Of course, there are such implementations, and they tend to have to put
> in explicit backoff loops; however, IIRC, PPC doesn't need that. (See
> ARC for an example that needs to do this.)
>