Re: [RFC][PATCH 05/12] arch: Introduce arch_{,try_}_cmpxchg128{,_local}()

From: Peter Zijlstra
Date: Tue Dec 20 2022 - 10:11:03 EST


On Tue, Dec 20, 2022 at 08:31:19AM -0600, Linus Torvalds wrote:
> On Tue, Dec 20, 2022 at 5:09 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Mon, Dec 19, 2022 at 12:07:25PM -0800, Boqun Feng wrote:
> > >
> > > I wonder whether we should use "(*(u128 *)ptr)" instead of "(*(unsigned
> > > long *) ptr)"? Because compilers may think only 64bit value pointed by
> > > "ptr" gets modified, and they are allowed to do "useful" optimization.
> >
> > In this I've copied the existing cmpxchg_double() code; I'll have to let
> > the arch folks speak here, I've no clue.
>
> It does sound like the right thing to do. I doubt it ends up making a
> difference in practice, but yes, the asm doesn't have a memory
> clobber, so the input/output types should be the right ones for the
> compiler to not possibly do something odd and cache the part that it
> doesn't see as being accessed.

Right, and x86 just uses *ptr, without even trying to cast away the
volatile.

I've pushed out a *(u128 *)ptr variant for arm64 and s390, so at least
we'll know if the compiler objects.
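
For reference, a minimal userspace sketch of the operand-width point
discussed above. This is not the actual arm64 or s390 code: struct pair
and the asm_touch_*() helpers are hypothetical names, and the asm
template is deliberately empty so that only the "+m" constraint is on
display. It assumes GCC or Clang on an LP64 target with __int128, and
kernel-style -fno-strict-aliasing for the pointer casts.

```c
#include <stdint.h>

typedef unsigned __int128 u128;

/* Hypothetical 16-byte pair, aligned as a 128-bit cmpxchg would need. */
struct pair {
	uint64_t lo, hi;
} __attribute__((aligned(16)));

/*
 * Narrow operand: only the first 8 bytes at ptr are declared as read
 * and written. With no "memory" clobber, the compiler may keep the
 * second half cached in a register across the asm.
 */
static inline void asm_touch_narrow(void *ptr)
{
	__asm__ __volatile__("" : "+m" (*(unsigned long *)ptr));
}

/*
 * Wide operand: all 16 bytes at ptr are declared as read and written,
 * which is what a 128-bit cmpxchg actually does.
 */
static inline void asm_touch_wide(void *ptr)
{
	__asm__ __volatile__("" : "+m" (*(u128 *)ptr));
}

/*
 * Store to the high half, run the asm, read the high half back. With
 * the wide operand the store has to be visible in memory before the
 * asm, and the load has to come from memory after it.
 */
static uint64_t read_hi_after_wide(struct pair *p)
{
	p->hi = 2;
	asm_touch_wide(p);
	return p->hi;
}
```

Since the template is empty, both helpers leave memory untouched at
runtime; the difference is only in what the optimizer is allowed to
assume about the second 8 bytes.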