Re: [PATCH locking/Documentation 1/2] Add note of release-acquire store vulnerability

From: Paul E. McKenney
Date: Thu Sep 29 2016 - 15:19:16 EST


On Thu, Sep 29, 2016 at 08:44:39PM +0200, Peter Zijlstra wrote:
> On Thu, Sep 29, 2016 at 11:10:15AM -0700, Paul E. McKenney wrote:
> > > >
> > > > P0(int *x, int *y)
> > > > {
> > > > WRITE_ONCE(*x, 1);
> > > > smp_wmb();
> > > > smp_store_release(y, 1);
> > > > }
> > > >
> > > > P1(int *y)
> > > > {
> > > > WRITE_ONCE(*y, 2);
> > > > }
> > > >
> > > > P2(int *x, int *y)
> > > > {
> > > > r1 = smp_load_acquire(y);
> > > > r2 = READ_ONCE(*x);
> > > > }
> > > >
> > > > Both ARM and powerpc allow the "after the dust settles" outcome (r1=2 &&
> > > > r2=0), as does the current version of the early prototype Linux-kernel
> > >
> > > And the above needs to be (r1!=2 || r2 != 0)... Sigh!
> >
> > Make that (y==2 && r1==2 && r2 == 0).
> >
> > Any further bids? ;-)
>
> Isn't that the trivial P1,P2,P0 order again?

I don't believe so. Wouldn't the final P0 leave y==1?

> How about something like so on PPC?
>
> P0(int *x, int *y)
> {
> WRITE_ONCE(*x, 1);
> smp_store_release(y, 1);
> }
>
> P1(int *x, int *y)
> {
> WRITE_ONCE(x, 2);

Need "WRITE_ONCE(*x, 2)" here.

> smp_store_release(y, 2);
> }
>
> P2(int *x, int *y)
> {
> r1 = smp_load_acquire(y);
> r2 = READ_ONCE(*x);
> }
>
> (((x==1 && y==2) | (x==2 && y==1)) && (r1==1 || r1==2) && r2==0)

That exists-clause is quite dazzling... So if P0 and P1 each win,
but on different variables, and if P2 follows one or the other of
P0 or P1, can r2 still get the initial value of x?
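
To make the three conjuncts of that condition easier to see, here is a
minimal sketch of it as a plain C predicate. The function name
exists_clause() and its local variable names are hypothetical, invented
only for this illustration; nothing here is part of herd or ppcmem.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only: evaluates the quoted
 * exists-clause for one final state (x, y) and P2's registers (r1, r2).
 */
bool exists_clause(int x, int y, int r1, int r2)
{
	/* P0 and P1 each "win" on a different variable. */
	bool split_win = (x == 1 && y == 2) || (x == 2 && y == 1);
	/* P2's acquire load saw one of the two release stores. */
	bool saw_release = (r1 == 1 || r1 == 2);
	/* P2 nevertheless read the initial value of x. */
	bool stale_x = (r2 == 0);

	return split_win && saw_release && stale_x;
}
```

For example, exists_clause(1, 2, 1, 0) is true, whereas any state with
x == y, or with r2 != 0, makes it false.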

> If you execute P0 and P1 concurrently and one store of each 'wins' the
> LWSYNC of either is null and void, and therefore P2 is unordered and can
> observe r2==0.

That vaguely resembles the infamous Z6.3, but only vaguely. The Linux-kernel
memory model says "forbidden" to this:

C C-WillDeacon-AcqRelStore.litmus

{
}

P0(int *x, int *y)
{
	WRITE_ONCE(*x, 1);
	smp_store_release(y, 1);
}

P1(int *x, int *y)
{
	WRITE_ONCE(*x, 2);
	smp_store_release(y, 2);
}

P2(int *x, int *y)
{
	r1 = smp_load_acquire(y);
	r2 = READ_ONCE(*x);
}

exists
(((x=1 /\ y=2) \/ (x=2 /\ y=1)) /\ (2:r1=1 \/ 2:r1=2) /\ 2:r2=0)
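
For anyone who wants to poke at the C litmus test above outside of herd,
it can be approximated in userspace with C11 atomics. This is only a
sketch under stated assumptions: mapping WRITE_ONCE()/READ_ONCE() to
relaxed accesses and smp_store_release()/smp_load_acquire() to C11
release/acquire is an approximation, the C11 model is not the
Linux-kernel model, and the names p0(), p1(), p2(), and
outcome_observed() are made up for this example.

```c
#include <stdatomic.h>

/* Shared locations, zero-initialized like the litmus test's empty init. */
static atomic_int x, y;

/* P0: WRITE_ONCE(*x, 1); smp_store_release(y, 1); (assumed C11 mapping) */
void p0(void)
{
	atomic_store_explicit(&x, 1, memory_order_relaxed);
	atomic_store_explicit(&y, 1, memory_order_release);
}

/* P1: WRITE_ONCE(*x, 2); smp_store_release(y, 2); (assumed C11 mapping) */
void p1(void)
{
	atomic_store_explicit(&x, 2, memory_order_relaxed);
	atomic_store_explicit(&y, 2, memory_order_release);
}

/* P2: r1 = smp_load_acquire(y); r2 = READ_ONCE(*x); (assumed C11 mapping) */
void p2(int *r1, int *r2)
{
	*r1 = atomic_load_explicit(&y, memory_order_acquire);
	*r2 = atomic_load_explicit(&x, memory_order_relaxed);
}

/* Nonzero if the final state plus P2's registers satisfy the exists clause. */
int outcome_observed(int r1, int r2)
{
	int fx = atomic_load(&x);
	int fy = atomic_load(&y);

	return ((fx == 1 && fy == 2) || (fx == 2 && fy == 1)) &&
	       (r1 == 1 || r1 == 2) && r2 == 0;
}
```

Calling p0(), p1(), and p2() one after another from a single thread only
checks the translation. Hunting for the outcome would mean running each
of them in its own thread, many times over, and calling
outcome_observed() after all three have finished.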

So let's try PPCMEM. If PPCMEM allows it, then the kernel model is
clearly broken.

PPC PeterZijlstra+o-r+o-r+a-o-SB.litmus
{
0:r1=1; 0:r2=2; 0:r3=x; 0:r4=y;
1:r1=1; 1:r2=2; 1:r3=x; 1:r4=y;
2:r3=x; 2:r4=y;
}
 P0           | P1           | P2           ;
 stw r1,0(r3) | stw r2,0(r3) | lwz r1,0(r4) ;
 lwsync       | lwsync       | lwsync       ;
 stw r1,0(r4) | stw r2,0(r4) | lwz r2,0(r3) ;
exists
(((x=1 /\ y=2) \/ (x=2 /\ y=1)) /\ (2:r1=1 \/ 2:r1=2) /\ 2:r2=0)

Now herd says that this is forbidden, but let's ask ppcmem. Futzing with
the web site seems to say "no". The lwsyncs insist on propagating the
change to x to P2 before the change to y. The full-state-space search
tool might take a while, so I will let you know if it disagrees, just in
case I missed some odd combination of ppcmem events.

But at the moment, my guess is that your dazzling condition cannot
happen on PowerPC, and that at least this aspect of the current draft
of the kernel memory model is in fact correct.

Or did I incorrectly translate your litmus test?

Thanx, Paul