Re: [PATCH v7 6/6] rcu/segcblist: Add additional comments to explain smp_mb()

From: joel
Date: Sat Oct 17 2020 - 01:49:50 EST


On Fri, Oct 16, 2020 at 09:27:53PM -0400, joel@xxxxxxxxxxxxxxxxx wrote:
[..]
> > > + *
> > > + * Memory barrier is needed after adding to length for the case
> > > + * where length transitions from 0 -> 1. This is because rcu_barrier()
> > > + * should never miss an update to the length. So the update to length
> > > + * has to be seen *before* any modifications to the segmented list. Otherwise a
> > > + * race can happen.
> > > + * P0 (what P1 sees) P1
> > > + * queue to list
> > > + * rcu_barrier sees len as 0
> > > + * set len = 1.
> > > + * rcu_barrier does nothing.
> >
> > So that would be:
> >
> > call_rcu() rcu_barrier()
> > -- --
> > WRITE(len, len + 1) l = READ(len)
> > smp_mb() if (!l)
> > queue check next CPU...
> >
> >
> > But I still don't see against what it pairs in rcu_barrier.
>
> Actually, for the second case maybe a similar reasoning can be applied
> (control dependency) but I'm unable to come up with a litmus test.
> In fact, now I'm wondering how is it possible that call_rcu() races with
> rcu_barrier(). The module should ensure that no more call_rcu() should happen
> before rcu_barrier() is called.
>
> confused

So I made a litmus test to show that smp_mb() is needed also after the update
to length. Basically, otherwise it is possible the callback will see garbage
that the module cleanup/unload did.

C rcubarrier+ctrldep

(*
* Result: Never
*
* This litmus test shows that rcu_barrier (P1) prematurely
* returning by reading len 0 can cause issues if P0 does
* NOT have a smb_mb() after WRITE_ONCE(len, 1).
* mod_data == 2 means module was unloaded (so data is garbage).
*)

{ int len = 0; int enq = 0; }

P0(int *len, int *mod_data, int *enq)
{
int r0;

WRITE_ONCE(*len, 1);
smp_mb(); /* Needed! */
WRITE_ONCE(*enq, 1);

r0 = READ_ONCE(*mod_data);
}

P1(int *len, int *mod_data, int *enq)
{
int r0;
int r1;

r1 = READ_ONCE(*enq);

// barrier Just for test purpose ("exists" clause) to force the..
// ..rcu_barrier() to see enq before len
smp_mb();
r0 = READ_ONCE(*len);

// implicit memory barrier due to conditional */
if (r0 == 0)
WRITE_ONCE(*mod_data, 2);
}

// Did P0 read garbage?
exists (0:r0=2 /\ 1:r0=0 /\ 1:r1=1)