Re: call_function_many: fix list delete vs add race

From: Peter Zijlstra
Date: Mon Jan 31 2011 - 15:39:13 EST


On Mon, 2011-01-31 at 14:26 -0600, Milton Miller wrote:
> On Mon, 31 Jan 2011 about 11:27:45 +0100, Peter Zijlstra wrote:
> > On Fri, 2011-01-28 at 18:20 -0600, Milton Miller wrote:
> > > Peter pointed out there was nothing preventing the list_del_rcu in
> > > smp_call_function_interrupt from running before the list_add_rcu in
> > > smp_call_function_many. Fix this by not setting refs until we have put
> > > the entry on the list. We can use the lock acquire and release instead
> > > of a wmb.
> > >
> > > Reported-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > > Signed-off-by: Milton Miller <miltonm@xxxxxxx>
> > > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> > > ---
> > >
> > > I tried to force this race with a udelay before the lock & list_add and
> > > by mixing all 64 online cpus with just 3 random cpus in the mask, but
> > > was unsuccessful. Still, it seems to be a valid race, and the fix
> > > is a simple change to the current code.
> >
> > Yes, I think this will fix it. I think simply putting that assignment
> > under the lock is sufficient, because then the list removal will
> > serialize against the list add. But placing it after the list add does
> > also seem sufficient.
> >
> > Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> >
>
> I was worried some architectures would allow a write before the spinlock
> to drop into the spinlock region,

That is indeed allowed to happen.

> in which case the data or function
> pointer could be found stale with the cpu mask bit set.

But that is OK, right? The atomic_read(->refs) test will fail and we'll
continue.

> The unlock
> must flush all prior writes and

and reads

> therefore the new function and data
> will be seen before refs is set.


Which again should be just fine, given the interrupt does:

if (!cpumask_test_cpu(cpu, data->cpumask))
	continue;

smp_rmb();

if (!atomic_read(&data->refs))
	continue;

and thus we'll be on our merry way. If however we do observe the new
->refs value, the sending side has already taken the lock, and the
spinlock acquired before the list_del_rcu() will serialize against it,
so the list_add_rcu() will always have completed before we execute the
del.
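
Concretely, what I picture on the sending side with Milton's change is
something like the below (just a sketch; the lock, queue and field names
are the ones I recall from kernel/smp.c, so take them with a grain of
salt):

	data->csd.func = func;
	data->csd.info = info;
	cpumask_and(data->cpumask, mask, cpu_online_mask);
	cpumask_clear_cpu(this_cpu, data->cpumask);

	raw_spin_lock_irqsave(&call_function.lock, flags);
	list_add_rcu(&data->csd.list, &call_function.queue);
	/*
	 * refs is stored only after the entry is on the list; the lock
	 * acquire/release provides the ordering instead of a wmb.
	 */
	atomic_set(&data->refs, cpumask_weight(data->cpumask));
	raw_spin_unlock_irqrestore(&call_function.lock, flags);

So an interrupt that sees the new refs value must also see the entry on
the list, and the lock it takes before the list_del_rcu() orders the del
after the add.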

Or am I not quite understanding things?
