Re: call_function_many: fix list delete vs add race

From: Milton Miller
Date: Mon Jan 31 2011 - 16:36:34 EST


On Mon, 31 Jan 2011 about 22:17:57 +0100, Peter Zijlstra wrote:
>
> On Fri, 2011-01-28 at 18:20 -0600, Milton Miller wrote:
> > @@ -491,15 +491,17 @@ void smp_call_function_many(const struct
> > cpumask_clear_cpu(this_cpu, data->cpumask);
> >
> > /*
> > - * To ensure the interrupt handler gets an complete view
> > - * we order the cpumask and refs writes and order the read
> > - * of them in the interrupt handler. In addition we may
> > - * only clear our own cpu bit from the mask.
> > + * We reuse the call function data without waiting for any grace
> > + * period after some other cpu removes it from the global queue.
> > + * This means a cpu might find our data block as it is writen.
> > + * The interrupt handler waits until it sees refs filled out
> > + * while its cpu mask bit is set; here we may only clear our
> > + * own cpu mask bit, and must wait to set refs until we are sure
> > + * previous writes are complete and we have obtained the lock to
> > + * add the element to the queue. We use the acquire and release
> > + * of the lock as a wmb() -- acquire prevents write moving up and
> > + * release requires old writes are visible.
>
> That's wrong:
>
> ->foo =
> LOCK
> UNLOCK
> ->bar =
>
> can be re-ordered as:
>
> LOCK
> ->bar =
> ->foo =
> UNLOCK
>
> Only a UNLOCK+LOCK sequence can be considered an MB.

I knew I was copying people for a reason.

Thats a problem because there is nothing preventing the set of func and
data from being foo, which if it went after bar (set refs) would mean
that we executed the wrong function and/or call data.

>
> However, I think the code is still OK, because list_add_rcu() implies a
> wmb(), so in that respect its an improvement since we fix a race and
> avoid an extra wmb. But the comment needs an update.
>
> > */
> > - smp_wmb();
> > -
> > - atomic_set(&data->refs, cpumask_weight(data->cpumask));
> > -
> > raw_spin_lock_irqsave(&call_function.lock, flags);
> > /*
> > * Place entry at the _HEAD_ of the list, so that any cpu still
> > @@ -509,6 +511,8 @@ void smp_call_function_many(const struct
> > list_add_rcu(&data->csd.list, &call_function.queue);
> > raw_spin_unlock_irqrestore(&call_function.lock, flags);
>
> And this wants to grow a comment that it relies on the wmb implied by
> list_add_rcu()
>
> > + atomic_set(&data->refs, cpumask_weight(data->cpumask));
> > +

yea, I'll make an updated patch, and move the set into the lock.

but after some food (and meetings).

milton
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/