Re: KCSAN: data-race in find_next_bit / rcu_report_exp_cpu_mult

From: Steven Rostedt
Date: Mon Oct 07 2019 - 09:34:59 EST


On Mon, 7 Oct 2019 12:04:16 +0200
Marco Elver <elver@xxxxxxxxxx> wrote:

> +RCU maintainers
> This might be a data-race in RCU itself.
>
> >
> > write to 0xffffffff85a7f140 of 8 bytes by task 7 on cpu 0:
> > rcu_report_exp_cpu_mult+0x4f/0xa0 kernel/rcu/tree_exp.h:244

Here we have:

	raw_spin_lock_irqsave_rcu_node(rnp, flags);
	if (!(rnp->expmask & mask)) {
		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
		return;
	}
	rnp->expmask &= ~mask;
	__rcu_report_exp_rnp(rnp, wake, flags); /* Releases rnp->lock. */

> >
> > read to 0xffffffff85a7f140 of 8 bytes by task 7251 on cpu 1:
> > _find_next_bit lib/find_bit.c:39 [inline]
> > find_next_bit+0x57/0xe0 lib/find_bit.c:70
> > sync_rcu_exp_select_node_cpus+0x28e/0x510 kernel/rcu/tree_exp.h:375

and here we have:


	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);

	/* IPI the remaining CPUs for expedited quiescent state. */
	for_each_leaf_node_cpu_mask(rnp, cpu, rnp->expmask) {


The write to rnp->expmask is done under rnp->lock, but on the read
side that lock is released before the for loop, so the loop reads
rnp->expmask without holding it. Should we have something like:

	unsigned long expmask;
	[...]

	expmask = rnp->expmask;
	raw_spin_unlock_irqrestore_rcu_node(rnp, flags);

	/* IPI the remaining CPUs for expedited quiescent state. */
	for_each_leaf_node_cpu_mask(rnp, cpu, expmask) {

?
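
To make the idea concrete outside the kernel, here is a rough userspace
sketch of the same snapshot-then-iterate pattern. It is an illustration
only, not a patch: struct demo_node and the demo_* helpers are made-up
names, a pthread mutex stands in for rnp->lock, and a plain bit loop
stands in for for_each_leaf_node_cpu_mask().

```c
#include <pthread.h>

/* Hypothetical stand-in for an rcu_node: a lock plus a CPU bitmask. */
struct demo_node {
	pthread_mutex_t lock;
	unsigned long expmask;
};

/* Writer side: clear bits only while holding the lock. */
static void demo_clear_bits(struct demo_node *np, unsigned long mask)
{
	pthread_mutex_lock(&np->lock);
	np->expmask &= ~mask;
	pthread_mutex_unlock(&np->lock);
}

/*
 * Reader side: copy the mask while still holding the lock, then walk
 * the private copy after dropping it.  Stores the bits it walked in
 * *visited and returns how many there were.
 */
static int demo_visit_snapshot(struct demo_node *np, unsigned long *visited)
{
	unsigned long snap;
	int bit, count = 0;

	pthread_mutex_lock(&np->lock);
	snap = np->expmask;		/* snapshot taken under the lock */
	pthread_mutex_unlock(&np->lock);

	*visited = 0;
	for (bit = 0; bit < (int)(8 * sizeof(snap)); bit++) {
		if (snap & (1UL << bit)) {
			*visited |= 1UL << bit;
			count++;
		}
	}
	return count;
}
```

The loop only ever looks at the local copy, so a concurrent
demo_clear_bits() cannot race with it. The cost is that the copy may be
stale by the time a given bit is visited, so whatever the loop body does
has to cope with a bit that has since been cleared.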

-- Steve