Re: [PATCH 2/2] rcu: Remove needless preemption disablement in rcu_all_qs()

From: Boqun Feng
Date: Tue Jul 06 2021 - 09:30:01 EST


On Tue, Jul 06, 2021 at 02:30:58PM +0200, Frederic Weisbecker wrote:
> On Tue, Jul 06, 2021 at 09:51:01AM +0200, Peter Zijlstra wrote:
> > On Tue, Jul 06, 2021 at 01:43:44AM +0200, Frederic Weisbecker wrote:
> > > The preemption is already disabled when we write rcu_data.rcu_urgent_qs.
> > > We can use __this_cpu_write() directly, although that path is mostly
> > > used when CONFIG_PREEMPT=n.
> > >
> > > Signed-off-by: Frederic Weisbecker <frederic@xxxxxxxxxx>
> > > Cc: Neeraj Upadhyay <neeraju@xxxxxxxxxxxxxx>
> > > Cc: Joel Fernandes <joel@xxxxxxxxxxxxxxxxx>
> > > Cc: Uladzislau Rezki <urezki@xxxxxxxxx>
> > > Cc: Boqun Feng <boqun.feng@xxxxxxxxx>
> > > ---
> > > kernel/rcu/tree_plugin.h | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> > > index 27b74352cccf..38b3d01424d7 100644
> > > --- a/kernel/rcu/tree_plugin.h
> > > +++ b/kernel/rcu/tree_plugin.h
> > > @@ -871,7 +871,7 @@ void rcu_all_qs(void)
> > >  		preempt_enable();
> > >  		return;
> > >  	}
> > > -	this_cpu_write(rcu_data.rcu_urgent_qs, false);
> > > +	__this_cpu_write(rcu_data.rcu_urgent_qs, false);
> >
> > There's another subtle difference between this_cpu_write() and
> > __this_cpu_write() aside from preempt. this_cpu_write() is also
> > IRQ-safe, while __this_cpu_write() is not.
> >
> > I've not looked at the usage here to see if that is relevant, but the
> > Changelog only mentioned the preempt side of things, and that argument
> > is incomplete in general.
>
> You're right, I missed that. I see this rcu_urgent_qs is set by
> RCU TASKS from rcu_tasks_wait_gp() (did I miss another path?).
> Not sure whether this is called from IRQ, or whether it actually matters
> to protect against IRQs for that single write.

I think __this_cpu_write() being IRQ-unsafe means it may overwrite
percpu writes to other bytes in the same word? Say rcu_urgent_qs is the
lowest byte in the word; the pseudo asm code of __this_cpu_write() may
then be:

    __this_cpu_write(ptr, v):
        long tmp = *ptr;
        tmp &= ~(0xff);
        tmp |= v;
        *ptr = tmp;

and the following sequence introduces an overwrite:

    __this_cpu_write(ptr, v):       // v is 0, and *ptr is 1
        long tmp = *ptr;            // tmp is 1
        <interrupted>
            this_cpu_write()        // modify another byte of *ptr,
                                    // make it 0xff01
        <ret from interrupt>
        tmp &= ~(0xff);             // tmp is 0
        tmp |= v;                   // tmp is 0
        *ptr = tmp;                 // *ptr is 0, overwriting a percpu
                                    // write to another field.

I know that many archs have byte-wise stores, so compilers have no real
reason to generate code like the above, but __this_cpu_write() is just a
normal write and nothing prevents this from happening, unless I'm
missing something here?
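
To make the interleaving concrete, here is a minimal userspace C sketch
of the same sequence. Everything in it is hypothetical: struct
pcpu_flags, fake_irq() and the two *_clear() helpers are names made up
for illustration, not kernel code, and the "interrupt" is just a
function call placed between the load and the store. It only models what
a word-wide read-modify-write would do if a compiler emitted one; it
says nothing about what any compiler actually generates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: two byte-sized per-CPU flags sharing one word. */
struct pcpu_flags {
	uint8_t urgent_qs;	/* stands in for rcu_data.rcu_urgent_qs */
	uint8_t other;		/* some neighbouring per-CPU byte */
};

/* Stand-in for an interrupt handler that writes the neighbouring byte. */
void fake_irq(struct pcpu_flags *live)
{
	live->other = 0xff;
}

/*
 * Word-wide read-modify-write, with the "interrupt" landing between
 * the load and the store. Returns the final state of the word.
 */
struct pcpu_flags wide_rmw_clear(struct pcpu_flags live)
{
	struct pcpu_flags tmp = live;	/* load the whole word */

	fake_irq(&live);		/* <interrupted> */
	tmp.urgent_qs = 0;		/* modify the stale copy */
	live = tmp;			/* store the whole word back */
	return live;
}

/* Byte-wide store: only urgent_qs is touched, so the IRQ write survives. */
struct pcpu_flags byte_store_clear(struct pcpu_flags live)
{
	fake_irq(&live);		/* <interrupted> */
	live.urgent_qs = 0;		/* single-byte store */
	return live;
}

/* Self-check of both outcomes, starting from urgent_qs = 1, other = 0. */
void check_demo(void)
{
	struct pcpu_flags start = { .urgent_qs = 1, .other = 0 };

	assert(wide_rmw_clear(start).other == 0);	/* IRQ write lost */
	assert(byte_store_clear(start).other == 0xff);	/* IRQ write kept */
}
```

In the wide case the handler's write to "other" is lost, while the
byte-wide store keeps it, which is the difference I'm worried about.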

Regards,
Boqun

>
> I'm not quite used to rcu_tasks. Paul?