Re: [PATCH] rcutorture: Fix rcu_torture_pipe_update_one()/rcu_torture_writer() data race and concurrency bug

From: Paul E. McKenney
Date: Thu Mar 07 2024 - 15:57:32 EST


On Thu, Mar 07, 2024 at 12:00:44PM -0800, Linus Torvalds wrote:
> On Thu, 7 Mar 2024 at 11:47, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
> >
> > > - The per-thread counter (Thread-Local Storage) incremented by a single
> > > thread, read by various threads concurrently, is a good target
> > > for WRITE_ONCE()/READ_ONCE() pairing. This is actually what we do in
> > > various liburcu implementations which track read-side critical sections
> > > per-thread.
> >
> > Agreed, but do any of these use WRITE_ONCE(x, READ_ONCE(x) + 1) or
> > similar?
>
> Absolutely not.
>
> The READ_ONCE->WRITE_ONCE pattern is almost certainly a bug.
>
> The valid reason to have a WRITE_ONCE() is that there's a _later_
> READ_ONCE() on another CPU.
>
> So WRITE_ONCE->READ_ONCE (across threads) is very valid. But
> READ_ONCE->WRITE_ONCE (inside a thread) simply is not a valid
> operation.
>
> We do have things like "local_t", which allows for non-smp-safe local
> thread atomic accesses, but they explicitly are *NOT* about some kind
> of READ_ONCE -> WRITE_ONCE sequence that by definition cannot be
> atomic unless you disable interrupts and disable preemption (at which
> point they become pointless and only generate worse code).
>
> But the point of "local_t" is that you can do things that aresafe if
> there is no SMP issues. They are kind of an extension of the
> percpu_add() kind of operations.
>
> In fact, I think it might be interesting to catch those
> READ_ONCE->WRITE_ONCE chains (perhaps with coccinelle?) because they
> are a sign of bugs.

Good point, adding Julia Lawall on CC.

A really stupid version of this pattern (WRITE_ONCE.*READ_ONCE) found
four more in RCU, so I will take a look at those and either fix or add
comments as appropriate.

> Now, there's certainly some possibility of "I really don't care about
> some stats, I'm willing to do non-smp-safe and non-thread safe
> operations if they are faster". So I'm not saying a
> READ_ONCE->WRITE_ONCE data dependency is _always_ a bug, but I do
> think it's a pattern that is very very close to being one.

I agree that valid use cases should be quite rare.

Thanx, Paul