Re: RCU vs data_race()

From: Paul E. McKenney
Date: Fri Jun 18 2021 - 16:48:02 EST


On Fri, Jun 18, 2021 at 01:26:10PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 18, 2021 at 10:59:26AM +0200, Marco Elver wrote:
> > On Fri, Jun 18, 2021 at 10:24AM +0200, Peter Zijlstra wrote:
> > > Hi Paul,
> > >
> > > Due to a merge conflict I had to look at some recent RCU code, and I saw
> > > you went a little overboard with data_race(). How's something like the
> > > below look to you?
> >
> > I commented below. The main thing is just using the __no_kcsan function
> > attribute if it's only about accesses within the function (and not
> > also about called functions elsewhere).
> >
> > Using the attribute also improves performance slightly (not that it
> > matters much in a KCSAN-enabled kernel) due to no instrumentation.
>
> Aah yes ofcourse! Much better still.
>
> > > The idea being that we fundamentally don't care about data races for
> > > debug/error condition prints, so marking every single variable access is
> > > just clutter.
> >
> > Having data_race() around the pr_* helpers seems reasonable, if you
> > worry about future unnecessary markings that might pop up due to them.
>
> Right, so I did them on WARN and higher, figuring that those really only
> happen when there's smoething wrong and we're way past caring about
> data races. Paul has a few pr_info() users that are heavy on
> data_race(), but your __no_kcsan attribute suggestion helps with that.

But there can be cases where some mutex is supposed to be held across
updates to one of the fields to be printed, and that mutex is held across
the pr_*(). In that case, we -want- KCSAN to yell if there is a data
race involving that field.

So I am not at all a fan of this change.

But a similar technique might help elsewhere. RCU uses strict
KCSAN settings (something about me not wanting minor code-generation
performance issues turnign into full-fledged RCU concurrency bugs),
but invokes code that uses looser settings. This means that RCU gets
"false-positive" KCSAN complaints on racing calls to (for example)
schedule_timeout_interruptible().

My thought is to create a rcu_schedule_timeout_interruptible(), for one
example, that suppresses KCSAN warnings under the assumption that they
will be caught by KCSAN runs on other parts of the kernel. Among other
things, this would also allow them to be easily adjusted as appropriate.

Thoughts?

Thanx, Paul