Re: [PATCH RFC v2 rcu 3/8] srcu: Check for consistent per-CPU per-srcu_struct NMI safety

From: Frederic Weisbecker
Date: Mon Oct 03 2022 - 08:37:36 EST


On Mon, Oct 03, 2022 at 04:57:18AM -0700, Paul E. McKenney wrote:
> On Mon, Oct 03, 2022 at 12:13:31PM +0200, Frederic Weisbecker wrote:
> > On Sun, Oct 02, 2022 at 04:51:03PM -0700, Paul E. McKenney wrote:
> > > On Mon, Oct 03, 2022 at 12:06:19AM +0200, Frederic Weisbecker wrote:
> > > > On Thu, Sep 29, 2022 at 11:07:26AM -0700, Paul E. McKenney wrote:
> > > > > This commit adds runtime checks to verify that a given srcu_struct uses
> > > > > consistent NMI-safe (or not) read-side primitives on a per-CPU basis.
> > > > >
> > > > > Link: https://lore.kernel.org/all/20220910221947.171557773@xxxxxxxxxxxxx/
> > > > >
> > > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > > > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > > > Cc: John Ogness <john.ogness@xxxxxxxxxxxxx>
> > > > > Cc: Petr Mladek <pmladek@xxxxxxxx>
> > > > > ---
> > > > > include/linux/srcu.h | 4 ++--
> > > > > include/linux/srcutiny.h | 4 ++--
> > > > > include/linux/srcutree.h | 9 +++++++--
> > > > > kernel/rcu/srcutree.c | 38 ++++++++++++++++++++++++++++++++------
> > > > > 4 files changed, 43 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > > > > index 2cc8321c0c86..565f60d57484 100644
> > > > > --- a/include/linux/srcu.h
> > > > > +++ b/include/linux/srcu.h
> > > > > @@ -180,7 +180,7 @@ static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp
> > > > > int retval;
> > > > >
> > > > > if (IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
> > > > > - retval = __srcu_read_lock_nmisafe(ssp);
> > > > > + retval = __srcu_read_lock_nmisafe(ssp, true);
> > > > > else
> > > > > retval = __srcu_read_lock(ssp);
> > > >
> > > > Shouldn't it be checked also when CONFIG_NEED_SRCU_NMI_SAFE=n ?
> > >
> > > You are asking why there is no "true" argument to __srcu_read_lock()?
> > > That is because it checks unconditionally.
> >
> > It checks unconditionally but it always assumes not to be called as nmisafe.
> >
> > For example on x86/arm64/loongarch, the same ssp used with both srcu_read_lock() and
> > srcu_read_lock_nmisafe() won't report an issue. But on powerpc it will.
> >
> > My point is that strong archs should warn as well on behalf of others, to detect
> > mistakes early.
>
> Good point, especially given that x86_64 and arm64 are a rather large
> fraction of the uses. Not critically urgent, but definitely nice to have.

Indeed, it's not urgent.

>
> Did you by chance have a suggestion for a nice way to accomplish this?

It could look something like this:

enum srcu_nmi_flags {
	SRCU_NMI_UNKNOWN = 0x0,
	SRCU_NMI_UNSAFE = 0x1,
	SRCU_NMI_SAFE = 0x2
};

#ifdef CONFIG_NEED_SRCU_NMI_SAFE
static inline int __srcu_read_lock_nmisafe(struct srcu_struct *ssp, enum srcu_nmi_flags flags)
{
	int idx;
	struct srcu_data *sdp = raw_cpu_ptr(ssp->sda);

	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
	atomic_long_inc(&sdp->srcu_lock_count[idx]);
	smp_mb__after_atomic(); /* B */ /* Avoid leaking the critical section. */

	srcu_check_nmi_safety(ssp, flags);

	return idx;
}
#else
static inline int __srcu_read_lock_nmisafe(struct srcu_struct *ssp, enum srcu_nmi_flags flags)
{
	srcu_check_nmi_safety(ssp, flags);
	return __srcu_read_lock(ssp);
}
#endif

static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp)
{
	return __srcu_read_lock_nmisafe(ssp, SRCU_NMI_SAFE);
}

// An __srcu_read_lock() caller in kernel/rcu/tasks.h must be
// taken care of as well
static inline int srcu_read_lock(struct srcu_struct *ssp)
{
	srcu_check_nmi_safety(ssp, SRCU_NMI_UNSAFE);
	return __srcu_read_lock(ssp);
}

You can then call __srcu_read_lock_nmisafe(ssp, SRCU_NMI_UNKNOWN) from the
grace-period initialization code.
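
To make the intended semantics of the three flag values concrete, here is a
minimal userspace sketch of what the check might do. This is only a model under
my own assumptions, not the patch's srcu_check_nmi_safety(): struct model_srcu,
MODEL_NR_CPUS, the explicit cpu argument and the bool return value are all
hypothetical, and a real version would use per-CPU srcu_data plus WARN_ONCE()
rather than fprintf(). It also assumes that SRCU_NMI_UNKNOWN means "do not
record or check anything".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Flag values as proposed above. */
enum srcu_nmi_flags {
	SRCU_NMI_UNKNOWN = 0x0,
	SRCU_NMI_UNSAFE = 0x1,
	SRCU_NMI_SAFE = 0x2
};

#define MODEL_NR_CPUS 4	/* hypothetical, stands in for the real CPU count */

/* Hypothetical stand-in for srcu_struct with its per-CPU state. */
struct model_srcu {
	enum srcu_nmi_flags seen[MODEL_NR_CPUS];	/* first flavor seen per CPU */
};

/*
 * Record the first reader flavor seen on @cpu and complain if a later
 * reader on the same CPU uses the other flavor.  Returns true when the
 * usage is consistent, false on a mismatch.
 */
static bool model_check_nmi_safety(struct model_srcu *ssp, int cpu,
				   enum srcu_nmi_flags flags)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	/* Assumed meaning of UNKNOWN: internal caller, nothing to check. */
	if (flags == SRCU_NMI_UNKNOWN)
		return true;

	/* First flavored reader on this CPU: just remember the flavor. */
	if (ssp->seen[cpu] == SRCU_NMI_UNKNOWN) {
		ssp->seen[cpu] = flags;
		return true;
	}

	if (ssp->seen[cpu] != flags) {
		fprintf(stderr, "CPU %d: mixed NMI-safe/unsafe readers (%d vs %d)\n",
			cpu, (int)ssp->seen[cpu], (int)flags);
		return false;
	}
	return true;
}
```

With this model, a zero-initialized struct model_srcu accepts any number of
SRCU_NMI_UNKNOWN calls, latches the first SAFE or UNSAFE flavor on each CPU, and
reports a mismatch regardless of CONFIG_NEED_SRCU_NMI_SAFE, which is the point
of doing the check on strong archs too.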