Re: [PATCH RFC v2 rcu 3/8] srcu: Check for consistent per-CPU per-srcu_struct NMI safety

From: Paul E. McKenney
Date: Mon Oct 03 2022 - 09:32:18 EST


On Mon, Oct 03, 2022 at 02:37:21PM +0200, Frederic Weisbecker wrote:
> On Mon, Oct 03, 2022 at 04:57:18AM -0700, Paul E. McKenney wrote:
> > On Mon, Oct 03, 2022 at 12:13:31PM +0200, Frederic Weisbecker wrote:
> > > On Sun, Oct 02, 2022 at 04:51:03PM -0700, Paul E. McKenney wrote:
> > > > On Mon, Oct 03, 2022 at 12:06:19AM +0200, Frederic Weisbecker wrote:
> > > > > On Thu, Sep 29, 2022 at 11:07:26AM -0700, Paul E. McKenney wrote:
> > > > > > This commit adds runtime checks to verify that a given srcu_struct uses
> > > > > > consistent NMI-safe (or not) read-side primitives on a per-CPU basis.
> > > > > >
> > > > > > Link: https://lore.kernel.org/all/20220910221947.171557773@xxxxxxxxxxxxx/
> > > > > >
> > > > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > > > > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > > > > Cc: John Ogness <john.ogness@xxxxxxxxxxxxx>
> > > > > > Cc: Petr Mladek <pmladek@xxxxxxxx>
> > > > > > ---
> > > > > > include/linux/srcu.h | 4 ++--
> > > > > > include/linux/srcutiny.h | 4 ++--
> > > > > > include/linux/srcutree.h | 9 +++++++--
> > > > > > kernel/rcu/srcutree.c | 38 ++++++++++++++++++++++++++++++++------
> > > > > > 4 files changed, 43 insertions(+), 12 deletions(-)
> > > > > >
> > > > > > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > > > > > index 2cc8321c0c86..565f60d57484 100644
> > > > > > --- a/include/linux/srcu.h
> > > > > > +++ b/include/linux/srcu.h
> > > > > > @@ -180,7 +180,7 @@ static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp
> > > > > > >  	int retval;
> > > > > > >  
> > > > > > >  	if (IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
> > > > > > > -		retval = __srcu_read_lock_nmisafe(ssp);
> > > > > > > +		retval = __srcu_read_lock_nmisafe(ssp, true);
> > > > > > >  	else
> > > > > > >  		retval = __srcu_read_lock(ssp);
> > > > >
> > > > > Shouldn't it be checked also when CONFIG_NEED_SRCU_NMI_SAFE=n ?
> > > >
> > > > You are asking why there is no "true" argument to __srcu_read_lock()?
> > > > That is because it checks unconditionally.
> > >
> > > It checks unconditionally but it always assumes not to be called as nmisafe.
> > >
> > > For example on x86/arm64/loongarch, the same ssp used with both srcu_read_lock() and
> > > srcu_read_lock_nmisafe() won't report an issue. But on powerpc it will.
> > >
> > > My point is that strong archs should warn as well on behalf of others, to detect
> > > mistakes early.
> >
> > Good point, especially given that x86_64 and arm64 are a rather large
> > fraction of the uses. Not critically urgent, but definitely nice to have.
>
> No indeed.
>
> >
> > Did you by chance have a suggestion for a nice way to accomplish this?
>
> This could be like this:
>
> enum srcu_nmi_flags {
> 	SRCU_NMI_UNKNOWN = 0x0,
> 	SRCU_NMI_UNSAFE = 0x1,
> 	SRCU_NMI_SAFE = 0x2
> };
>
> #ifdef CONFIG_NEED_SRCU_NMI_SAFE
> static inline int __srcu_read_lock_nmisafe(struct srcu_struct *ssp, enum srcu_nmi_flags flags)
> {
> 	int idx;
> 	struct srcu_data *sdp = raw_cpu_ptr(ssp->sda);
>
> 	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
> 	atomic_long_inc(&sdp->srcu_lock_count[idx]);
> 	smp_mb__after_atomic(); /* B */ /* Avoid leaking the critical section. */
>
> 	srcu_check_nmi_safety(ssp, flags);
>
> 	return idx;
> }
> #else
> static inline int __srcu_read_lock_nmisafe(struct srcu_struct *ssp, enum srcu_nmi_flags flags)
> {
> 	srcu_check_nmi_safety(ssp, flags);
> 	return __srcu_read_lock(ssp);
> }
> #endif
>
> static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp)
> {
> 	return __srcu_read_lock_nmisafe(ssp, SRCU_NMI_SAFE);
> }
>
> // An __srcu_read_lock() caller in kernel/rcu/tasks.h must be
> // taken care of as well
> static inline int srcu_read_lock(struct srcu_struct *ssp)
> {
> 	srcu_check_nmi_safety(ssp, SRCU_NMI_UNSAFE);
> 	return __srcu_read_lock(ssp);
> }
>
> And then you can call __srcu_read_lock_nmisafe(ssp, SRCU_NMI_UNKNOWN) from
> initializers of gp.

Not bad at all!

Would you like to send a patch?

I do not consider this to be something for the current merge window even
if the rest goes in because printk() is used heavily and because it is
easy to get access to powerpc and presumably also riscv systems.

But as you say, it would be very good to have longer term for the case
where srcu_read_lock_nmisafe() is used for some more obscure purpose.

Thanx, Paul
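
Illustrative sketch (not part of the original message): the checking idea
discussed above can be modeled in plain userspace C. The flag values are
taken from the quoted suggestion. Everything else here is an assumption
made for illustration: struct model_srcu and model_check_nmi_safety() are
hypothetical names, the state is a single field rather than per-CPU data,
and a mismatch is counted instead of triggering a kernel warning. This is
not the kernel's srcu_check_nmi_safety() implementation. What it tries to
show is that the check records the first known reader flavor and flags any
later use of the other flavor, independent of architecture.

```c
/* Flag values as proposed in the quoted suggestion. */
enum srcu_nmi_flags {
	SRCU_NMI_UNKNOWN = 0x0,
	SRCU_NMI_UNSAFE  = 0x1,
	SRCU_NMI_SAFE    = 0x2
};

/*
 * Hypothetical stand-in for struct srcu_struct. It keeps only the reader
 * flavor seen so far and a mismatch counter (assumption: one global
 * record instead of the per-CPU state used in the patch).
 */
struct model_srcu {
	enum srcu_nmi_flags seen;
	int mismatches;
};

/*
 * Hypothetical model of the consistency check. Returns 1 when the caller's
 * flavor conflicts with the recorded one, 0 otherwise. Callers passing
 * SRCU_NMI_UNKNOWN neither record nor check anything.
 */
static int model_check_nmi_safety(struct model_srcu *ssp,
				  enum srcu_nmi_flags flags)
{
	if (flags == SRCU_NMI_UNKNOWN)
		return 0;

	if (ssp->seen == SRCU_NMI_UNKNOWN) {
		/* First reader with a known flavor: remember it. */
		ssp->seen = flags;
		return 0;
	}

	if (ssp->seen != flags) {
		/* Mixed srcu_read_lock() and srcu_read_lock_nmisafe() use. */
		ssp->mismatches++;
		return 1;
	}

	return 0;
}
```

In this model, a sequence of SRCU_NMI_UNSAFE readers followed by one
SRCU_NMI_SAFE reader on the same structure reports a mismatch on any
architecture, which is the early-detection behavior asked for above.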