Re: [PATCH RFC v2 rcu 3/8] srcu: Check for consistent per-CPU per-srcu_struct NMI safety

From: Frederic Weisbecker
Date: Mon Oct 03 2022 - 09:36:17 EST


On Mon, Oct 03, 2022 at 06:32:10AM -0700, Paul E. McKenney wrote:
> On Mon, Oct 03, 2022 at 02:37:21PM +0200, Frederic Weisbecker wrote:
> > On Mon, Oct 03, 2022 at 04:57:18AM -0700, Paul E. McKenney wrote:
> > > On Mon, Oct 03, 2022 at 12:13:31PM +0200, Frederic Weisbecker wrote:
> > > > On Sun, Oct 02, 2022 at 04:51:03PM -0700, Paul E. McKenney wrote:
> > > > > On Mon, Oct 03, 2022 at 12:06:19AM +0200, Frederic Weisbecker wrote:
> > > > > > On Thu, Sep 29, 2022 at 11:07:26AM -0700, Paul E. McKenney wrote:
> > > > > > > This commit adds runtime checks to verify that a given srcu_struct uses
> > > > > > > consistent NMI-safe (or not) read-side primitives on a per-CPU basis.
> > > > > > >
> > > > > > > Link: https://lore.kernel.org/all/20220910221947.171557773@xxxxxxxxxxxxx/
> > > > > > >
> > > > > > > Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
> > > > > > > Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> > > > > > > Cc: John Ogness <john.ogness@xxxxxxxxxxxxx>
> > > > > > > Cc: Petr Mladek <pmladek@xxxxxxxx>
> > > > > > > ---
> > > > > > > include/linux/srcu.h | 4 ++--
> > > > > > > include/linux/srcutiny.h | 4 ++--
> > > > > > > include/linux/srcutree.h | 9 +++++++--
> > > > > > > kernel/rcu/srcutree.c | 38 ++++++++++++++++++++++++++++++++------
> > > > > > > 4 files changed, 43 insertions(+), 12 deletions(-)
> > > > > > >
> > > > > > > diff --git a/include/linux/srcu.h b/include/linux/srcu.h
> > > > > > > index 2cc8321c0c86..565f60d57484 100644
> > > > > > > --- a/include/linux/srcu.h
> > > > > > > +++ b/include/linux/srcu.h
> > > > > > > @@ -180,7 +180,7 @@ static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp) __acquires(ssp
> > > > > > > >  	int retval;
> > > > > > > >
> > > > > > > >  	if (IS_ENABLED(CONFIG_NEED_SRCU_NMI_SAFE))
> > > > > > > > -		retval = __srcu_read_lock_nmisafe(ssp);
> > > > > > > > +		retval = __srcu_read_lock_nmisafe(ssp, true);
> > > > > > > >  	else
> > > > > > > >  		retval = __srcu_read_lock(ssp);
> > > > > >
> > > > > > > Shouldn't this also be checked when CONFIG_NEED_SRCU_NMI_SAFE=n?
> > > > >
> > > > > You are asking why there is no "true" argument to __srcu_read_lock()?
> > > > > That is because it checks unconditionally.
> > > >
> > > > > It checks unconditionally, but it always assumes that it is not being called as nmisafe.
> > > >
> > > > For example on x86/arm64/loongarch, the same ssp used with both srcu_read_lock() and
> > > > srcu_read_lock_nmisafe() won't report an issue. But on powerpc it will.
> > > >
> > > > My point is that strong archs should warn as well on behalf of others, to detect
> > > > mistakes early.
> > >
> > > Good point, especially given that x86_64 and arm64 are a rather large
> > > fraction of the uses. Not critically urgent, but definitely nice to have.
> >
> > No indeed.
> >
> > >
> > > Did you by chance have a suggestion for a nice way to accomplish this?
> >
> > It could look something like this:
> >
> > enum srcu_nmi_flags {
> > 	SRCU_NMI_UNKNOWN = 0x0,
> > 	SRCU_NMI_UNSAFE = 0x1,
> > 	SRCU_NMI_SAFE = 0x2
> > };
> >
> > #ifdef CONFIG_NEED_SRCU_NMI_SAFE
> > static inline int __srcu_read_lock_nmisafe(struct srcu_struct *ssp, enum srcu_nmi_flags flags)
> > {
> > 	int idx;
> > 	struct srcu_data *sdp = raw_cpu_ptr(ssp->sda);
> >
> > 	idx = READ_ONCE(ssp->srcu_idx) & 0x1;
> > 	atomic_long_inc(&sdp->srcu_lock_count[idx]);
> > 	smp_mb__after_atomic(); /* B */ /* Avoid leaking the critical section. */
> >
> > 	srcu_check_nmi_safety(ssp, flags);
> >
> > 	return idx;
> > }
> > #else
> > static inline int __srcu_read_lock_nmisafe(struct srcu_struct *ssp, enum srcu_nmi_flags flags)
> > {
> > 	srcu_check_nmi_safety(ssp, flags);
> > 	return __srcu_read_lock(ssp);
> > }
> > #endif
> >
> > static inline int srcu_read_lock_nmisafe(struct srcu_struct *ssp)
> > {
> > 	return __srcu_read_lock_nmisafe(ssp, SRCU_NMI_SAFE);
> > }
> >
> > // An __srcu_read_lock() caller in kernel/rcu/tasks.h must be
> > // taken care of as well
> > static inline int srcu_read_lock(struct srcu_struct *ssp)
> > {
> > 	srcu_check_nmi_safety(ssp, SRCU_NMI_UNSAFE);
> > 	return __srcu_read_lock(ssp);
> > }
> >
> > And then you can call __srcu_read_lock_nmisafe(ssp, SRCU_NMI_UNKNOWN) from
> > the code that starts grace periods.
>
> Not bad at all!
>
> Would you like to send a patch?
>
> I do not consider this to be something for the current merge window even
> if the rest goes in because printk() is used heavily and because it is
> easy to get access to powerpc and presumably also riscv systems.
>
> But as you say, it would be very good to have longer term for the case
> where srcu_read_lock_nmisafe() is used for some more obscure purpose.

Sure thing!

Thanks.
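
For anyone following along, here is a minimal userspace sketch of the
checking logic that the enum-based proposal quoted above relies on. It is
only a model under stated assumptions, not the kernel implementation: the
struct model_srcu_data type and the model_*() functions are hypothetical
names made up for illustration, the per-CPU state is reduced to a single
plain struct, and a counter stands in for the warning the kernel would
emit. The point it illustrates is that the check depends only on the flag
passed by the caller, so mixed use is flagged regardless of whether the
architecture selects CONFIG_NEED_SRCU_NMI_SAFE.

```c
/* Flag values as proposed in the thread. */
enum srcu_nmi_flags {
	SRCU_NMI_UNKNOWN = 0x0,
	SRCU_NMI_UNSAFE  = 0x1,
	SRCU_NMI_SAFE    = 0x2
};

/* Hypothetical stand-in for the per-CPU, per-srcu_struct state. */
struct model_srcu_data {
	enum srcu_nmi_flags recorded;	/* First reader flavor seen, if any. */
	int mismatches;			/* Inconsistencies detected so far. */
};

/*
 * Hypothetical model of the safety check: remember the first reader
 * flavor and count a mismatch when a later reader uses the other one.
 * Callers passing SRCU_NMI_UNKNOWN are neither recorded nor checked.
 */
static void model_check_nmi_safety(struct model_srcu_data *sdp,
				   enum srcu_nmi_flags flags)
{
	if (flags == SRCU_NMI_UNKNOWN)
		return;
	if (sdp->recorded == SRCU_NMI_UNKNOWN) {
		sdp->recorded = flags;
		return;
	}
	if (sdp->recorded != flags)
		sdp->mismatches++;	/* The kernel would warn here. */
}

/* Both reader flavors on the same structure: one mismatch expected. */
static int model_mixed_use(void)
{
	struct model_srcu_data sdp = { SRCU_NMI_UNKNOWN, 0 };

	model_check_nmi_safety(&sdp, SRCU_NMI_UNKNOWN);	/* Unknown caller. */
	model_check_nmi_safety(&sdp, SRCU_NMI_SAFE);	/* nmisafe reader. */
	model_check_nmi_safety(&sdp, SRCU_NMI_UNSAFE);	/* Plain reader. */
	return sdp.mismatches;
}

/* Only one reader flavor is ever used: no mismatch expected. */
static int model_consistent_use(void)
{
	struct model_srcu_data sdp = { SRCU_NMI_UNKNOWN, 0 };

	model_check_nmi_safety(&sdp, SRCU_NMI_UNSAFE);
	model_check_nmi_safety(&sdp, SRCU_NMI_UNKNOWN);
	model_check_nmi_safety(&sdp, SRCU_NMI_UNSAFE);
	return sdp.mismatches;
}
```

In this model, model_mixed_use() counts one mismatch and
model_consistent_use() counts none. A real implementation would also have
to update the recorded flavor safely against concurrent readers and NMIs,
which this sketch deliberately leaves out.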