Re: [PATCH] rcutorture: Traverse possible cpu to set maxcpu in rcu_nocb_toggle()

From: Z qiang
Date: Sat Aug 26 2023 - 02:16:13 EST


>
> On Fri, Aug 25, 2023 at 10:28:37AM +0800, Z qiang wrote:
> > >
> > > On Thu, Aug 24, 2023 at 04:42:06PM +0800, Zqiang wrote:
> > > > Currently, the maxcpu is set by traversing online CPUs, however, if
> > > > the rcutorture.onoff_holdoff is set zero and onoff_interval is set
> > > > non-zero, and the some CPUs with larger cpuid has been offline before
> > > > setting maxcpu, for these CPUs, even if they are online again, also
> > > > cannot be offload or deoffload.
> > > >
> > > > This commit therefore use for_each_possible_cpu() instead of
> > > > for_each_online_cpu() in rcu_nocb_toggle().
> > > >
> > > > Signed-off-by: Zqiang <qiang.zhang1211@xxxxxxxxx>
> > > > ---
> > > > kernel/rcu/rcutorture.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > > > index a58372bdf0c1..b75d0fe558ce 100644
> > > > --- a/kernel/rcu/rcutorture.c
> > > > +++ b/kernel/rcu/rcutorture.c
> > > > @@ -2131,7 +2131,7 @@ static int rcu_nocb_toggle(void *arg)
> > > > VERBOSE_TOROUT_STRING("rcu_nocb_toggle task started");
> > > > while (!rcu_inkernel_boot_has_ended())
> > > > schedule_timeout_interruptible(HZ / 10);
> > > > - for_each_online_cpu(cpu)
> > > > + for_each_possible_cpu(cpu)
> > >
> > > Last I checked, bad things could happen if the code attempted to
> > > nocb_toggle a CPU that had not yet come online. Has that changed?
> >
> > For example, there are 8 online CPUs in the system, before we traversing online
> > CPUs and set maxcpu, CPU7 has been offline, this causes us to miss nocb_toggle
> > for CPU7(maxcpu=6)
> >
> > Even though we still use for_each_online_cpu(), the things described
> > above also happen. before we toggle the CPU, this CPU has been offline.
>
> Suppose we have a system whose possible CPUs are 0, 1, 2, and 3. However,
> only 0 and 1 are present in this system, and until some manual action is
> taken, only 0 and 1 will ever be online. (Yes, this really can happen!)
> In that state, won't toggling CPU 2 and 3 result in failures?
>

Agreed.
As long as rcutorture.onoff_interval is enabled, toggling can fail and
print the "NOCB: Cannot CB-offload offline CPU" message regardless of
whether maxcpu is set from the online CPUs or from the possible CPUs.
However, failures caused by a CPU being offline are acceptable.

At least this way, the toggle operation on CPU7 will not be missed once
CPU7 comes online again.

Would it be better to use for_each_present_cpu()?
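
To make the difference between the three masks concrete, here is a small
userspace model of the two scenarios discussed above (the 8-CPU system
with CPU7 offline, and Paul's system with possible CPUs 0-3 but only 0-1
present). This is only a sketch: max_cpu_in_mask(), toggle_would_fail()
and check_examples() are hypothetical helpers, and the plain bitmasks
stand in for the kernel cpumasks; none of this is kernel code.

```c
#include <assert.h>

/*
 * Userspace model only: bit N of a mask stands for CPU N.
 * These helpers are hypothetical and are not the kernel cpumask API.
 */
int max_cpu_in_mask(unsigned long mask)
{
	int cpu, maxcpu = -1;

	/* Same shape as "for_each_*_cpu(cpu) maxcpu = cpu;" */
	for (cpu = 0; cpu < (int)(8 * sizeof(mask)); cpu++)
		if (mask & (1UL << cpu))
			maxcpu = cpu;
	return maxcpu;
}

/* Assumption for this model: toggling an offline CPU fails. */
int toggle_would_fail(unsigned long online_mask, int cpu)
{
	return !(online_mask & (1UL << cpu));
}

void check_examples(void)
{
	/* 8 CPUs possible and present, CPU7 offline when maxcpu is set. */
	assert(max_cpu_in_mask(0x7fUL) == 6);	/* online: CPU7 is missed */
	assert(max_cpu_in_mask(0xffUL) == 7);	/* present or possible */

	/* Possible CPUs 0-3, but only CPUs 0-1 present and online. */
	assert(max_cpu_in_mask(0x0fUL) == 3);	/* possible */
	assert(max_cpu_in_mask(0x03UL) == 1);	/* present */
	assert(toggle_would_fail(0x03UL, 2));	/* CPU2 can never succeed */
	assert(toggle_would_fail(0x03UL, 3));	/* CPU3 can never succeed */
}
```

In this model, the present mask covers CPU7 in the first case without
including the never-present CPUs 2 and 3 in the second case.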

Thanks
Zqiang

>
> Thanx, Paul
>
> > Thanks
> > Zqiang
> >
> >
> > >
> > > Thanx, Paul
> > >
> > > > maxcpu = cpu;
> > > > WARN_ON(maxcpu < 0);
> > > > if (toggle_interval > ULONG_MAX)
> > > > --
> > > > 2.17.1
> > > >