Re: [PATCH] rcu: remove unnecessary check cpu_no_qs.norm on rcu_report_qs_rdp
From: Z qiang
Date: Mon Jul 24 2023 - 02:49:07 EST
>
> Hi Z qiang!
>
> Thanks for replying. But I'm pinned on something wrong..!
>
> > For built with CONFIG_RCU_STRICT_GRACE_PERIOD=y and CONFIG_PREEMPT=n kernels
> > Consider the following scenario:
> >
> > __rcu_read_unlock()
> > -> rcu_read_unlock_strict()
note that as soon as we exit the rcu read critical section
the rcu_read_unlock_strict() will be called, even if the RCU is idle
at this time
> > ->rdp = this_cpu_ptr(&rcu_data);
> > ->rdp->cpu_no_qs.b.norm = false;
> >
> > by interrupt and return invoke rcu_core():
> > ->rcu_check_quiescent_state()
> > ->rdp = raw_cpu_ptr(&rcu_data);
> > -> rcu_check_quiescent_state(rdp);
> > ->note_gp_changes(rdp);
> > -> __note_gp_changes(rnp, rdp)
> > start new gp
> > ->rdp->cpu_no_qs.b.norm = true;
> >
> > ->rcu_report_qs_rdp(rdp);
> > ->if (rdp->cpu_no_qs.b.norm || ...)
>
> I've already seen this scenario, But I think something is missing in my view.
> What I couldn't catch is
>
> ->rcu_check_quiescent_state()
> ->rdp = raw_cpu_ptr(&rcu_data);
> -> rcu_check_quiescent_state(rdp);
> ->note_gp_changes(rdp);
> -> __note_gp_changes(rnp, rdp)
> start new gp
>
> the new gp is started.
>
>
Maybe this "start new gp" note misunderstood you.
For built with CONFIG_RCU_STRICT_GRACE_PERIOD=y and CONFIG_PREEMPT=n kernels,
if the gp kthread start a new GP before we exit the RCU read critical section,
and just before we call rcu_report_qs_rdp() in
rcu_read_unlock_strict(), at this time if the clock irq
happens and find that the "rcu_seq_current(&rnp->gp_seq) !=
rdp->gp_seq" is true in rcu_pening(),
will trigger RCU softirq and find that the rcu_seq_new_gp(rdp->gp_seq,
rnp->gp_seq) is true,
will set rdp->cpu_no_qs.b.norm is true. when we return from the
softirq and call rcu_report_qs_rdp()
in rcu_read_unlock_strict(), find that the rdp->cpu_no_qs.b.norm is true.
so there is a situation where the rdp->cpu_no_qs.b.norm is true.
>
> to set cpu_no_qs.b.norm as true, below condition should be true
>
> 1201 >---if (rcu_seq_new_gp(rdp->gp_seq, rnp->gp_seq) ||
> 1202 >--- unlikely(READ_ONCE(rdp->gpwrap))) {
>
> Here,
> How rcu_seq_new_gp could return true and new gp already started via
> rcu_gp_kthread.
> IIUC, because rcu_gp_fqs_loop couldn't see the root rnp->qsmask is
> zero, it couldn't call rcu_gp_init.
>
>
> Sorry to make noise, but would you correct me what I'm thinking wrong?
>
> Many thanks..!
>
> -----------
> Sincerely,
> Levi.
>
> On Mon, Jul 24, 2023 at 4:21 AM Z qiang <qiang.zhang1211@xxxxxxxxx> wrote:
> >
> > >
> > > Thanks for replying to reply Paul :)
> > >
> > > > And try testing with CONFIG_RCU_STRICT_GRACE_PERIOD=y and CONFIG_PREEMPT=n.
> > > > Though there might be better Kconfig options to use. Those two come
> > > > immediately to mind.
> > >
> > > I've tested with this option via rcu torture.
> > > and it doesn't report any problems.
> > > and after commit 6d60ea03ac2d3 ("rcu: Report QS for outermost
> > > PREEMPT=n rcu_read_unlock() for strict GPs")
> > > it always makes cpu_no_qs.b.norm false whenever it calls
> > > rcu_report_qs_rdp in rcu_read_unlock.
> > >
> > > > But one critical piece is that softirq handlers, including the RCU_SOFTIRQ
> > > > handler rcu_core_si(), can be invoked upon return from interrupts.
> > >
> > > I think in that case, no problem because if it reports qs already,
> > > rcu_check_quiescent_state wouldn't call rcu_report_qs_rdp again.
> > > And if RCU_SOFTIRQ is called as soon as RCU interrupt is finished,
> > > it seems the fastest one to notify qs to related tree.
> > >
> > > > Another critical piece is that if a CPU is idle during any part of a
> > > > grace period, the grace-period kthread can report a quiescent state on
> > > > its behalf.
> > >
> > > I think
> > > 1) If timer interrupt is still programed,
> > > - when rcu_sched_clock_irq first reports qs, no problem
> > > - If qs is reported via grace period thread first,
> > > note_gp_chagned in rcu interrupt
> > > will recognize this situation by setting core_needs_qs as false,
> > > it doesn't call rcu_report_qs_rdp thou cpu_no_qs.b.norm is true.
> > >
> > > 2) If the timer interrupt isn't programmed,
> > > - rcu_gp_kthreaad reports its qs, no problem.
> > >
> > > So, I think there's no problem removing
> > > "rdp->cpu_no_qs.b.norm" check in rcu_report_qs_rdp.
> > > or wrap this condition check as WARN_ON_ONCE.
> > >
> > > > Does that help?
> > > Many thanks always :)
> > >
> >
> >
> > Hi Levi
> >
> > For built with CONFIG_RCU_STRICT_GRACE_PERIOD=y and CONFIG_PREEMPT=n kernels
> > Consider the following scenario:
> >
> > __rcu_read_unlock()
> > -> rcu_read_unlock_strict()
> > ->rdp = this_cpu_ptr(&rcu_data);
> > ->rdp->cpu_no_qs.b.norm = false;
> >
> > by interrupt and return invoke rcu_core():
> > ->rcu_check_quiescent_state()
> > ->rdp = raw_cpu_ptr(&rcu_data);
> > -> rcu_check_quiescent_state(rdp);
> > ->note_gp_changes(rdp);
> > -> __note_gp_changes(rnp, rdp)
> > start new gp
> > ->rdp->cpu_no_qs.b.norm = true;
> >
> > ->rcu_report_qs_rdp(rdp);
> > ->if (rdp->cpu_no_qs.b.norm || ...)
> >
> >
> > Thanks
> > Zqiang
> >
> >
> > >
> > > --------
> > > SIncerely,
> > > Levi.