Re: [RFC v2 4/5] rcu: Use for_each_leaf_node_cpu() in force_qs_rnp()

From: Paul E. McKenney
Date: Tue Dec 20 2016 - 00:10:11 EST


On Mon, Dec 19, 2016 at 11:15:15PM +0800, Boqun Feng wrote:
> On Thu, Dec 15, 2016 at 02:51:36PM +0000, Colin Ian King wrote:
> > On 15/12/16 14:42, Boqun Feng wrote:
> > > On Thu, Dec 15, 2016 at 12:04:59PM +0000, Mark Rutland wrote:
> > >> On Thu, Dec 15, 2016 at 10:42:03AM +0800, Boqun Feng wrote:
> > >>> ->qsmask of an RCU leaf node is usually more sparse than the
> > >>> corresponding cpu_possible_mask. So replace the
> > >>> for_each_leaf_node_possible_cpu() in force_qs_rnp() with
> > >>> for_each_leaf_node_cpu() to save several checks.
> > >>>
> > >>> [Note we need to use "1UL << bit" instead of "1 << bit" to generate the
> > >>> corresponding mask for a bit because @mask is unsigned long, this was
> > >>> spotted by Colin Ian King <colin.king@xxxxxxxxxxxxx> and CoverityScan in
> > >>> a previous version of this patch.]
> > >>
> > >> Nit: This note can go now that we use leaf_node_cpu_bit(). ;)
> > >>
> > >
> > > I kinda keep this here for honoring the effort of finding out this bug
> > > from Colin, but yes, it's no longer needed here for the current code.
> >
> > Yep, remove it.
> >
>
> Paul, here is a modified version of this patch; the only change is
> that I removed this note.
>
> Besides I rebased the whole series on the current rcu/dev branch of -rcu
> tree, on this very commit:
>
> 8e9b2521b18a ("doc: Quick-Quiz answers are now inline")
>
> And I put the latest version at
>
> git://git.kernel.org/pub/scm/linux/kernel/git/boqun/linux.git leaf-node
>
> If you think it's better, I can send a v3 ;-)

I would feel better about this patchset if it reduced the number of lines
of code rather than increasing it. That said, part of the increase
is a comment. Still, I am not convinced that the extra level of macro
is carrying its weight.

dbf18a2422e2 ("rcu: Introduce for_each_leaf_node_cpu()")

The commit log needs a bit of wordsmithing.

The added WARN_ON_ONCE(!cpu_possible(cpu)) still seems strange.
What is its purpose, really? What does its triggering tell you?
What other checks did you consider as an alternative?

And if you are going to add checks of this type, should you
also check for this being a leaf rcu_node structure?

3f0b4ba1fe94 ("rcu: Use for_each_leaf_node_cpu() in RCU stall checking")

This does look a bit nicer, but why the added blank lines?
Are they really helping?

The commit log seems a bit misplaced. This code is almost never
executed (once per 21 seconds at the most), so performance really
isn't a consideration. The simpler-looking code might be.

fd799f1ac7b7 ("rcu: Use for_each_leaf_node_cpu() in ->expmask iteration")

Ditto on blank lines.

Again, this code is executed once per expedited grace period, so
performance really isn't a big deal. More of a big deal than in
the stall-warning code, but we are still way off of any fastpath.

69a1baedbf42 ("rcu: Use for_each_leaf_node_cpu() in force_qs_rnp()")

Ditto again on blank lines.

And on the commit log. This code is executed about once
every several jiffies, and on larger machines, once every 20 jiffies
or so. Performance really isn't a consideration.
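
For concreteness, here is a rough userspace sketch of the two iteration
styles that this commit trades between: visiting every candidate bit and
testing it against the mask, versus visiting only the bits that are set.
This is not the kernel macros. The function names, the nbits parameter,
and the step counters are all made up for illustration.

```c
#include <limits.h>

/* Hypothetical helper macro: number of bits in an unsigned long. */
#define BITS_PER_ULONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Old style (illustrative): walk all nbits candidates and test each one
 * against the mask. *tests counts how many mask checks were made.
 */
static unsigned int visit_all_and_test(unsigned long mask, unsigned int nbits,
				       unsigned int *tests)
{
	unsigned int bit, hits = 0;

	for (bit = 0; bit < nbits && bit < BITS_PER_ULONG; bit++) {
		(*tests)++;
		if (mask & (1UL << bit))
			hits++;
	}
	return hits;
}

/*
 * New style (illustrative): visit only the set bits by clearing the
 * lowest set bit on each pass. *steps counts loop iterations.
 */
static unsigned int visit_set_bits(unsigned long mask, unsigned int *steps)
{
	unsigned int hits = 0;

	while (mask) {
		(*steps)++;
		mask &= mask - 1;	/* clear lowest set bit */
		hits++;
	}
	return hits;
}
```

With a sparse mask the second form does fewer iterations, which is
presumably the "several checks" the commit log has in mind. Whether that
matters at this call frequency is a separate question.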

7b00e50e3efb ("rcu: Use for_each_leaf_node_cpu() in online CPU iteration")

And another ditto on blank lines.

This code executes once per CPU-hotplug operation, so again isn't
at all performance critical.

In short, if you are trying to sell this to me as a significant performance
boost, I am not buying. The added WARN_ON_ONCE() looks quite dubious,
though perhaps I am misunderstanding its purpose. My assumption is
that you want to detect missing UL suffixes on bitmask constants, in
which case I bet there is a better way.
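
For reference, a minimal userspace sketch of the missing-UL hazard,
not kernel code. The helper name bit_mask_ul() and the BITS_PER_ULONG
macro are hypothetical. The point is only that "1 << bit" is computed
as an int, so it is undefined once bit reaches the width of int, while
"1UL << bit" covers every bit of an unsigned long mask.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper macro: number of bits in an unsigned long. */
#define BITS_PER_ULONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Build the mask for one bit of an unsigned long.
 *
 * Writing "1 << bit" here would shift an int, which is undefined
 * behavior for bit >= the width of int (typically 32), even though
 * the result is assigned to an unsigned long. The UL suffix makes
 * the shift happen at full width.
 */
static unsigned long bit_mask_ul(unsigned int bit)
{
	assert(bit < BITS_PER_ULONG);	/* shift count must be in range */
	return 1UL << bit;
}
```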

Speaking of which, how do we know that this is free of bugs?

Thanx, Paul

> Regards,
> Boqun
>
> ------------------------>8
> From: Boqun Feng <boqun.feng@xxxxxxxxx>
> Date: Thu, 8 Dec 2016 23:21:11 +0800
> Subject: [PATCH v2.1 4/5] rcu: Use for_each_leaf_node_cpu() in force_qs_rnp()
>
> ->qsmask of an RCU leaf node is usually more sparse than the
> corresponding cpu_possible_mask. So replace the
> for_each_leaf_node_possible_cpu() in force_qs_rnp() with
> for_each_leaf_node_cpu() to save several checks.
>
> Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
> ---
> kernel/rcu/tree.c | 12 +++++-------
> 1 file changed, 5 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 4ea4496f4ecc..c2b753fb7f09 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3046,13 +3046,11 @@ static void force_qs_rnp(struct rcu_state *rsp,
>  				continue;
>  			}
>  		}
> -		for_each_leaf_node_possible_cpu(rnp, cpu) {
> -			unsigned long bit = leaf_node_cpu_bit(rnp, cpu);
> -			if ((rnp->qsmask & bit) != 0) {
> -				if (f(per_cpu_ptr(rsp->rda, cpu), isidle, maxj))
> -					mask |= bit;
> -			}
> -		}
> +
> +		for_each_leaf_node_cpu(rnp, rnp->qsmask, cpu)
> +			if (f(per_cpu_ptr(rsp->rda, cpu), isidle, maxj))
> +				mask |= leaf_node_cpu_bit(rnp, cpu);
> +
>  		if (mask != 0) {
>  			/* Idle/offline CPUs, report (releases rnp->lock. */
>  			rcu_report_qs_rnp(mask, rsp, rnp, rnp->gpnum, flags);
> --
> 2.10.2
>