Re: [PATCH v1.1 02/11] rcu: Kill rnp->ofl_seq and use only rcu_state.ofl_lock for exclusion

From: Neeraj Upadhyay
Date: Mon Dec 13 2021 - 04:11:26 EST



On 12/13/2021 2:27 PM, David Woodhouse wrote:
On Fri, 2021-12-10 at 09:56 +0530, Neeraj Upadhyay wrote:
- if (rdp->grpmask & rcu_rnp_online_cpus(rnp) || READ_ONCE(rnp->ofl_seq) & 0x1)
+ /*
+ * Strictly, we care here about the case where the current CPU is
+ * in rcu_cpu_starting() and thus has an excuse for rdp->grpmask
+ * not being up to date. So arch_spin_is_locked() might have a

Minor:

Is this comment right - "thus has an excuse for rdp->grpmask not being
up to date"; shouldn't it be "thus has an excuse for rnp->qsmaskinitnext
not being up to date"?

Also, arch_spin_is_locked() also handles the rcu_report_dead() case,
where raw_spin_unlock_irqrestore_rcu_node() can have a rcu_read_lock
from lockdep path with CPU bits already cleared from rnp->qsmaskinitnext?

Good point; thanks. How's this:

/*
* Strictly, we care here about the case where the current CPU is in
* rcu_cpu_starting() or rcu_report_dead() and thus has an excuse for
* rdp->qsmaskinitnext not being up to date. So arch_spin_is_locked()
* might have a false positive if it's held by some *other* CPU, but
* that's OK because that just means a false *negative* on the
* warning.
*/


Looks good to me!



if (WARN_ON_ONCE(rnp->qsmask & mask)) { /* RCU waiting on incoming CPU? */
+ /* rcu_report_qs_rnp() *really* wants some flags to restore */
+ unsigned long flags2;

Minor: checkpatch flags it "Missing a blank line after declarations"

Ack. Also fixed and pushed out to my parallel-5.16 branch at
https://git.infradead.org/users/dwmw2/linux.git/shortlog/refs/heads/parallel-5.16
> This commit is probably the only one that's strictly needed for that
parallel bringup, but for now I've kept my rcu boost thread mutex
patch, and added your two patches (with minor whitespace fixes). I
think the best option is to let Paul handle them all.


Thanks; the 4 RCU specific patches in that tree looks good to me.


Thanks
Neeraj

We'll do the final step of actually *enabling* the parallel bringup on
any given architecture only after the various fixes have made their way
in and we've done a proper review of the remaining code paths.