Re: [PATCH, RFC, tip/core/rcu] v3 scalable classic RCU implementation

From: Manfred Spraul
Date: Sat Aug 30 2008 - 09:32:53 EST


Lai Jiangshan wrote:
> I just had a quick review, so my comments are nothing but cleanup.

Thanks, Lai.

> Paul E. McKenney wrote:
> > Hello!

> > +rcu_start_gp(struct rcu_state *rsp, unsigned long iflg)
> > +	__releases(rsp->rda[smp_processor_id()]->lock)
> > +{
> > +	unsigned long flags = iflg;
> > +	struct rcu_data *rdp = rsp->rda[smp_processor_id()];
> > +	struct rcu_node *rnp = rcu_get_root(rsp);
> > +	struct rcu_node *rnp_cur;
> > +	struct rcu_node *rnp_end;
> > +
> > +	if (!cpu_needs_another_gp(rsp, rdp)) {
> >  		/*
> > -		 * Accessing nohz_cpu_mask before incrementing rcp->cur needs a
> > -		 * Barrier  Otherwise it can cause tickless idle CPUs to be
> > -		 * included in rcp->cpumask, which will extend graceperiods
> > -		 * unnecessarily.
> > +		 * Either there is no need to detect any more grace periods
> > +		 * at the moment, or we are already in the process of
> > +		 * detecting one.  Either way, we should not start a new
> > +		 * RCU grace period, so drop the lock and return.
> >  		 */
> > -		smp_mb();
> > -		cpus_andnot(rcp->cpumask, cpu_online_map, nohz_cpu_mask);
> > +		spin_unlock_irqrestore(&rnp->lock, flags);
> > +		return;
> > +	}
> > +
> > +	/* Advance to a new grace period and initialize state. */
> > +
> > +	rsp->gpnum++;
> > +	rsp->signaled = RCU_SIGNAL_INIT;
> > +	rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
> > +	record_gp_stall_check_time();
> > +	dyntick_save_completed(rsp, rsp->completed - 1);
> > +	note_new_gpnum(rsp, rdp);
> > +
> > +	/*
> > +	 * Because we are first, we know that all our callbacks will
> > +	 * be covered by this upcoming grace period, even the ones
> > +	 * that were registered arbitrarily recently.
> > +	 */
> > +
> > +	rcu_next_callbacks_are_ready(rdp);
> > +	rdp->nxttail[RCU_WAIT_TAIL] = rdp->nxttail[RCU_NEXT_TAIL];
> > -		rcp->signaled = 0;
> > +	/* Special-case the common single-level case. */
> > +
> > +	if (NUM_RCU_NODES == 1) {
> > +		rnp->qsmask = rnp->qsmaskinit;

> I tried a mask like qsmaskinit before. The system deadlocked
> when I onlined/offlined cpus.
> I didn't find out why, but I thought of these two problems:
>
> problem 1:
> ---- race condition 1:
> <cpu_down>
> synchronize_rcu <called from an offline handler in another subsystem>
> rcu_offline_cpu
>
> ---- race condition 2:
> rcu_online_cpu
> synchronize_rcu <called from an online handler in another subsystem>
> <cpu_up>
>
> In both conditions, synchronize_rcu() blocks forever: synchronize_rcu()
> has to wait for a cpu that is set in rnp->qsmask, but that cpu is not
> running.
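To make the deadlock concrete, here is a minimal sketch of the pattern I believe you are describing. The notifier itself is hypothetical; only synchronize_rcu(), the notifier API, and the qsmask/qsmaskinit fields from the patch are real:

#include <linux/cpu.h>
#include <linux/notifier.h>
#include <linux/rcupdate.h>

/*
 * Hypothetical notifier in some other subsystem -- a sketch of the
 * deadlock, not code from the patch.
 */
static int my_subsys_cpu_callback(struct notifier_block *nb,
				  unsigned long action, void *hcpu)
{
	switch (action) {
	case CPU_DEAD:
		/*
		 * If this runs before RCU's own CPU_DEAD handler
		 * (rcu_offline_cpu()), the dead cpu's bit is still set in
		 * rnp->qsmaskinit.  A grace period started now copies that
		 * bit into rnp->qsmask and waits forever for a quiescent
		 * state from a cpu that no longer runs.
		 */
		synchronize_rcu();
		break;
	case CPU_UP_PREPARE:
		/*
		 * Mirror image: rcu_online_cpu() may already have set the
		 * incoming cpu's bit in rnp->qsmaskinit, but the cpu is
		 * not executing yet, so nobody ever reports its quiescent
		 * state.
		 */
		synchronize_rcu();
		break;
	}
	return NOTIFY_OK;
}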

Can we disallow synchronize_rcu() from the cpu notifiers? Are there any users that call synchronize_rcu() from within their notifiers?
I don't see any other solution.
Something like qsmaskinit is needed; always enumerating all cpus just doesn't scale.
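For reference, here is roughly what I mean. The field names follow the patch, but the struct and helpers are my own sketch, not Paul's code:

/*
 * Sketch only: the point of qsmaskinit is that starting a grace period
 * is a per-node mask copy instead of rcuclassic's cpus_andnot() scan
 * over every possible cpu.
 */
struct rcu_node_sketch {
	unsigned long qsmask;		/* cpus still owing a quiescent state */
	unsigned long qsmaskinit;	/* cpus to wait for at gp start */
};

static void sketch_start_gp(struct rcu_node_sketch *rnp)
{
	rnp->qsmask = rnp->qsmaskinit;	/* O(1) per node, no cpu scan */
}

/* Hotplug keeps qsmaskinit current, outside the grace-period fast path. */
static void sketch_cpu_online(struct rcu_node_sketch *rnp, unsigned long bit)
{
	rnp->qsmaskinit |= bit;
}

static void sketch_cpu_offline(struct rcu_node_sketch *rnp, unsigned long bit)
{
	rnp->qsmaskinit &= ~bit;
}

The hotplug paths are exactly where the two races above bite: the mask update and the notifier that calls synchronize_rcu() are not ordered with respect to each other.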

Perhaps it's possible to rely on CPU_DYING, but I haven't yet figured out how to handle read-side critical sections in CPU_DYING handlers.
Interrupts that arrive after CPU_DYING could be handled by rcu_irq_enter(), rcu_irq_exit() [yes, such interrupts happen on x86: the arch code enables the local interrupts in order to process the currently queued interrupts].
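A rough sketch of the shape I have in mind for that interrupt path; the entry function is made up, only rcu_irq_enter()/rcu_irq_exit() are real:

/* Hypothetical arch interrupt entry -- shape only, not real x86 code. */
void my_arch_do_IRQ(unsigned int irq)
{
	/*
	 * Tell RCU that a cpu it may already consider gone (or
	 * dyntick-idle) is about to run read-side code in irq context.
	 */
	rcu_irq_enter();

	/* ... ack the controller and run the queued handler ... */

	/* The cpu goes back to being invisible to RCU. */
	rcu_irq_exit();
}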

--
Manfred