Re: [PATCH, RFC, tip/core/rcu] v3 scalable classic RCU implementation

From: Manfred Spraul
Date: Sat Aug 30 2008 - 09:32:53 EST


Lai Jiangshan wrote:
> I just had a quick review, so my comments are nothing but cleanup.

Thanks, Lai.

> Paul E. McKenney wrote:
> > Hello!

> > +rcu_start_gp(struct rcu_state *rsp, unsigned long iflg)
> > +	__releases(rsp->rda[smp_processor_id()]->lock)
> > +{
> > +	unsigned long flags = iflg;
> > +	struct rcu_data *rdp = rsp->rda[smp_processor_id()];
> > +	struct rcu_node *rnp = rcu_get_root(rsp);
> > +	struct rcu_node *rnp_cur;
> > +	struct rcu_node *rnp_end;
> > +
> > +	if (!cpu_needs_another_gp(rsp, rdp)) {
> >  		/*
> > -		 * Accessing nohz_cpu_mask before incrementing rcp->cur needs a
> > -		 * Barrier  Otherwise it can cause tickless idle CPUs to be
> > -		 * included in rcp->cpumask, which will extend graceperiods
> > -		 * unnecessarily.
> > +		 * Either there is no need to detect any more grace periods
> > +		 * at the moment, or we are already in the process of
> > +		 * detecting one.  Either way, we should not start a new
> > +		 * RCU grace period, so drop the lock and return.
> >  		 */
> > -		smp_mb();
> > -		cpus_andnot(rcp->cpumask, cpu_online_map, nohz_cpu_mask);
> > +		spin_unlock_irqrestore(&rnp->lock, flags);
> > +		return;
> > +	}
> > +
> > +	/* Advance to a new grace period and initialize state. */
> > +
> > +	rsp->gpnum++;
> > +	rsp->signaled = RCU_SIGNAL_INIT;
> > +	rsp->jiffies_force_qs = jiffies + RCU_JIFFIES_TILL_FORCE_QS;
> > +	record_gp_stall_check_time();
> > +	dyntick_save_completed(rsp, rsp->completed - 1);
> > +	note_new_gpnum(rsp, rdp);
> > +
> > +	/*
> > +	 * Because we are first, we know that all our callbacks will
> > +	 * be covered by this upcoming grace period, even the ones
> > +	 * that were registered arbitrarily recently.
> > +	 */
> > +
> > +	rcu_next_callbacks_are_ready(rdp);
> > +	rdp->nxttail[RCU_WAIT_TAIL] = rdp->nxttail[RCU_NEXT_TAIL];
> > -		rcp->signaled = 0;
> > +	/* Special-case the common single-level case. */
> > +
> > +	if (NUM_RCU_NODES == 1) {
> > +		rnp->qsmask = rnp->qsmaskinit;

> I tried a mask like qsmaskinit before. The system deadlocked
> when I onlined/offlined cpus.
> I didn't find out why, but I thought of these two problems:
>
> problem 1:
> ---- race condition 1:
> <cpu_down>
> synchronize_rcu <called from an offline handler in another subsystem>
> rcu_offline_cpu
>
> ---- race condition 2:
> rcu_online_cpu
> synchronize_rcu <called from an online handler in another subsystem>
> <cpu_up>
>
> In both conditions, synchronize_rcu() blocks forever: synchronize_rcu()
> has to wait for a cpu that is set in rnp->qsmask, but that cpu is not
> running.
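To make the deadlock concrete, here is a minimal sketch of the pattern I believe you are describing. The notifier itself is hypothetical; only synchronize_rcu(), the notifier API, and the qsmask/qsmaskinit fields from the patch are real:

#include <linux/cpu.h>
#include <linux/notifier.h>
#include <linux/rcupdate.h>

/*
 * Hypothetical notifier in some other subsystem -- a sketch of the
 * deadlock, not code from the patch.
 */
static int my_subsys_cpu_callback(struct notifier_block *nb,
				  unsigned long action, void *hcpu)
{
	switch (action) {
	case CPU_DEAD:
		/*
		 * If this runs before RCU's own CPU_DEAD handler
		 * (rcu_offline_cpu()), the dead cpu's bit is still set in
		 * rnp->qsmaskinit.  A grace period started now copies that
		 * bit into rnp->qsmask and waits forever for a quiescent
		 * state from a cpu that no longer runs.
		 */
		synchronize_rcu();
		break;
	case CPU_UP_PREPARE:
		/*
		 * Mirror image: rcu_online_cpu() may already have set the
		 * incoming cpu's bit in rnp->qsmaskinit, but the cpu is
		 * not executing yet, so nobody ever reports its quiescent
		 * state.
		 */
		synchronize_rcu();
		break;
	}
	return NOTIFY_OK;
}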

Can we disallow synchronize_rcu() from the cpu notifiers? Are there any users that call synchronize_rcu() from within their notifiers?
I don't see any other solution.
Something like qsmaskinit is needed; always enumerating all cpus just doesn't scale.
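For reference, here is roughly what I mean. The field names follow the patch, but the struct and helpers are my own sketch, not Paul's code:

/*
 * Sketch only: the point of qsmaskinit is that starting a grace period
 * is a per-node mask copy instead of rcuclassic's cpus_andnot() scan
 * over every possible cpu.
 */
struct rcu_node_sketch {
	unsigned long qsmask;		/* cpus still owing a quiescent state */
	unsigned long qsmaskinit;	/* cpus to wait for at gp start */
};

static void sketch_start_gp(struct rcu_node_sketch *rnp)
{
	rnp->qsmask = rnp->qsmaskinit;	/* O(1) per node, no cpu scan */
}

/* Hotplug keeps qsmaskinit current, outside the grace-period fast path. */
static void sketch_cpu_online(struct rcu_node_sketch *rnp, unsigned long bit)
{
	rnp->qsmaskinit |= bit;
}

static void sketch_cpu_offline(struct rcu_node_sketch *rnp, unsigned long bit)
{
	rnp->qsmaskinit &= ~bit;
}

The hotplug paths are exactly where the two races above bite: the mask update and the notifier that calls synchronize_rcu() are not ordered with respect to each other.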

Perhaps it's possible to rely on CPU_DYING, but I haven't yet figured out how to handle read-side critical sections in CPU_DYING handlers.
Interrupts that arrive after CPU_DYING could be handled by rcu_irq_enter(), rcu_irq_exit() [yes, such interrupts happen on x86: the arch code enables the local interrupts in order to process the currently queued interrupts].
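A rough sketch of the shape I have in mind for that interrupt path; the entry function is made up, only rcu_irq_enter()/rcu_irq_exit() are real:

/* Hypothetical arch interrupt entry -- shape only, not real x86 code. */
void my_arch_do_IRQ(unsigned int irq)
{
	/*
	 * Tell RCU that a cpu it may already consider gone (or
	 * dyntick-idle) is about to run read-side code in irq context.
	 */
	rcu_irq_enter();

	/* ... ack the controller and run the queued handler ... */

	/* The cpu goes back to being invisible to RCU. */
	rcu_irq_exit();
}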

--
Manfred