Re: [PATCH, RFC] v4 scalable classic RCU implementation

From: Manfred Spraul
Date: Tue Sep 16 2008 - 12:52:46 EST


Hi Paul,

Paul E. McKenney wrote:
+/*
+ * Scan the leaf rcu_node structures, processing dyntick state for any that
+ * have not yet encountered a quiescent state, using the function specified.
+ * Returns 1 if the current grace period ends while scanning (possibly
+ * because we made it end).
+ */
+static int rcu_process_dyntick(struct rcu_state *rsp, long lastcomp,
+			       int (*f)(struct rcu_data *))
+{
+	unsigned long bit;
+	int cpu;
+	unsigned long flags;
+	unsigned long mask;
+	struct rcu_node *rnp_cur = rsp->level[NUM_RCU_LVLS - 1];
+	struct rcu_node *rnp_end = &rsp->node[NUM_RCU_NODES];
+
+	for (; rnp_cur < rnp_end; rnp_cur++) {
+		mask = 0;
+		spin_lock_irqsave(&rnp_cur->lock, flags);
+		if (rsp->completed != lastcomp) {
+			spin_unlock_irqrestore(&rnp_cur->lock, flags);
+			return 1;
+		}
+		if (rnp_cur->qsmask == 0) {
+			spin_unlock_irqrestore(&rnp_cur->lock, flags);
+			continue;
+		}
+		cpu = rnp_cur->grplo;
+		bit = 1;
+		mask = 0;
+		for (; cpu <= rnp_cur->grphi; cpu++, bit <<= 1) {
+			if ((rnp_cur->qsmask & bit) != 0 && f(rsp->rda[cpu]))
+				mask |= bit;
+		}
I'm still comparing my implementation with your code:
- f is called once for each cpu in the system, correct?
- if at least one cpu is in nohz mode, this loop will be needed for every grace period.

That means an O(NR_CPUS) loop with disabled local interrupts :-(
Is that correct?
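
To make the question concrete, here is a small user-space model of the quoted loop that counts how often f would be invoked. It is only a sketch: struct leaf_model, dyntick_fn, always_quiescent() and count_dyntick_calls() are hypothetical names, not kernel code, and locking is left out entirely. Going only by the quoted fragment, f runs for CPUs whose qsmask bit is still set, while the outer walk still visits every leaf.

```c
#include <stddef.h>

/* Hypothetical stand-in for a leaf rcu_node; not the kernel structure. */
struct leaf_model {
	unsigned long qsmask;	/* bit i set: CPU grplo + i still owes a quiescent state */
	int grplo;		/* first CPU covered by this leaf */
	int grphi;		/* last CPU covered by this leaf */
};

/* Hypothetical callback type: returns nonzero if the CPU counts as quiescent. */
typedef int (*dyntick_fn)(int cpu);

/* Trivial callback for experiments: every CPU is reported as quiescent. */
static int always_quiescent(int cpu)
{
	(void)cpu;
	return 1;
}

/*
 * Walk the leaves in the same order as the quoted loop and return the
 * number of times f was called.  masks_out[i] receives the mask that the
 * inner loop would have accumulated for leaf i.
 */
static unsigned long count_dyntick_calls(const struct leaf_model *leaves,
					 size_t nleaves, dyntick_fn f,
					 unsigned long *masks_out)
{
	unsigned long calls = 0;
	size_t i;

	for (i = 0; i < nleaves; i++) {
		unsigned long bit = 1;
		unsigned long mask = 0;
		int cpu;

		masks_out[i] = 0;
		if (leaves[i].qsmask == 0)
			continue;	/* nothing pending on this leaf */
		for (cpu = leaves[i].grplo; cpu <= leaves[i].grphi;
		     cpu++, bit <<= 1) {
			if ((leaves[i].qsmask & bit) != 0) {
				calls++;
				if (f(cpu))
					mask |= bit;
			}
		}
		masks_out[i] = mask;
	}
	return calls;
}
```

In this model the number of f calls is bounded by the number of CPUs that have not yet reported a quiescent state, but in the worst case that is still every CPU.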

Unfortunately, my solution is even worse:
My rcu_irq_exit() acquires a global spinlock when called on a nohz cpu.
A few cpus in cpu_idle, nohz, executing 50k network interrupts/sec would cacheline-thrash that spinlock.
I'm considering counting interrupts: if a nohz cpu executes more than a few interrupts/tick, then add a timer that checks rcu_pending().
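
A rough user-space sketch of that counting idea follows. Everything in it is an assumption made for illustration: struct nohz_cpu_model, IRQS_PER_TICK_LIMIT, model_irq_exit() and model_period_end() are hypothetical names, the threshold of 4 is arbitrary, and "arming the timer" is reduced to setting a flag. The point is only that the hot path touches per-cpu state and never the global lock.

```c
#include <stdbool.h>

/* Hypothetical threshold standing in for "more than a few interrupts/tick". */
#define IRQS_PER_TICK_LIMIT 4

/* Hypothetical per-cpu bookkeeping; not a kernel structure. */
struct nohz_cpu_model {
	unsigned int irqs_this_tick;	/* interrupts seen in the current tick period */
	bool timer_armed;		/* true once the rcu_pending() timer was requested */
};

/*
 * Model of the irq-exit path on a nohz cpu.  Only local state is touched.
 * Returns true if this call is the one that requests the timer.
 */
static bool model_irq_exit(struct nohz_cpu_model *c)
{
	c->irqs_this_tick++;
	if (!c->timer_armed && c->irqs_this_tick > IRQS_PER_TICK_LIMIT) {
		c->timer_armed = true;
		return true;
	}
	return false;
}

/* Model of the end of a tick period: start counting from zero again. */
static void model_period_end(struct nohz_cpu_model *c)
{
	c->irqs_this_tick = 0;
	c->timer_armed = false;
}
```

With the limit set to 4, the fifth interrupt within one period is the first to request the timer, and later interrupts in the same period do nothing extra.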

Perhaps even that wouldn't be enough: I remember that the initial unhandled irq detection code broke miserably on large SGI systems:
An atomic_inc(&global_var) in the local timer interrupt (i.e.: NR_CPUS*HZ calls/sec) caused such severe thrashing that the system wouldn't boot. IIRC that was with 512 cpus.
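
For contrast with a single shared counter, here is a minimal user-space sketch of per-cpu counters padded to separate cachelines, so that the frequent increment stays local and only the rare reader walks all slots. The names and sizes (MODEL_NR_CPUS, MODEL_CACHELINE, struct percpu_count, count_local(), count_total()) are hypothetical, the 64-byte line size is an assumption, and this is not what the unhandled irq detection code ended up doing.

```c
#include <stddef.h>

#define MODEL_NR_CPUS	8	/* hypothetical cpu count for the model */
#define MODEL_CACHELINE	64	/* assumed cacheline size in bytes */

/* One counter per cpu, aligned so that no two counters share a cacheline. */
struct percpu_count {
	_Alignas(MODEL_CACHELINE) unsigned long count;
};

static struct percpu_count irq_counts[MODEL_NR_CPUS];

/* Hot path: each cpu writes only to its own slot. */
static void count_local(int cpu)
{
	irq_counts[cpu].count++;
}

/* Slow path: sum all slots; the result is approximate if cpus are still counting. */
static unsigned long count_total(void)
{
	unsigned long sum = 0;
	size_t i;

	for (i = 0; i < MODEL_NR_CPUS; i++)
		sum += irq_counts[i].count;
	return sum;
}
```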


--
Manfred