Re: [PATCH, RFC, tip/core/rcu] scalable classic RCU implementation

From: Manfred Spraul
Date: Sun Aug 24 2008 - 04:09:17 EST


Paul E. McKenney wrote:
+/*
+ * Definition for node within the RCU grace-period-detection hierarchy.
+ */
+struct rcu_node {
+ spinlock_t lock;
+ unsigned long qsmask; /* CPUs or groups that need to switch in */
+ /* order for current grace period to proceed.*/
+ unsigned long qsmaskinit;
+ /* Per-GP initialization for qsmask. */
I'm not sure if a bitmap is the right storage. If I understand the code correctly, it contains two information:
1) If the bitmap is clear, then all cpus have completed whatever they need to do.
A counter is more efficient than a bitmap. Especially: It would allow to choose the optimal fan-out, independent from 32/64 bits.
2) The information if the current cpu must do something to complete the current period.non
This is a local information, usually (always?) only the current cpu needs to know if it must do something.
But this doesn't need to be stored in a shared structure, the information could be stored in a per-cpu structure.

+ /*
+ * Extract the list of ready callbacks, disabling to prevent
+ * races with call_rcu() from interrupt handlers.
+ */
+ local_irq_save(flags);
+ list = rdp->nxtlist;
+ rdp->nxtlist = *rdp->nxttail[RCU_DONE_TAIL];
+ *rdp->nxttail[RCU_DONE_TAIL] = NULL;
+ tail = rdp->nxttail[RCU_DONE_TAIL];
+ for (count = RCU_NEXT_SIZE - 1; count >= 0; count--)
+ if (rdp->nxttail[count] == rdp->nxttail[RCU_DONE_TAIL])
+ rdp->nxttail[count] = &rdp->nxtlist;
+ local_irq_restore(flags);
Do you have a description of the events between call_rcu() and the rcu callback?
Is the following description correct?

__call_rcu() queues in RCU_NEXT_TAIL.
In the middle of the current grace period: rcu_check_quiescent_state() calls rcu_next_callbacks_are_ready().
Entry now in RCU_NEXT_READY_TAIL
** 0.5 cycles: wait until all cpus have completed the current cycle.
rcu_process_gp_end() moves from NEXT_READY_TAIL to WAIT_TAIL

** full grace period
rcu_process_gp_end() moves from WAIT_TAIL to DONE_TAIL
rcu_do_batch() finds the entries in DONE_TAIL and calls the callback.


+/*
+ * Do softirq processing for the current CPU.
+ */
static void rcu_process_callbacks(struct softirq_action *unused)
{
/*
* Memory references from any prior RCU read-side critical sections
- * executed by the interrupted code must be see before any RCU
+ * executed by the interrupted code must be seen before any RCU
* grace-period manupulations below.
*/
smp_mb(); /* See above block comment. */
Why this mb()? There was a grace period between the last read side critical section that might have accessed the pointer.
The rcu internal code already does spin_lock()+spin_unlock(). Isn't that sufficient?

- __rcu_process_callbacks(&rcu_ctrlblk, &__get_cpu_var(rcu_data));
- __rcu_process_callbacks(&rcu_bh_ctrlblk, &__get_cpu_var(rcu_bh_data));
+ __rcu_process_callbacks(&rcu_state, &__get_cpu_var(rcu_data));
+ __rcu_process_callbacks(&rcu_bh_state, &__get_cpu_var(rcu_bh_data));
Have you considered merging RCU_DONE_TAIL for rcu_data and rcu_bh_data?

+
+/**
+ * call_rcu - Queue an RCU callback for invocation after a grace period.
+ * @head: structure to be used for queueing the RCU updates.
+ * @func: actual update function to be invoked after the grace period
+ *
+ * The update function will be invoked some time after a full grace
+ * period elapses, in other words after all currently executing RCU
+ * read-side critical sections have completed. RCU read-side critical
+ * sections are delimited by rcu_read_lock() and rcu_read_unlock(),
+ * and may be nested.
+ */
The docbook entry is duplicated: They are in include/linux/rcupdate.h and here.
What about removing one of them?
I would go one step further:
Even add call_rcu_sched() into rcupdate.h. Add a Kconfig bool "RCU_NEEDS_SCHED" and automatically define either the extern or the #define.

--
Manfred
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/