[RFC 1/5] rcu: Introduce primitives to iterate mask bits in an RCU leaf node

From: Boqun Feng
Date: Fri Dec 09 2016 - 03:49:30 EST


There are several places in the RCU core where we need to iterate over
the mask bits (->qsmask, ->expmask, etc.) of a leaf node in order to
visit the corresponding CPUs. The current code iterates over all possible
CPUs in the leaf node and then checks the mask to see whether each CPU's
bit is set.

However, most bits in cpu_possible_mask are set, whereas few bits in an
RCU leaf node mask are set (in other words, ->qsmask and its friends are
usually sparser than cpu_possible_mask). It is therefore better to
iterate the other way around: walk the set bits of the leaf node mask and
then check each corresponding CPU with cpu_possible(). Doing so saves
several checks in the loop. Moreover, fast-path checks (e.g. ->qsmask ==
0) can then be folded into the loop logic.

This patch introduces leaf_node_for_each_mask_bit() and
leaf_node_for_each_mask_possible_cpu() to iterate mask bits in a more
efficient way.

Signed-off-by: Boqun Feng <boqun.feng@xxxxxxxxx>
---
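
For readers who want to try the iteration order outside the kernel, here
is a minimal userspace sketch of the idea, not the patch itself. Every
name in it (struct leaf, next_bit(), demo_cpu_possible(),
for_each_mask_bit(), for_each_mask_possible_cpu(), collect_cpus()) is a
hypothetical stand-in; the kernel version uses find_first_bit(),
find_next_bit() and cpu_possible() on a real rcu_node. The sketch also
passes the mask by value, whereas the patch passes its address.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the relevant fields of struct rcu_node. */
struct leaf {
	int grplo;		/* lowest CPU covered by this leaf */
	int grphi;		/* highest CPU covered by this leaf */
	unsigned long qsmask;	/* one bit per CPU, bit 0 maps to grplo */
};

#define MASK_BITS (CHAR_BIT * sizeof(unsigned long))

/* Userspace stand-in for find_next_bit() on a single word. */
static unsigned int next_bit(unsigned long mask, unsigned int start)
{
	unsigned int bit;

	for (bit = start; bit < MASK_BITS; bit++)
		if (mask & (1UL << bit))
			return bit;
	return MASK_BITS;	/* no further bit set */
}

/* Demo assumption: CPUs 0-7 exist, but CPU 5 is not possible. */
static int demo_cpu_possible(int cpu)
{
	return cpu >= 0 && cpu < 8 && cpu != 5;
}

/* Visit only the set bits; an empty mask ends the loop immediately. */
#define for_each_mask_bit(mask, bit) \
	for ((bit) = next_bit((mask), 0); \
	     (bit) < MASK_BITS; \
	     (bit) = next_bit((mask), (bit) + 1))

/* Skip set bits whose CPU is not possible; the trailing else takes the body. */
#define for_each_mask_possible_cpu(lp, mask, bit, cpu) \
	for_each_mask_bit(mask, bit) \
		if (!demo_cpu_possible((cpu) = (int)(bit) + (lp)->grplo)) \
			continue; \
		else

/* Store up to max CPU ids that are set in ->qsmask and possible. */
static int collect_cpus(const struct leaf *lp, int *out, int max)
{
	unsigned int bit;
	int cpu, n = 0;

	assert(out != NULL);
	for_each_mask_possible_cpu(lp, lp->qsmask, bit, cpu) {
		if (n < max)
			out[n++] = cpu;
	}
	return n;
}
```

With grplo = 4 and qsmask = 0xb (bits 0, 1 and 3), the loop body runs for
CPUs 4 and 7 only: CPU 5 is filtered out by the possible check, and
nothing is visited for the clear bit 2. An all-zero mask falls straight
through, which is the fast-path consolidation mentioned above.
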
kernel/rcu/tree.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index c0a4bf8f1ed0..4078a8ec2bd1 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -260,6 +260,9 @@ struct rcu_node {
*/
#define leaf_node_cpu_bit(rnp, cpu) (1UL << ((cpu) - (rnp)->grplo))

+/* This returns the corresponding cpu_id for a bit in an RCU leaf node */
+#define leaf_node_cpu_id(rnp, bit) ((bit) + (rnp)->grplo)
+
/*
* Do a full breadth-first scan of the rcu_node structures for the
* specified rcu_state structure.
@@ -295,6 +298,33 @@ struct rcu_node {
cpu <= rnp->grphi; \
cpu = cpumask_next((cpu), cpu_possible_mask))

+
+#define QSMASK_BITS(mask) (BITS_PER_BYTE * sizeof(mask))
+/*
+ * Iterate over all set bits in @mask of a leaf RCU node.
+ *
+ * The iterator is the bit offset in @mask of a leaf node, to get the cpu
+ * id, use leaf_node_cpu_id()
+ *
+ * Note @rnp has to be a leaf node and @mask has to belong to @rnp.
+ */
+#define leaf_node_for_each_mask_bit(rnp, mask, bit) \
+ for ((bit) = find_first_bit(&(mask), QSMASK_BITS(mask)); \
+ (bit) < QSMASK_BITS(mask); \
+ (bit) = find_next_bit(&(mask), QSMASK_BITS(mask), (bit) + 1))
+
+/*
+ * Iterate over all possible CPUs of a leaf RCU node which are still masked
+ * in @mask.
+ *
+ * Note @rnp has to be a leaf node and @mask has to belong to @rnp.
+ */
+#define leaf_node_for_each_mask_possible_cpu(rnp, mask, bit, cpu) \
+ leaf_node_for_each_mask_bit(rnp, mask, bit) \
+ if (!cpu_possible((cpu) = leaf_node_cpu_id(rnp, bit))) \
+ continue; \
+ else
+
/*
* Union to allow "aggregate OR" operation on the need for a quiescent
* state by the normal and expedited grace periods.
--
2.10.2