[PATCH tip/core/rcu 3/4] rcu/tree: Count number of batched kfree_rcu() locklessly

From: paulmck
Date: Wed Apr 15 2020 - 13:20:05 EST


From: "Joel Fernandes (Google)" <joel@xxxxxxxxxxxxxxxxx>

Trade exactness in counting the queued objects for performance by
sampling the per-CPU counters locklessly. This should be OK: under high
memory pressure it should not matter if the count is off by a few
objects, because the shrinker will still do the reclaim.

Signed-off-by: Joel Fernandes (Google) <joel@xxxxxxxxxxxxxxxxx>
[ paulmck: Remove unused "flags" variable. ]
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
---
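Illustration only, not part of the change: a minimal userspace sketch of
the lockless counting pattern this patch adopts. It assumes C11 relaxed
atomics as a stand-in for the kernel's READ_ONCE()/WRITE_ONCE(), and every
name in it (krc_sketch, sketch_enqueue, sketch_drain, sketch_shrink_count,
NR_CPUS_SKETCH) is hypothetical. Only the counter is modeled; the lock that
serializes writers in the kernel is assumed and left out.

```c
#include <stdatomic.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for struct kfree_rcu_cpu; only ->count is modeled. */
struct krc_sketch {
	atomic_ulong count;
};

static struct krc_sketch krc_sketch[NR_CPUS_SKETCH];

/*
 * Writer side, analogous to WRITE_ONCE(krcp->count, krcp->count + 1).
 * Assumes writers to a given slot are serialized elsewhere (krcp->lock in
 * the kernel), so a plain load followed by a store is sufficient.
 */
static void sketch_enqueue(int cpu)
{
	unsigned long old;

	old = atomic_load_explicit(&krc_sketch[cpu].count, memory_order_relaxed);
	atomic_store_explicit(&krc_sketch[cpu].count, old + 1, memory_order_relaxed);
}

/* Writer side, analogous to WRITE_ONCE(krcp->count, 0) when a batch is queued. */
static void sketch_drain(int cpu)
{
	atomic_store_explicit(&krc_sketch[cpu].count, 0, memory_order_relaxed);
}

/*
 * Reader side, analogous to summing READ_ONCE(krcp->count) over all CPUs.
 * No lock is taken, so the sum may be slightly stale if writers run
 * concurrently; each individual load is still untorn.
 */
static unsigned long sketch_shrink_count(void)
{
	unsigned long count = 0;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		count += atomic_load_explicit(&krc_sketch[cpu].count,
					      memory_order_relaxed);

	return count;
}
```

The reader tolerates a slightly stale total in exchange for never
contending on a per-CPU lock, which is the tradeoff described in the
commit log above.
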
kernel/rcu/tree.c | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 05dcbf8..aef587e 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2939,7 +2939,7 @@ static inline bool queue_kfree_rcu_work(struct kfree_rcu_cpu *krcp)
krcp->head = NULL;
}

- krcp->count = 0;
+ WRITE_ONCE(krcp->count, 0);

/*
* One work is per one batch, so there are two "free channels",
@@ -3077,7 +3077,7 @@ void kfree_call_rcu(struct rcu_head *head, rcu_callback_t func)
krcp->head = head;
}

- krcp->count++;
+ WRITE_ONCE(krcp->count, krcp->count + 1);

// Set timer to drain after KFREE_DRAIN_JIFFIES.
if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
@@ -3097,15 +3097,13 @@ static unsigned long
kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
{
int cpu;
- unsigned long flags, count = 0;
+ unsigned long count = 0;

/* Snapshot count of all CPUs */
for_each_online_cpu(cpu) {
struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);

- spin_lock_irqsave(&krcp->lock, flags);
- count += krcp->count;
- spin_unlock_irqrestore(&krcp->lock, flags);
+ count += READ_ONCE(krcp->count);
}

return count;
--
2.9.5