[PATCH v2] rcu-tasks: Make rude RCU-Tasks work well with CPU hotplug

From: Zqiang
Date: Mon Nov 28 2022 - 09:29:01 EST


Currently, rcu_tasks_rude_wait_gp() is invoked to wait for one rude
RCU-tasks grace period. If __num_online_cpus == 1, it returns
immediately, which indicates the end of the rude RCU-tasks grace period.
Suppose the system has two CPUs and consider the following scenario:

        CPU0                                   CPU1 (going offline)
                                          migration/1 task:
                                          cpu_stopper_thread
                                           -> take_cpu_down
                                              -> _cpu_disable
                                                 (dec __num_online_cpus)
                                              -> cpuhp_invoke_callback
                                                    preempt_disable
                                                    access old_data0
        task1
 del old_data0                                      .....
 synchronize_rcu_tasks_rude()
 task1 schedule out
 ....
 task2 schedule in
 rcu_tasks_rude_wait_gp()
     -> __num_online_cpus == 1
       -> return
 ....
 task1 schedule in
 -> free old_data0
                                                    preempt_enable

When CPU1 decrements __num_online_cpus and __num_online_cpus becomes
one, CPU1 has not yet finished going offline. The stop_machine task
(migration/1) is still running on CPU1 and may still be accessing
'old_data0', but 'old_data0' has already been freed on CPU0.

This commit adds cpus_read_lock()/cpus_read_unlock() protection around
the access to __num_online_cpus, which ensures that any CPU in the
process of going offline has completed the offline operation before
the count is checked.

Signed-off-by: Zqiang <qiang1.zhang@xxxxxxxxx>
---
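Note: the locking pattern described above can be modeled in userspace.
The following is only a rough sketch, assuming a pthread rwlock as a
stand-in for the CPU-hotplug lock. The names hotplug_lock,
cpu1_offline(), wait_gp_sees_busy() and run_scenario() are hypothetical
and are not kernel APIs. The 50 ms sleep is an arbitrary stand-in for
the offline callbacks that are still running.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <time.h>

/* Hypothetical userspace stand-ins; none of these names are kernel APIs. */
static pthread_rwlock_t hotplug_lock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_int online_cpus;      /* models __num_online_cpus */
static atomic_int cpu1_busy;        /* 1 while "CPU1" still runs offline work */
static atomic_int offline_started;  /* set once the count has been decremented */

/* Models the offline path: the count drops before the CPU is really done. */
static void *cpu1_offline(void *arg)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 50 * 1000 * 1000 };

	(void)arg;
	pthread_rwlock_wrlock(&hotplug_lock);
	atomic_store(&cpu1_busy, 1);
	atomic_fetch_sub(&online_cpus, 1);
	atomic_store(&offline_started, 1);
	nanosleep(&ts, NULL);           /* still "accessing old_data0" */
	atomic_store(&cpu1_busy, 0);
	pthread_rwlock_unlock(&hotplug_lock);
	return NULL;
}

/*
 * Models the single-CPU fastpath check. Returns 1 if the fastpath would be
 * taken while "CPU1" is still busy, which is the problematic case.
 */
static int wait_gp_sees_busy(int use_lock)
{
	int busy;

	if (use_lock)
		pthread_rwlock_rdlock(&hotplug_lock);
	busy = atomic_load(&online_cpus) <= 1 && atomic_load(&cpu1_busy);
	if (use_lock)
		pthread_rwlock_unlock(&hotplug_lock);
	return busy;
}

/* Runs one offline operation concurrently with one fastpath check. */
int run_scenario(int use_lock)
{
	pthread_t t;
	int rc, busy;

	atomic_store(&online_cpus, 2);
	atomic_store(&cpu1_busy, 0);
	atomic_store(&offline_started, 0);

	rc = pthread_create(&t, NULL, cpu1_offline, NULL);
	assert(rc == 0);
	(void)rc;

	while (!atomic_load(&offline_started))
		sched_yield();          /* wait until the count has dropped */

	busy = wait_gp_sees_busy(use_lock);
	pthread_join(t, NULL);
	return busy;
}
```

With use_lock set to 1, run_scenario() should return 0, because the
read lock cannot be acquired until the offline path has released the
write lock. With use_lock set to 0, the check will usually observe the
busy state, although that depends on timing. Build with -pthread.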
kernel/rcu/tasks.h | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
index 4a991311be9b..08e72c6462d8 100644
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1033,14 +1033,30 @@ static void rcu_tasks_be_rude(struct work_struct *work)
 {
 }

+static DEFINE_PER_CPU(struct work_struct, rude_work);
+
 // Wait for one rude RCU-tasks grace period.
 static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
 {
+	int cpu;
+	struct work_struct *work;
+
+	cpus_read_lock();
 	if (num_online_cpus() <= 1)
-		return;	// Fastpath for only one CPU.
+		goto end;// Fastpath for only one CPU.

 	rtp->n_ipis += cpumask_weight(cpu_online_mask);
-	schedule_on_each_cpu(rcu_tasks_be_rude);
+	for_each_online_cpu(cpu) {
+		work = per_cpu_ptr(&rude_work, cpu);
+		INIT_WORK(work, rcu_tasks_be_rude);
+		schedule_work_on(cpu, work);
+	}
+
+	for_each_online_cpu(cpu)
+		flush_work(per_cpu_ptr(&rude_work, cpu));
+
+end:
+	cpus_read_unlock();
 }

 void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);
--
2.25.1