[PATCH] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()

From: Vincent Donnefort
Date: Thu Nov 04 2021 - 13:52:59 EST


Nothing protects the access to the per_cpu variable sd_llc_id. When testing
the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
update_top_cache_domain(). One scenario being:

CPU1 CPU2
==================================================================

per_cpu(sd_llc_id, CPUX) => 0
partition_sched_domains_locked()
detach_destroy_domains()
cpus_share_cache(CPUX, CPUX) update_top_cache_domain(CPUX)
per_cpu(sd_llc_id, CPUX) => 0
per_cpu(sd_llc_id, CPUX) = CPUX
per_cpu(sd_llc_id, CPUX) => CPUX
return false

ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
is a warning triggered from ttwu_queue_wakelist().

Avoid a such race in cpus_share_cache() by always returning true when
this_cpu == that_cpu.

Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
Reported-by: Jing-Ting Wu <jing-ting.wu@xxxxxxxxxxxx>
Signed-off-by: Vincent Donnefort <vincent.donnefort@xxxxxxx>
Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index f2611b9cf503..f5ca15cdcff4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3726,6 +3726,9 @@ void wake_up_if_idle(int cpu)

bool cpus_share_cache(int this_cpu, int that_cpu)
{
+ if (this_cpu == that_cpu)
+ return true;
+
return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
}

--
2.25.1