[tip: sched/urgent] sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()

From: tip-bot2 for Vincent Donnefort
Date: Thu Nov 11 2021 - 07:22:40 EST


The following commit has been merged into the sched/urgent branch of tip:

Commit-ID: 42dc938a590c96eeb429e1830123fef2366d9c80
Gitweb: https://git.kernel.org/tip/42dc938a590c96eeb429e1830123fef2366d9c80
Author: Vincent Donnefort <vincent.donnefort@xxxxxxx>
AuthorDate: Thu, 04 Nov 2021 17:51:20
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Thu, 11 Nov 2021 13:09:32 +01:00

sched/core: Mitigate race cpus_share_cache()/update_top_cache_domain()

Nothing protects the access to the per_cpu variable sd_llc_id. When testing
the same CPU (i.e. this_cpu == that_cpu), a race condition exists with
update_top_cache_domain(). One scenario being:

              CPU1                                   CPU2
  ==================================================================

  per_cpu(sd_llc_id, CPUX) => 0
                                    partition_sched_domains_locked()
                                      detach_destroy_domains()
  cpus_share_cache(CPUX, CPUX)          update_top_cache_domain(CPUX)
    per_cpu(sd_llc_id, CPUX) => 0
                                          per_cpu(sd_llc_id, CPUX) = CPUX
    per_cpu(sd_llc_id, CPUX) => CPUX
    return false

ttwu_queue_cond() wouldn't catch smp_processor_id() == cpu and the result
is a warning triggered from ttwu_queue_wakelist().

Avoid such a race in cpus_share_cache() by always returning true when
this_cpu == that_cpu.

Fixes: 518cd6234178 ("sched: Only queue remote wakeups when crossing cache boundaries")
Reported-by: Jing-Ting Wu <jing-ting.wu@xxxxxxxxxxxx>
Signed-off-by: Vincent Donnefort <vincent.donnefort@xxxxxxx>
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Reviewed-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Link: https://lore.kernel.org/r/20211104175120.857087-1-vincent.donnefort@xxxxxxx
---
kernel/sched/core.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 523fd60..cec173a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3726,6 +3726,9 @@ out:

 bool cpus_share_cache(int this_cpu, int that_cpu)
 {
+	if (this_cpu == that_cpu)
+		return true;
+
 	return per_cpu(sd_llc_id, this_cpu) == per_cpu(sd_llc_id, that_cpu);
 }
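
For readers who want to replay the scenario from the changelog outside the kernel, here is a minimal single-threaded user-space sketch. It is not kernel code: the plain array standing in for the per-CPU sd_llc_id, the NR_CPUS value, the between_reads hook, and the names cpus_share_cache_old(), cpus_share_cache_new(), replay() and self_check() are all assumptions made for illustration. The hook fires between the two reads to model update_top_cache_domain() running on another CPU at the worst possible moment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4	/* hypothetical CPU count for the model */

/* Stand-in for the kernel's per-CPU sd_llc_id variable. */
static int sd_llc_id[NR_CPUS];

/* Hypothetical hook run between the two reads of sd_llc_id. */
static void (*between_reads)(void);

/* Models update_top_cache_domain(CPUX) for CPUX = 1 after a detach. */
static void concurrent_update_cpu1(void)
{
	sd_llc_id[1] = 1;	/* per_cpu(sd_llc_id, CPUX) = CPUX */
}

/* Pre-fix logic: two independent, unsynchronized reads. */
static bool cpus_share_cache_old(int this_cpu, int that_cpu)
{
	int first = sd_llc_id[this_cpu];

	if (between_reads)
		between_reads();

	return first == sd_llc_id[that_cpu];
}

/* Post-fix logic: a CPU always shares cache with itself. */
static bool cpus_share_cache_new(int this_cpu, int that_cpu)
{
	if (this_cpu == that_cpu)
		return true;

	return cpus_share_cache_old(this_cpu, that_cpu);
}

/* Replays the changelog scenario for CPUX = 1 and returns fn's answer. */
static bool replay(bool (*fn)(int, int))
{
	bool ret;

	sd_llc_id[1] = 0;	/* old LLC id: CPU1 shared cache with CPU0 */
	between_reads = concurrent_update_cpu1;
	ret = fn(1, 1);
	between_reads = NULL;

	return ret;
}

/* The old logic reports a false negative; the new logic does not. */
static void self_check(void)
{
	assert(!replay(cpus_share_cache_old));
	assert(replay(cpus_share_cache_new));
}
```

Calling self_check() from a main() exercises both variants. The early return does not add any synchronization around sd_llc_id; it only makes the same-CPU answer independent of the two reads, which matches the "mitigate" wording of the subject line.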