[PATCH] sched/topology: Optimize copying of the default topology in sched_init_numa()

From: Hao Jia
Date: Mon Jun 27 2022 - 06:54:15 EST


The size of struct sched_domain_topology_level is 64 bytes.
Almost all NUMA platforms are multi-core (CONFIG_SCHED_MC enabled),
so the default_topology array has at least 128 bytes that need to be
copied in sched_init_numa(). Most x86 platforms also enable
CONFIG_SCHED_SMT, so even more bytes have to be copied.

memcpy() has optimized implementations on different architectures,
and platforms with CONFIG_NUMA enabled are likely to benefit from
these optimizations.
So use memcpy() to copy the default topology in sched_init_numa().
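
A minimal userspace sketch of the two copy strategies compared here.
struct topo_level is a hypothetical stand-in for
struct sched_domain_topology_level (its fields and layout are assumptions,
not the kernel definition), and all helper names are made up for
illustration. It shows that once the number of levels is known, a single
memcpy() of levels * sizeof(entry) bytes gives the same result as the
element-by-element loop:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for struct sched_domain_topology_level.
 * Field names and layout are assumptions for illustration only.
 */
struct topo_level {
	const void *(*mask)(int cpu);
	int (*sd_flags)(void);
	int flags;
	int numa_level;
	void *data[4];
	const char *name;
};

/* Placeholder mask callback; only its non-NULL address matters here. */
static const void *dummy_mask(int cpu)
{
	(void)cpu;
	return NULL;
}

/* Count the entries that precede the NULL-mask terminator. */
static size_t topo_count(const struct topo_level *src)
{
	size_t n = 0;

	while (src[n].mask)
		n++;
	return n;
}

/* Element-by-element copy; returns the number of entries copied. */
static size_t topo_copy_loop(struct topo_level *dst,
			     const struct topo_level *src)
{
	size_t i;

	for (i = 0; src[i].mask; i++)
		dst[i] = src[i];
	return i;
}

/* One memcpy() of n entries; returns the number of bytes copied. */
static size_t topo_copy_memcpy(struct topo_level *dst,
			       const struct topo_level *src, size_t n)
{
	memcpy(dst, src, sizeof(*src) * n);
	return sizeof(*src) * n;
}
```

The memcpy() variant relies on the caller already knowing the entry count,
which is why topo_count() is kept separate from the copy itself.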

Tests were run on an Intel Xeon(R) Platinum 8260 CPU @ 2.40GHz machine
with 2 NUMA nodes, each with 24 cores and SMT2 enabled, for 96 CPUs
in total.

Time is measured with RDTSC, and the baseline is 5.19-rc4.
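
The measurement code itself is not part of this patch. As a rough
userspace sketch of averaging TSC deltas over several runs, assuming
GCC or Clang on x86 (the __rdtsc() intrinsic from <x86intrin.h>), with a
wall-clock fallback elsewhere; average_ticks() and bump_counter() are
hypothetical names, and no serializing instruction is used, so the numbers
are only approximate:

```c
#include <stdint.h>
#include <time.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* Read the time-stamp counter via the compiler intrinsic. */
static uint64_t read_ticks(void)
{
	return __rdtsc();
}
#else
/* Assumption: on non-x86 builds, fall back to wall-clock nanoseconds. */
static uint64_t read_ticks(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}
#endif

/* Hypothetical helper: average tick count of fn(arg) over 'runs' calls. */
static double average_ticks(void (*fn)(void *), void *arg, unsigned int runs)
{
	uint64_t total = 0;
	unsigned int r;

	for (r = 0; r < runs; r++) {
		uint64_t start = read_ticks();

		fn(arg);
		total += read_ticks() - start;
	}
	return runs ? (double)total / runs : 0.0;
}

/* Trivial workload used only to exercise the harness. */
static void bump_counter(void *arg)
{
	(*(unsigned int *)arg)++;
}
```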

With CONFIG_SCHED_SMT && CONFIG_SCHED_CLUSTER && CONFIG_SCHED_MC enabled,
the default_topology array has 256 bytes that need to be copied
in sched_init_numa():

                     5.19-rc4    5.19-rc4 with patch
average tsc ticks    516.57      85.33 (-83.48%*)

With CONFIG_SCHED_MC enabled, the default_topology array has
128 bytes that need to be copied in sched_init_numa():

                     5.19-rc4    5.19-rc4 with patch
average tsc ticks    65.71       55.00 (-16.30%*)
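
The percentages reported above are consistent with the usual relative
change, (after - before) / before. A small sketch of that arithmetic;
percent_change() is a hypothetical helper name:

```c
/* Relative change from 'before' to 'after', in percent. */
static double percent_change(double before, double after)
{
	return (after - before) / before * 100.0;
}
```

Called with (516.57, 85.33) and (65.71, 55.00), it gives roughly -83.48
and -16.30 respectively.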

Signed-off-by: Hao Jia <jiahao.os@xxxxxxxxxxxxx>
---
kernel/sched/topology.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 05b6c2ad90b9..c6f497d263cd 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1918,8 +1918,7 @@ void sched_init_numa(int offline_node)
 	/*
 	 * Copy the default topology bits..
 	 */
-	for (i = 0; sched_domain_topology[i].mask; i++)
-		tl[i] = sched_domain_topology[i];
+	memcpy(tl, sched_domain_topology, sizeof(struct sched_domain_topology_level) * i);

 	/*
 	 * Add the NUMA identity distance, aka single NODE.
--
2.32.0