Re: [RFC PATCH v3 0/3] sched: Skip queued wakeups only when L2 is shared

From: Swapnil Sapkal
Date: Fri Aug 25 2023 - 06:12:28 EST


Hello Mathieu,

On 8/22/2023 5:01 PM, Mathieu Desnoyers wrote:
> This series improves performance of scheduler wakeups on large systems
> by skipping queued wakeups only when CPUs share their L2 cache, rather
> than when they share their LLC.
>
> The speedup mainly reproduces on workloads which have at least *some*
> idle time (because it significantly increases the number of migrations,
> and thus remote wakeups), *and* it needs to have a sufficient load to
> cause contention on the runqueue locks.
>
> Feedback is welcome,
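
For readers following along, here is a minimal user-space sketch of the
decision described above. It is only a toy model, not the kernel code:
the Cpu class and its l2_id/llc_id fields are hypothetical, and the
other conditions the real ttwu_queue_cond() checks are ignored. The
helper names cpus_share_llc and cpus_share_l2c are borrowed from the
patch titles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    cpu_id: int
    l2_id: int   # hypothetical identifier of the L2 cache domain
    llc_id: int  # hypothetical identifier of the last-level cache domain


def cpus_share_llc(a: Cpu, b: Cpu) -> bool:
    return a.llc_id == b.llc_id


def cpus_share_l2c(a: Cpu, b: Cpu) -> bool:
    return a.l2_id == b.l2_id


def queue_wakeup_llc(waker: Cpu, target: Cpu) -> bool:
    """Toy model of the old rule: skip the queued wakeup when the LLC is shared."""
    return not cpus_share_llc(waker, target)


def queue_wakeup_l2(waker: Cpu, target: Cpu) -> bool:
    """Toy model of the new rule: skip the queued wakeup only when L2 is shared."""
    return not cpus_share_l2c(waker, target)


if __name__ == "__main__":
    waker = Cpu(cpu_id=0, l2_id=0, llc_id=0)
    same_llc_other_l2 = Cpu(cpu_id=1, l2_id=1, llc_id=0)
    # Under the old rule this wakeup is not queued; under the new rule it is.
    print(queue_wakeup_llc(waker, same_llc_other_l2))
    print(queue_wakeup_l2(waker, same_llc_other_l2))
```

In this model the two rules differ only for CPU pairs that share an LLC
but not an L2, which is where the series changes behaviour.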

I ran some micro-benchmarks as part of testing this series. Here are the
observations:

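- Hackbench shows improvement with this series and with Aaron's patch,
  using the 6.5-rc1 kernel as the baseline (up to 30.65 pct with this
  series alone and 33.74 pct with both applied, at 16 groups).

- tbench and netperf show some dip in performance in the highly
  overloaded cases, mostly with both patch sets applied (up to
  -14.38 pct for tbench at 2048 clients and -24.84 pct for netperf at
  512 clients).

- Other micro-benchmarks show more or less similar performance with
  these patches.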

o System Details

- 4th Generation EPYC System
- 2 x 128C/256T
- NPS1 mode

o Kernels

base: 6.5.0-rc1
base + mathieu-queued-wakeup: 6.5.0-rc1 + Mathieu's patches [1]
base + aaron-tg-load-avg: 6.5.0-rc1 + Aaron's patch [2]
base + queued-wakeup + tg-load-avg: 6.5.0-rc1 + Mathieu's patches [1] + Aaron's patch [2]

[References]

[1] "sched: Skip queued wakeups only when L2 is shared"
(https://lore.kernel.org/all/20230822113133.643238-1-mathieu.desnoyers@xxxxxxxxxxxx/)
[2] "Reduce cost of accessing tg->load_avg"
(https://lore.kernel.org/lkml/20230823060832.454842-1-aaron.lu@xxxxxxxxx/)

==================================================================
Test : hackbench
Units : Time in seconds
Interpretation: Lower is better
Statistic : AMean
==================================================================
Test:         6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
1-groups:     22.15 (0.00 pct)        22.46 (-1.39 pct)               22.35 (-0.90 pct)           21.20 (4.28 pct)
2-groups:     22.76 (0.00 pct)        21.78 (4.30 pct)                22.60 (0.70 pct)            21.90 (3.77 pct)
4-groups:     22.12 (0.00 pct)        22.02 (0.45 pct)                22.22 (-0.45 pct)           21.94 (0.81 pct)
8-groups:     24.80 (0.00 pct)        22.36 (9.83 pct)                22.99 (7.29 pct)            22.00 (11.29 pct)
16-groups:    31.09 (0.00 pct)        21.56 (30.65 pct)               22.13 (28.81 pct)           20.60 (33.74 pct)
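
The figures in parentheses are percentage changes relative to the base
kernel, signed so that a positive number is an improvement. A minimal
sketch of that convention, assuming this is how the numbers were
derived (the function name is hypothetical, and the reporting script
itself is not shown in this mail):

```python
def pct_vs_base(base: float, value: float, lower_is_better: bool) -> float:
    """Percent change versus base, signed so that positive means improvement."""
    delta = (base - value) if lower_is_better else (value - base)
    return 100.0 * delta / base


if __name__ == "__main__":
    # hackbench, 16 groups, base vs. base + mathieu-queued-wakeup (time, lower is better)
    print(pct_vs_base(31.09, 21.56, lower_is_better=True))
    # tbench, 128 clients, same two kernels (throughput, higher is better)
    print(pct_vs_base(27275.73, 25176.80, lower_is_better=False))
```

Results computed this way can differ from the table in the last digit,
since the table values are themselves rounded means.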

==================================================================
Test : tbench
Units : Throughput
Interpretation: Higher is better
Statistic : AMean
==================================================================
Clients:      6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
1             261.49 (0.00 pct)       261.18 (-0.11 pct)              262.29 (0.30 pct)           257.80 (-1.41 pct)
2             514.08 (0.00 pct)       521.30 (1.40 pct)               517.66 (0.69 pct)           510.96 (-0.60 pct)
4             1002.51 (0.00 pct)      988.81 (-1.36 pct)              995.04 (-0.74 pct)          987.74 (-1.47 pct)
8             1978.74 (0.00 pct)      1966.60 (-0.61 pct)             1991.85 (0.66 pct)          1941.39 (-1.88 pct)
16            3864.14 (0.00 pct)      3952.03 (2.27 pct)              3914.80 (1.31 pct)          3873.88 (0.25 pct)
32            7473.19 (0.00 pct)      7602.38 (1.72 pct)              7585.94 (1.50 pct)          7423.44 (-0.66 pct)
64            14335.10 (0.00 pct)     14313.17 (-0.15 pct)            14474.67 (0.97 pct)         14030.63 (-2.12 pct)
128           27275.73 (0.00 pct)     25176.80 (-7.69 pct)            28066.53 (2.89 pct)         25045.53 (-8.17 pct)
256           41688.17 (0.00 pct)     44373.40 (6.44 pct)             43779.37 (5.01 pct)         41427.00 (-0.62 pct)
512           137481.33 (0.00 pct)    136466.67 (-0.73 pct)           134824.00 (-1.93 pct)       141280.00 (2.76 pct)
1024          140534.00 (0.00 pct)    141916.33 (0.98 pct)            137008.33 (-2.50 pct)       126319.33 (-10.11 pct)
2048          145378.00 (0.00 pct)    145479.33 (0.06 pct)            138763.67 (-4.54 pct)       124471.00 (-14.38 pct)

==================================================================
Test : netperf
Units : Throughput
Interpretation: Higher is better
Statistic : AMean
==================================================================
Clients:      6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
1-clients:    59642.88 (0.00 pct)     61647.37 (3.36 pct)             61186.24 (2.58 pct)         59099.11 (-0.91 pct)
2-clients:    59349.65 (0.00 pct)     60896.01 (2.60 pct)             60582.49 (2.07 pct)         62738.47 (5.70 pct)
4-clients:    59197.37 (0.00 pct)     60457.29 (2.12 pct)             63042.52 (6.49 pct)         60879.58 (2.84 pct)
8-clients:    61977.66 (0.00 pct)     60389.92 (-2.56 pct)            62078.15 (0.16 pct)         60314.65 (-2.68 pct)
16-clients:   61518.83 (0.00 pct)     61143.51 (-0.61 pct)            60946.08 (-0.93 pct)        59388.78 (-3.46 pct)
32-clients:   58230.81 (0.00 pct)     58653.20 (0.72 pct)             58594.14 (0.62 pct)         58188.52 (-0.07 pct)
64-clients:   58050.92 (0.00 pct)     57834.55 (-0.37 pct)            58183.51 (0.22 pct)         57565.75 (-0.83 pct)
128-clients:  54324.55 (0.00 pct)     54385.60 (0.11 pct)             54913.43 (1.08 pct)         53917.11 (-0.75 pct)
256-clients:  70155.29 (0.00 pct)     69390.68 (-1.08 pct)            70097.50 (-0.08 pct)        64410.66 (-8.18 pct)
512-clients:  61511.77 (0.00 pct)     61480.99 (-0.05 pct)            54493.82 (-11.40 pct)       46227.05 (-24.84 pct)

==================================================================
Test : stream-10
Units : Bandwidth, MB/s
Interpretation: Higher is better
Statistic : HMean
==================================================================
Test:         6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
Copy:         353336.76 (0.00 pct)    352956.36 (-0.10 pct)           349583.67 (-1.06 pct)       351152.80 (-0.61 pct)
Scale:        353474.88 (0.00 pct)    354582.35 (0.31 pct)            350543.75 (-0.82 pct)       353275.74 (-0.05 pct)
Add:          371984.24 (0.00 pct)    372824.87 (0.22 pct)            369173.72 (-0.75 pct)       370483.63 (-0.40 pct)
Triad:        372625.41 (0.00 pct)    278389.62 (-25.28 pct)          369504.06 (-0.83 pct)       369070.11 (-0.95 pct)

==================================================================
Test : stream-100
Units : Bandwidth, MB/s
Interpretation: Higher is better
Statistic : HMean
==================================================================
Test:         6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
Copy:         353476.35 (0.00 pct)    354954.50 (0.41 pct)            354614.56 (0.32 pct)        353512.71 (0.01 pct)
Scale:        353214.73 (0.00 pct)    354884.12 (0.47 pct)            355841.17 (0.74 pct)        353220.53 (0.00 pct)
Add:          370755.48 (0.00 pct)    372292.72 (0.41 pct)            375307.35 (1.22 pct)        369917.77 (-0.22 pct)
Triad:        370652.02 (0.00 pct)    372732.11 (0.56 pct)            375718.85 (1.36 pct)        369926.26 (-0.19 pct)
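
The stream tables report HMean (harmonic mean), while hackbench, tbench
and netperf report AMean (arithmetic mean). A small sketch of the
difference, using made-up per-run bandwidth samples (the values below
are hypothetical and not taken from these runs):

```python
from statistics import harmonic_mean, mean

# Hypothetical per-run bandwidth samples in MB/s, for illustration only.
samples = [350000.0, 360000.0, 370000.0]

amean = mean(samples)           # arithmetic mean: sum / count
hmean = harmonic_mean(samples)  # harmonic mean: count / sum of reciprocals

if __name__ == "__main__":
    print(f"AMean = {amean:.2f}, HMean = {hmean:.2f}")
```

The harmonic mean is never larger than the arithmetic mean and is
pulled towards the slower runs, which is why it is a common choice for
rate-like metrics such as bandwidth.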

==================================================================
Test : schbench (old)
Units : 99th percentile latency in us
Interpretation: Lower is better
Statistic : Median
==================================================================
#workers:     6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
1:            56.00 (0.00 pct)        58.00 (-3.57 pct)               60.00 (-7.14 pct)           60.00 (-7.14 pct)
2:            61.00 (0.00 pct)        56.00 (8.19 pct)                59.00 (3.27 pct)            60.00 (1.63 pct)
4:            64.00 (0.00 pct)        62.00 (3.12 pct)                66.00 (-3.12 pct)           64.00 (0.00 pct)
8:            96.00 (0.00 pct)        78.00 (18.75 pct)               76.00 (20.83 pct)           93.00 (3.12 pct)
16:           98.00 (0.00 pct)        95.00 (3.06 pct)                98.00 (0.00 pct)            95.00 (3.06 pct)
32:           137.00 (0.00 pct)       144.00 (-5.10 pct)              133.00 (2.91 pct)           130.00 (5.10 pct)
64:           206.00 (0.00 pct)       210.00 (-1.94 pct)              200.00 (2.91 pct)           217.00 (-5.33 pct)
128:          348.00 (0.00 pct)       347.00 (0.28 pct)               413.00 (-18.67 pct)         366.00 (-5.17 pct)
256:          679.00 (0.00 pct)       669.00 (1.47 pct)               669.00 (1.47 pct)           675.00 (0.58 pct)
512:          1366.00 (0.00 pct)      1366.00 (0.00 pct)              1442.00 (-5.56 pct)         1430.00 (-4.68 pct)
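
The schbench numbers are the median, across runs, of each run's 99th
percentile latency. A rough sketch of that aggregation, assuming a
nearest-rank percentile (how schbench computes its percentile
internally is not shown here, and the sample data is hypothetical):

```python
from statistics import median


def p99_nearest_rank(samples):
    """99th percentile by the nearest-rank rule (an assumed definition)."""
    ordered = sorted(samples)
    n = len(ordered)
    rank = (99 * n + 99) // 100  # ceil(0.99 * n) in integer arithmetic, 1-based
    return ordered[rank - 1]


# Hypothetical latency samples in us for three runs, for illustration only.
runs = [
    list(range(1, 101)),
    list(range(1, 201)),
    list(range(11, 111)),
]

per_run_p99 = [p99_nearest_rank(run) for run in runs]
reported = median(per_run_p99)  # one number per (kernel, #workers) table cell

if __name__ == "__main__":
    print(per_run_p99, reported)
```

Taking the median across runs keeps a single noisy run from dominating
the reported tail latency.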


==================================================================
Test : schbench (new)
Units : 99th percentile latency in us
Interpretation: Lower is better
Statistic : Median
==================================================================
Metric: wakeup_lat_summary
#workers:     6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
1:            15.00 (0.00 pct)        15.00 (0.00 pct)                16.00 (-6.66 pct)           17.00 (-13.33 pct)
2:            16.00 (0.00 pct)        16.00 (0.00 pct)                17.00 (-6.25 pct)           17.00 (-6.25 pct)
4:            17.00 (0.00 pct)        17.00 (0.00 pct)                15.00 (11.76 pct)           17.00 (0.00 pct)
8:            11.00 (0.00 pct)        13.00 (-18.18 pct)              11.00 (0.00 pct)            11.00 (0.00 pct)
16:           11.00 (0.00 pct)        11.00 (0.00 pct)                10.00 (9.09 pct)            9.00 (18.18 pct)
32:           11.00 (0.00 pct)        11.00 (0.00 pct)                11.00 (0.00 pct)            11.00 (0.00 pct)
64:           10.00 (0.00 pct)        11.00 (-10.00 pct)              10.00 (0.00 pct)            10.00 (0.00 pct)
128:          11.00 (0.00 pct)        12.00 (-9.09 pct)               12.00 (-9.09 pct)           11.00 (0.00 pct)
256:          117.00 (0.00 pct)       162.00 (-38.46 pct)             90.00 (23.07 pct)           103.00 (11.96 pct)
512:          22496.00 (0.00 pct)     21664.00 (3.69 pct)             22368.00 (0.56 pct)         21408.00 (4.83 pct)

Metric: request_lat_summary
#workers:     6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
1:            6872.00 (0.00 pct)      6872.00 (0.00 pct)              6792.00 (1.16 pct)          6856.00 (0.23 pct)
2:            6824.00 (0.00 pct)      6824.00 (0.00 pct)              6872.00 (-0.70 pct)         6856.00 (-0.46 pct)
4:            6824.00 (0.00 pct)      6808.00 (0.23 pct)              6872.00 (-0.70 pct)         6824.00 (0.00 pct)
8:            6824.00 (0.00 pct)      6824.00 (0.00 pct)              6872.00 (-0.70 pct)         6824.00 (0.00 pct)
16:           6824.00 (0.00 pct)      6840.00 (-0.23 pct)             6872.00 (-0.70 pct)         6840.00 (-0.23 pct)
32:           6840.00 (0.00 pct)      6840.00 (0.00 pct)              6888.00 (-0.70 pct)         6856.00 (-0.23 pct)
64:           6840.00 (0.00 pct)      6872.00 (-0.46 pct)             6888.00 (-0.70 pct)         6872.00 (-0.46 pct)
128:          12272.00 (0.00 pct)     12784.00 (-4.17 pct)            13200.00 (-7.56 pct)        12016.00 (2.08 pct)
256:          13328.00 (0.00 pct)     13392.00 (-0.48 pct)            13712.00 (-2.88 pct)        13552.00 (-1.68 pct)
512:          88832.00 (0.00 pct)     86400.00 (2.73 pct)             88192.00 (0.72 pct)         85632.00 (3.60 pct)

Metric: rps_summary
#workers:     6.5.0-rc1 (base)        base + mathieu-queued-wakeup    base + aaron-tg-load-avg    base + queued-wakeup + tg-load-avg
1:            297.00 (0.00 pct)       297.00 (0.00 pct)               297.00 (0.00 pct)           299.00 (-0.67 pct)
2:            601.00 (0.00 pct)       603.00 (-0.33 pct)              595.00 (0.99 pct)           601.00 (0.00 pct)
4:            1206.00 (0.00 pct)      1206.00 (0.00 pct)              1190.00 (1.32 pct)          1206.00 (0.00 pct)
8:            2412.00 (0.00 pct)      2412.00 (0.00 pct)              2396.00 (0.66 pct)          2420.00 (-0.33 pct)
16:           4840.00 (0.00 pct)      4824.00 (0.33 pct)              4792.00 (0.99 pct)          4840.00 (0.00 pct)
32:           9648.00 (0.00 pct)      9648.00 (0.00 pct)              9584.00 (0.66 pct)          9680.00 (-0.33 pct)
64:           19360.00 (0.00 pct)     19296.00 (0.33 pct)             19168.00 (0.99 pct)         19296.00 (0.33 pct)
128:          37952.00 (0.00 pct)     35264.00 (7.08 pct)             36672.00 (3.37 pct)         38080.00 (-0.33 pct)
256:          41408.00 (0.00 pct)     41536.00 (-0.30 pct)            39744.00 (4.01 pct)         40896.00 (1.23 pct)
512:          36288.00 (0.00 pct)     36800.00 (-1.41 pct)            35264.00 (2.82 pct)         35776.00 (1.41 pct)

Tested-by: Swapnil Sapkal <Swapnil.Sapkal@xxxxxxx>


> Thanks,
>
> Mathieu
>
> Mathieu Desnoyers (3):
>   sched: Rename cpus_share_cache to cpus_share_llc
>   sched: Introduce cpus_share_l2c (v3)
>   sched: ttwu_queue_cond: skip queued wakeups across different l2 caches
>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Valentin Schneider <vschneid@xxxxxxxxxx>
> Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
> Cc: Ben Segall <bsegall@xxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
> Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> Cc: Juri Lelli <juri.lelli@xxxxxxxxxx>
> Cc: Swapnil Sapkal <Swapnil.Sapkal@xxxxxxx>
> Cc: Aaron Lu <aaron.lu@xxxxxxxxx>
> Cc: Julien Desfossez <jdesfossez@xxxxxxxxxxxxxxxx>
> Cc: x86@xxxxxxxxxx
>
>  block/blk-mq.c                 |  2 +-
>  include/linux/sched/topology.h | 10 ++++++++--
>  kernel/sched/core.c            | 14 +++++++++++---
>  kernel/sched/fair.c            |  8 ++++----
>  kernel/sched/sched.h           |  2 ++
>  kernel/sched/topology.c        | 32 +++++++++++++++++++++++++++++---
>  6 files changed, 55 insertions(+), 13 deletions(-)

--
Thanks and Regards,
Swapnil