[PATCH v2 0/4] sched/eevdf: Optimize reweight and pick

From: Abel Wu
Date: Tue Nov 14 2023 - 22:37:19 EST


v1 -> v2:
- Removed unrelated hunks in 2nd patch.

---------

This patchset makes these contributions:

[1/4] Fix vruntime not being adjusted when an entity is
reweighted at a !0-lag point.

[2/4] Optimize out the fallback search on @best_left, which
doubles the cost in the worst case.

[3/4] Enable O(1) fastpath picking based on a deadline-sorted,
leftmost-cached rbtree.

[4/4] Statistics for patch 3, not intended for upstream.
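
For illustration only, the ideas behind patches 1 and 3 can be sketched
in plain C. This is a minimal sketch, not the code in the patches. It
assumes lag is defined as w * (V - v), where V is the queue's weighted
average vruntime, so keeping lag constant across a weight change
w -> w' gives v' = V - (V - v) * w / w'. The struct, its field names and
the helper names are hypothetical, a deadline-sorted array stands in for
the rbtree, and overflow and rounding are ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entity; field names are illustrative, not the kernel's. */
struct entity {
	int64_t vruntime;	/* v */
	int64_t deadline;	/* virtual deadline */
	int64_t weight;		/* w */
};

/*
 * Assumed relation: lag = w * (V - v) must not change on reweight, so
 * v' = V - (V - v) * w / w'. At the 0-lag point (v == V) nothing moves.
 */
static int64_t reweight_vruntime(int64_t V, const struct entity *se,
				 int64_t new_weight)
{
	int64_t vlag = V - se->vruntime;	/* virtual lag, old weight */

	return V - vlag * se->weight / new_weight;
}

/* An entity is eligible when its lag is non-negative, i.e. v <= V. */
static int eligible(int64_t V, const struct entity *se)
{
	return se->vruntime <= V;
}

/*
 * Pick the eligible entity with the earliest deadline from an array
 * sorted by ascending deadline. q[0] plays the role of the cached
 * leftmost node: if it is eligible, the pick is O(1).
 */
static const struct entity *pick(int64_t V, const struct entity *q, size_t n)
{
	if (n == 0)
		return NULL;

	if (eligible(V, &q[0]))		/* O(1) fastpath */
		return &q[0];

	/* Slow path: a linear scan stands in for the tree search. */
	for (size_t i = 1; i < n; i++)
		if (eligible(V, &q[i]))
			return &q[i];

	return NULL;
}
```

With V = 1000, v = 900 and a weight change from 1024 to 2048,
reweight_vruntime() returns 950: the virtual lag halves from 100 to 50
while w * (V - v) stays at 102400.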

All benchmarks were run inside a normal cpu cgroup in a clean
environment with CPU turbo disabled, on a Dual-CPU Intel Xeon(R)
Platinum 8260 with 2 NUMA nodes, each of which has 24C/48T.

p0: baseline, tip/master 1187c0b3a6c2
p1: p0 + patch(1)
p3: p0 + patch(1~3)

hackbench
=========
case            load        p0% (std%)     p1% (std%)       p3% (std%)
process-pipe    group-1     1.00 (  2.49)   -1.73 (  2.64)   -3.77 (  0.91)
process-pipe    group-2     1.00 (  5.23)   +5.51 (  2.32)   -3.41 (  4.28)
process-pipe    group-4     1.00 (  5.30)   +3.53 (  5.46)   +6.51 (  1.44)
process-pipe    group-8     1.00 (  1.36)   -1.85 (  2.22)   -3.57 (  1.06)
process-sockets group-1     1.00 (  2.29)   -2.39 (  2.66)   -2.39 (  1.86)
process-sockets group-2     1.00 (  3.46)   +0.46 (  1.85)   +1.19 (  2.08)
process-sockets group-4     1.00 (  1.43)   -1.98 (  2.78)   +4.52 (  8.68)
process-sockets group-8     1.00 (  0.95)   -1.60 (  0.94)   +2.78 (  2.14)
threads-pipe    group-1     1.00 (  1.92)   +5.33 (  1.54)   +3.47 (  1.09)
threads-pipe    group-2     1.00 (  0.64)   +0.51 (  2.31)   +2.91 (  0.43)
threads-pipe    group-4     1.00 (  3.03)   -2.91 (  2.31)   +1.83 (  1.65)
threads-pipe    group-8     1.00 (  2.55)   +1.89 (  3.04)   -1.29 (  2.32)
threads-sockets group-1     1.00 (  0.71)   +0.83 (  0.52)   -0.42 (  0.52)
threads-sockets group-2     1.00 (  2.48)   -2.52 (  1.20)   -3.27 (  0.59)
threads-sockets group-4     1.00 (  1.96)   +2.67 (  2.34)   +3.74 (  1.18)
threads-sockets group-8     1.00 (  1.09)   -2.30 (  0.51)   +3.07 (  0.62)

netperf
=======
case            load        p0% (std%)     p1% (std%)       p3% (std%)
TCP_RR          thread-24   1.00 (  2.48)   -2.15 (  2.38)   +0.17 (  1.95)
TCP_RR          thread-48   1.00 (  0.73)   -1.59 (  0.51)   +0.47 (  0.93)
TCP_RR          thread-72   1.00 (  1.04)   -1.26 (  1.03)   -0.09 (  1.13)
TCP_RR          thread-96   1.00 ( 29.36)  +70.41 ( 14.86)  +17.88 ( 37.24)
TCP_RR          thread-192  1.00 ( 28.29)   -1.30 ( 34.03)   -2.00 ( 30.63)
TCP_STREAM      thread-24   1.00 (  1.57)   +0.38 (  1.90)   +0.20 (  1.72)
TCP_STREAM      thread-48   1.00 (  0.08)   -0.29 (  0.07)   +0.15 (  0.12)
TCP_STREAM      thread-72   1.00 (  0.01)   -0.00 (  0.00)   +0.00 (  0.00)
TCP_STREAM      thread-96   1.00 (  0.76)   +0.16 (  0.65)   +0.30 (  0.47)
TCP_STREAM      thread-192  1.00 (  0.65)   +0.23 (  0.46)   +0.25 (  0.49)
UDP_RR          thread-24   1.00 (  1.74)   -1.26 (  2.41)   +0.81 (  3.02)
UDP_RR          thread-48   1.00 (  0.56)   -0.40 ( 16.72)   -0.98 (  0.36)
UDP_RR          thread-72   1.00 (  0.84)   -0.70 (  0.66)   -0.27 (  0.88)
UDP_RR          thread-96   1.00 (  1.24)   -0.44 (  1.01)   -0.99 (  8.99)
UDP_RR          thread-192  1.00 ( 28.02)   -0.42 ( 31.59)   -1.80 ( 26.23)
UDP_STREAM      thread-24   1.00 (100.05)   +0.31 (100.04)   +0.32 (100.06)
UDP_STREAM      thread-48   1.00 (104.35)   +1.22 (105.14)   +1.65 (104.10)
UDP_STREAM      thread-72   1.00 (100.69)   +1.28 (100.63)   -0.17 (100.49)
UDP_STREAM      thread-96   1.00 ( 99.63)   +0.33 ( 99.51)   -0.25 ( 99.53)
UDP_STREAM      thread-192  1.00 (100.57)   +2.00 (107.01)   -1.21 ( 99.51)

tbench
======
case            load        p0% (std%)     p1% (std%)       p3% (std%)
loopback        thread-24   1.00 (  0.49)   -1.47 (  0.94)   +0.08 (  0.75)
loopback        thread-48   1.00 (  0.42)   -0.04 (  0.53)   -0.06 (  0.34)
loopback        thread-72   1.00 (  7.10)   -3.33 (  2.98)   -5.06 (  0.31)
loopback        thread-96   1.00 (  0.80)   +2.65 (  0.80)   -0.68 (  1.30)
loopback        thread-192  1.00 (  1.24)   +1.21 (  0.73)   -1.78 (  0.22)

schbench
========
case            load        p0% (std%)     p1% (std%)       p3% (std%)
normal          mthread-1   1.00 (  5.83)   -2.83 (  1.46)   +1.24 (  2.51)
normal          mthread-2   1.00 (  4.45)   +8.94 (  7.81)  +14.24 (  7.44)
normal          mthread-4   1.00 (  2.73)   +2.53 (  4.31)  +12.44 (  5.99)
normal          mthread-8   1.00 (  0.15)   +0.21 (  0.13)   -0.34 (  0.11)

There seem to be no obvious complaints from these benchmarks.
Comments are appreciated! Thanks!

Abel Wu (4):
  sched/eevdf: Fix vruntime adjustment on reweight
  sched/eevdf: Sort the rbtree by virtual deadline
  sched/eevdf: O(1) fastpath for task selection
  sched/stats: Add statistics for pick_eevdf()

 include/linux/sched.h |   2 +-
 kernel/sched/debug.c  |  11 +-
 kernel/sched/fair.c   | 341 ++++++++++++++++++++++++++----------------
 kernel/sched/sched.h  |   6 +
 kernel/sched/stats.c  |   6 +-
 5 files changed, 233 insertions(+), 133 deletions(-)

--
2.37.3