[no subject]

From: rene
Date: Tue Aug 29 2023 - 06:42:01 EST


"Rafael J. Wysocki" <rafael@xxxxxxxxxx>
Subject: Scheduler not fully honoring CPU priorities for pref perf cores
From: Rene Rebe <rene@xxxxxxxxxxxxx>

Hey there,

Over the weekend I tested the AMD CPPC prefcore patch[1], and after
fixing it (sigh) and seeing some improvements, I was still surprised
that the kernel scheduler with SCHED_MC_PRIO=y would not more reliably
schedule to the highest perf cores.

For example, these are the sorted perf values and associated CPU cores
(perf value, CPU, SMT sibling) set by the AMD pstate code via
sched_set_itmt_core_prio() and thus returned by
arch_asym_cpu_priority():

236 0 16
236 2 18
231 4 20
226 5 21
221 1 17
216 7 23
211 6 22
206 3 19
201 15 31
196 13 29
191 11 27
186 14 30
181 12 28
176 10 26
171 8 24
166 9 25

I would expect the scheduler to fill the SMT siblings by priority,
e.g. any of: 0/16, 2/18, 4/20, ... However, while this "somewhat"
happens, it does not happen reliably. For example, currently,
with the 6.4 or 6.5 kernel, the Linux scheduler somehow quite
deterministically decides to first use core 15 for me, before
utilizing 0, 2 or 4, and often throws in core 13, too, before using
other higher boosting cores.
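
To make the expectation concrete, here is a minimal sketch that derives
the fill order from the table above. It assumes the three columns mean
(perf value, CPU, SMT sibling); the names PRIO_TABLE,
expected_fill_order and rank_of are made up for illustration and are
not kernel interfaces.

```python
# Rows copied from the table above.
# Assumption: columns are (perf value, CPU, SMT sibling).
PRIO_TABLE = [
    (236, 0, 16), (236, 2, 18), (231, 4, 20), (226, 5, 21),
    (221, 1, 17), (216, 7, 23), (211, 6, 22), (206, 3, 19),
    (201, 15, 31), (196, 13, 29), (191, 11, 27), (186, 14, 30),
    (181, 12, 28), (176, 10, 26), (171, 8, 24), (166, 9, 25),
]


def expected_fill_order(table):
    """Return (CPU, sibling) pairs, highest perf value first."""
    # sorted() is stable, so rows with equal perf values keep table order.
    ranked = sorted(table, key=lambda row: row[0], reverse=True)
    return [(cpu, sibling) for _prio, cpu, sibling in ranked]


def rank_of(cpu, table):
    """Return the 0-based position of the core that contains `cpu`."""
    for position, pair in enumerate(expected_fill_order(table)):
        if cpu in pair:
            return position
    raise ValueError(f"CPU {cpu} is not in the table")
```

With these values, core 15 is only ninth in the expected order and
core 13 tenth, which is why seeing them picked first is surprising.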

AFAICS there is quite some missed opportunity here, given that some
workloads have minute-long single- or few-threaded loads, e.g.
minute-long gcc, clang, rustc LTO linking (at times 4-6 minutes in the
case of Firefox, even on a Ryzen 7950X). In the current state I can
measure a ~0.7% performance improvement with the (fixed) AMD preferred
core patch, while, as a test, manually re-scheduling LTO linker jobs
from mediocre to highest performance cores using taskset from
user-space, I can reach an avg. improvement of 200MHz and nearly 2% of
total Firefox build time improvement. I would expect many such
workloads, possibly including Linux gaming, to have such room for
improvement. Unfortunately I don't have a compatible Intel Turbo Boost
Max Technology system to test whether their initial implementation
would have behaved any better.
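
The manual re-scheduling experiment can be approximated from a script
as well. The following is only a sketch of the idea, not the exact
procedure used for the numbers above: BEST_CPUS, usable_cpus and
pin_to_best are hypothetical names, and the CPU set is an assumption
taken from the two top-ranked cores in the table. It relies on the
Linux-only os.sched_getaffinity() and os.sched_setaffinity() calls from
the Python standard library.

```python
import os

# Assumption: CPUs of the two highest-ranked cores in the table above
# (0/16 and 2/18); adjust for the ranking reported on your own system.
BEST_CPUS = {0, 16, 2, 18}


def usable_cpus(wanted, allowed):
    """Return the wanted CPUs that the task is actually allowed to run on."""
    return set(wanted) & set(allowed)


def pin_to_best(pid, wanted=BEST_CPUS):
    """Restrict `pid` to the preferred CPUs, like `taskset -cp 0,2,16,18 PID`.

    Returns the CPU set applied, or an empty set if none of the wanted
    CPUs is available (the affinity is then left untouched).
    """
    target = usable_cpus(wanted, os.sched_getaffinity(pid))
    if target:
        os.sched_setaffinity(pid, target)
    return target
```

Calling pin_to_best() with the PID of a long-running linker process
would move it onto the preferred cores; whether that helps depends on
what else is competing for those cores at the time.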

I tried to debug this, however, as you can probably imagine, there are
quite a few different scheduling conditions to follow and understand
for someone not regularly working on the scheduler. So I thought I
better drop a quick note and ask for input. Any guidance on how to best
debug these scheduler decisions would be highly appreciated, as I have
not yet found a good way to debug this further, ... :-/

Thank you so much,

René

[1] https://lore.kernel.org/linux-acpi/20230829064340.1136448-1-li.meng@amd.com/

--
René Rebe, ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin
https://exactcode.com | https://t2sde.org | https://rene.rebe.de