Re: [PATCH v5 0/5] Rework system pressure interface to the scheduler

From: Lukasz Luba
Date: Wed Feb 21 2024 - 08:32:08 EST


Hi Vincent,

On 2/20/24 14:59, Vincent Guittot wrote:
Following the consolidation and cleanup of CPU capacity in [1], this series
reworks how the scheduler gets the pressures on CPUs. We need to take into
account all pressures applied by cpufreq on the compute capacity of a CPU
for dozens of ms or more, and not only the cpufreq cooling device or HW
mitigations. We split the pressure applied on a CPU's capacity into 2 parts:
- one from cpufreq and freq_qos
- one from HW high freq mitigation.
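
As a rough illustration of the split above, here is a minimal user-space
sketch, not the kernel code. All sketch_* names are hypothetical. It
assumes that the cpufreq pressure is the share of capacity lost to a
frequency cap, and that the two pressures are combined by taking the
larger one; the text above does not spell out either detail.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: capacities use a 0..1024 scale, like the kernel's SCHED_CAPACITY_SCALE. */
#define SKETCH_CAPACITY_SCALE 1024UL

/*
 * Hypothetical helper: capacity lost when the frequency is capped at
 * capped_khz while the CPU could otherwise reach max_khz.
 */
static unsigned long sketch_cpufreq_pressure(unsigned long max_capacity,
                                             unsigned long capped_khz,
                                             unsigned long max_khz)
{
    /* No cap (or a cap at or above the maximum) means no pressure. */
    if (max_khz == 0 || capped_khz >= max_khz)
        return 0;

    /* Use a 64-bit intermediate so capacity * frequency cannot overflow. */
    return max_capacity -
           (unsigned long)(((uint64_t)max_capacity * capped_khz) / max_khz);
}

/*
 * Hypothetical helper: capacity left once both pressures are considered.
 * Assumption: only the larger of the two pressures is subtracted.
 */
static unsigned long sketch_actual_capacity(unsigned long max_capacity,
                                            unsigned long cpufreq_pressure,
                                            unsigned long hw_pressure)
{
    unsigned long pressure = cpufreq_pressure > hw_pressure ?
                             cpufreq_pressure : hw_pressure;

    return pressure >= max_capacity ? 0 : max_capacity - pressure;
}

/* Usage example: a 2 GHz CPU capped at 1 GHz, with a smaller HW pressure. */
static void sketch_selfcheck(void)
{
    unsigned long p = sketch_cpufreq_pressure(SKETCH_CAPACITY_SCALE,
                                              1000000, 2000000);

    assert(p == 512);
    assert(sketch_actual_capacity(SKETCH_CAPACITY_SCALE, p, 100) == 512);
}
```

Taking the larger value rather than the sum reflects the assumption that
both sources limit the same maximum frequency, so the tighter limit is
the one that matters.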

The next step will be to add a dedicated interface for long-standing
capping of the CPU capacity (i.e. for seconds or more) like the
scaling_max_freq of cpufreq sysfs. The latter is already taken into
account by this series, but as a temporary pressure, which is not always the
best choice when we know that it will last for seconds or more.

[1] https://lore.kernel.org/lkml/20231211104855.558096-1-vincent.guittot@xxxxxxxxxx/

Change since v4:
- Add READ_ONCE() in cpufreq_get_pressure()
- Add ack and reviewed tags

Change since v3:
- Fix uninitialized variables in cpufreq_update_pressure()

Change since v2:
- Rework cpufreq_update_pressure()

Change since v1:
- Use struct cpufreq_policy as parameter of cpufreq_update_pressure()
- Fix typos and comments
- Mark the sched_thermal_decay_shift boot param as deprecated

Vincent Guittot (5):
cpufreq: Add a cpufreq pressure feedback for the scheduler
sched: Take cpufreq feedback into account
thermal/cpufreq: Remove arch_update_thermal_pressure()
sched: Rename arch_update_thermal_pressure into arch_update_hw_pressure
sched/pelt: Remove shift of thermal clock

.../admin-guide/kernel-parameters.txt | 1 +
arch/arm/include/asm/topology.h | 6 +-
arch/arm64/include/asm/topology.h | 6 +-
drivers/base/arch_topology.c | 26 ++++----
drivers/cpufreq/cpufreq.c | 36 +++++++++++
drivers/cpufreq/qcom-cpufreq-hw.c | 4 +-
drivers/thermal/cpufreq_cooling.c | 3 -
include/linux/arch_topology.h | 8 +--
include/linux/cpufreq.h | 10 +++
include/linux/sched/topology.h | 8 +--
.../{thermal_pressure.h => hw_pressure.h} | 14 ++---
include/trace/events/sched.h | 2 +-
init/Kconfig | 12 ++--
kernel/sched/core.c | 8 +--
kernel/sched/fair.c | 63 +++++++++----------
kernel/sched/pelt.c | 18 +++---
kernel/sched/pelt.h | 16 ++---
kernel/sched/sched.h | 22 +------
18 files changed, 144 insertions(+), 119 deletions(-)
rename include/trace/events/{thermal_pressure.h => hw_pressure.h} (55%)



The code looks good and works as expected. The time delays in the
old mechanisms, which were important to me, are good now. Boost is
handled, as is cpufreq capping from sysfs: all good. The last patch,
which removes the shift and makes it obsolete, looks good too. Thanks!

Feel free to add to all patches:

Reviewed-by: Lukasz Luba <lukasz.luba@xxxxxxx>
Tested-by: Lukasz Luba <lukasz.luba@xxxxxxx>

Regards,
Lukasz