Re: [PATCH 1/1] sched: Consider CPU contention in frequency & load-balance busiest CPU selection

From: Dietmar Eggemann
Date: Thu May 04 2023 - 13:12:15 EST


On 03/05/2023 18:08, Vincent Guittot wrote:
> On Thu, 6 Apr 2023 at 17:50, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>
>> Use new cpu_boosted_util_cfs() instead of cpu_util_cfs().
>>
>> The former returns max(util_avg, runnable_avg) capped by max CPU
>> capacity. CPU contention is thereby considered through runnable_avg.
>>
>> The change in load-balance only affects migration type `migrate_util`.
>
> would be good to get some figures to show the benefit

Yes. I will add JankbenchX numbers on Pixel6 for sugov_get_util() and
`perf bench sched messaging` numbers on Ampere Altra in the next version.

>> Suggested-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
>> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
>> ---
>> kernel/sched/cpufreq_schedutil.c | 3 ++-
>> kernel/sched/fair.c | 2 +-
>> kernel/sched/sched.h | 19 +++++++++++++++++++
>> 3 files changed, 22 insertions(+), 2 deletions(-)
>>
>> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
>> index e3211455b203..728b186cd367 100644
>> --- a/kernel/sched/cpufreq_schedutil.c
>> +++ b/kernel/sched/cpufreq_schedutil.c
>> @@ -158,7 +158,8 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
>> struct rq *rq = cpu_rq(sg_cpu->cpu);
>>
>> sg_cpu->bw_dl = cpu_bw_dl(rq);
>> - sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu),
>> + sg_cpu->util = effective_cpu_util(sg_cpu->cpu,
>> + cpu_boosted_util_cfs(sg_cpu->cpu),
>
> Shouldn't we have a similar change in feec to estimate correctly which
> OPP/ freq will be selected by schedutil ?

Yes, this should be more correct. Schedutil and EAS should see the world
the same way.

But IMHO only for the

  find_energy_efficient_cpu()
    compute_energy()
      eenv_pd_max_util()
        util = cpu_util_next(..., p, ...)
        effective_cpu_util(..., util, FREQUENCY_UTIL, ...)
                                      ^^^^^^^^^^^^^^
case.

Not sure what to do about the task contribution, though. We use
task_util(p)/_task_util_est(p) inside cpu_util_next().
Do I have to consider p->se.avg.runnable_avg as well?

I don't think we have a test case that shows a difference for this
change on its own, though.
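
For reference, the max(util_avg, runnable_avg) clamping from the patch
description can be sketched in plain user-space C as below. This is only an
illustration, not the sched.h hunk itself: boosted_util_sketch() and its
parameters are made-up names, and details such as util_est are left out.

```c
/*
 * Illustrative sketch only. The function name and the flat parameter
 * list are assumptions; the kernel helper reads these values from the
 * CPU's root cfs_rq instead.
 */
static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * util_avg:     PELT signal for time spent running
 * runnable_avg: PELT signal for time spent running or waiting on the rq,
 *               so it exceeds util_avg when the CPU is contended
 * max_capacity: maximum capacity of the CPU, e.g. 1024
 */
static unsigned long boosted_util_sketch(unsigned long util_avg,
					 unsigned long runnable_avg,
					 unsigned long max_capacity)
{
	/* Take the larger signal, then cap it by the CPU capacity. */
	return min_ul(max_ul(util_avg, runnable_avg), max_capacity);
}
```

With util_avg = 300 and runnable_avg = 700 on a CPU of capacity 1024 this
yields 700, i.e. contention raises the utilization that schedutil would see.
Without contention (runnable_avg <= util_avg) the result is util_avg, as before.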

[...]