Re: [PATCH 1/1] sched: Consider CPU contention in frequency & load-balance busiest CPU selection

From: Vincent Guittot
Date: Fri May 05 2023 - 04:24:00 EST


On Thu, 4 May 2023 at 19:11, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>
> On 03/05/2023 18:08, Vincent Guittot wrote:
> > On Thu, 6 Apr 2023 at 17:50, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
> >>
> >> Use new cpu_boosted_util_cfs() instead of cpu_util_cfs().
> >>
> >> The former returns max(util_avg, runnable_avg) capped by max CPU
> >> capacity. CPU contention is thereby considered through runnable_avg.
> >>
> >> The change in load-balance only affects migration type `migrate_util`.
> >
> > It would be good to get some figures showing the benefit.
>
> Yes. Will add JankbenchX on Pixel6 for sugov_get_util() and `perf bench
> sched messaging` on Ampere Altra with the next version.
>
> >> Suggested-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> >> Signed-off-by: Dietmar Eggemann <dietmar.eggemann@xxxxxxx>
> >> ---
> >>  kernel/sched/cpufreq_schedutil.c |  3 ++-
> >>  kernel/sched/fair.c              |  2 +-
> >>  kernel/sched/sched.h             | 19 +++++++++++++++++++
> >>  3 files changed, 22 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> >> index e3211455b203..728b186cd367 100644
> >> --- a/kernel/sched/cpufreq_schedutil.c
> >> +++ b/kernel/sched/cpufreq_schedutil.c
> >> @@ -158,7 +158,8 @@ static void sugov_get_util(struct sugov_cpu *sg_cpu)
> >>         struct rq *rq = cpu_rq(sg_cpu->cpu);
> >>
> >>         sg_cpu->bw_dl = cpu_bw_dl(rq);
> >> -       sg_cpu->util = effective_cpu_util(sg_cpu->cpu, cpu_util_cfs(sg_cpu->cpu),
> >> +       sg_cpu->util = effective_cpu_util(sg_cpu->cpu,
> >> +                                         cpu_boosted_util_cfs(sg_cpu->cpu),
> >
> > Shouldn't we have a similar change in feec (find_energy_efficient_cpu())
> > to correctly estimate which OPP/freq will be selected by schedutil?
>
> Yes, this should be more correct. Schedutil and EAS should see the world
> the same way.
>
> But IMHO only for the
>
>   find_energy_efficient_cpu()
>     compute_energy()
>       eenv_pd_max_util()
>         util = cpu_util_next(..., p, ...)
>         effective_cpu_util(..., util, FREQUENCY_UTIL, ...)
>                                       ^^^^^^^^^^^^^^
Yes, only to get the same max utilization and, as a result, the same OPP as schedutil.

> case.
>
> Not sure what to do about the task contribution, though. We use
> task_util(p)/_task_util_est(p) inside cpu_util_next().
> Do I have to consider p->se.avg.runnable_avg as well?

Hmm, I would stay with util_avg for the task contribution for now.

>
> I don't think that we have a testcase showing any diff for this change
> individually though.
>
> [...]