Re: [PATCH 5/5] sched/fair: Merge select_idle_core/cpu()

From: Vincent Guittot
Date: Thu Jan 14 2021 - 10:45:44 EST


On Thu, 14 Jan 2021 at 14:53, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Thu, Jan 14, 2021 at 02:25:32PM +0100, Vincent Guittot wrote:
> > On Thu, 14 Jan 2021 at 10:35, Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
> > >
> > > On Wed, Jan 13, 2021 at 06:03:00PM +0100, Vincent Guittot wrote:
> > > > > @@ -6159,16 +6171,29 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
> > > > >         for_each_cpu_wrap(cpu, cpus, target) {
> > > > >                 if (!--nr)
> > > > >                         return -1;
> > > > > -               if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
> > > > > -                       break;
> > > > > +               if (smt) {
> > > >
> > > > If we want to stay close to the previous behavior, we want to
> > > > check all cores when test_idle_cores is true, so nr should be
> > > > set to the number of cores.
> > > >
> > >
> > > I don't think we necessarily want to do that. has_idle_cores is an
> > > effective throttling mechanism but it's not perfect. If the full domain
> > > is always scanned for a core then there can be excessive scanning in
> >
> > But that's what the code is currently doing. Can this change be done
> > in another patch so we can check the impact of each change more
> > easily?
>
> Ok, when looking at this again instead of just the mail, the flow is:
>
>         int i, cpu, idle_cpu = -1, nr = INT_MAX;
>         ...
>         if (sched_feat(SIS_PROP) && !smt) {
>                 /* recalculate nr */
>         }
>
> The !smt check should mean that core scanning is still scanning the entire

Yes, good point. I missed this change.

> domain. There is no need to make it specific to the core account and we
> are already doing the full scan. Throttling that would be a separate patch.
>
> > This patch 5 should focus on merging select_idle_core and
> > select_idle_cpu so we keep (almost) the same behavior but each CPU is
> > checked only once.
> >
>
> Which I think it's already doing. Main glitch really is that
> __select_idle_cpu() shouldn't be taking *idle_cpu as it does not consume
> the information.

I don't really like the if (smt)/else in the for_each_cpu_wrap(cpu,
cpus, target) loop, because it looks like we failed to really merge
the idle core and idle cpu search loops in the end.

But there is probably not much we can do without changing how the
idle core search is accounted in avg_scan_cost.
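
For illustration only, here is a rough userspace sketch of the loop shape
being discussed. It is not the patch itself: NR_CPUS, SMT_WIDTH, cpu_idle[],
the visited[] array (standing in for clearing siblings from the cpus mask),
the limit parameter (standing in for the SIS_PROP recalculation of nr) and
both sketch_* functions are assumptions made up for the example. It shows nr
staying at INT_MAX in the smt case, each CPU being checked only once, and
the if (smt)/else split inside the single loop.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_CPUS   8
#define SMT_WIDTH 2	/* assumption: SMT2, siblings are numbered adjacently */

/* Hypothetical stand-in for available_idle_cpu() || sched_idle_cpu(). */
static bool cpu_idle[NR_CPUS];

/*
 * Check every sibling of the core containing @cpu exactly once.
 * Returns @cpu if the whole core is idle, -1 otherwise, and remembers
 * the first idle sibling seen in *idle_cpu as a fallback.
 */
static int sketch_select_idle_core(int cpu, bool *visited, int *idle_cpu)
{
	int first = cpu - (cpu % SMT_WIDTH);
	bool core_idle = true;

	for (int s = first; s < first + SMT_WIDTH; s++) {
		visited[s] = true;	/* never look at this sibling again */
		if (!cpu_idle[s])
			core_idle = false;
		else if (*idle_cpu < 0)
			*idle_cpu = s;
	}
	return core_idle ? cpu : -1;
}

/* Single merged scan: idle core search if @smt, idle cpu search otherwise. */
static int sketch_select_idle_cpu(int target, bool smt, int limit)
{
	bool visited[NR_CPUS] = { false };
	int idle_cpu = -1, nr = INT_MAX;

	if (!smt)
		nr = limit;	/* stands in for the SIS_PROP recalculation */

	for (int i = 0; i < NR_CPUS; i++) {
		int cpu = (target + i) % NR_CPUS;	/* wrap around from target */

		if (visited[cpu])
			continue;
		if (!--nr)
			return -1;
		if (smt) {
			int core = sketch_select_idle_core(cpu, visited, &idle_cpu);

			if (core >= 0)
				return core;
		} else if (cpu_idle[cpu]) {
			return cpu;
		}
	}
	return idle_cpu;
}
```

In this sketch the smt path always walks the whole domain and falls back to
an idle sibling if no fully idle core exists, while the !smt path gives up
once the scan budget is exhausted.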


>
> > > workloads like hackbench which tends to have has_idle_cores return false
> > > positives. It becomes important once average busy CPUs is over half of
> > > the domain for SMT2.
> > >
> > > At least with the patch if that change was made, we still would not scan
> > > twice going over the same runqueues so it would still be an improvement
> >
> > Yeah, for me that is the main goal of this patchset, along with the
> > calculation of avg_scan_cost being done only when SIS_PROP is true
> > and the removal of SIS_AVG.
> >
> > Any change in the number of cpus/cores to loop over is prone to
> > regressions and should be done in a separate patch IMHO.
> >
>
> Understood.
>
> --
> Mel Gorman
> SUSE Labs