Re: [PATCH 2/2 v2] sched: use load_avg for selecting idlest group

From: Morten Rasmussen
Date: Wed Nov 30 2016 - 09:24:40 EST


On Wed, Nov 30, 2016 at 02:54:00PM +0100, Vincent Guittot wrote:
> On 30 November 2016 at 14:49, Vincent Guittot
> <vincent.guittot@xxxxxxxxxx> wrote:
> > On 30 November 2016 at 13:49, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> >> On Fri, Nov 25, 2016 at 04:34:33PM +0100, Vincent Guittot wrote:
> >>> find_idlest_group() only compares the runnable_load_avg when looking for
> >>> the least loaded group. But on fork intensive use case like hackbench
> >
> > [snip]
> >
> >>> + min_avg_load = avg_load;
> >>> + idlest = group;
> >>> + } else if ((runnable_load < (min_runnable_load + imbalance)) &&
> >>> + (100*min_avg_load > imbalance_scale*avg_load)) {
> >>> + /*
> >>> + * The runnable loads are close so we take
> >>> + * into account blocked load through avg_load
> >>> + * which is blocked + runnable load
> >>> + */
> >>> + min_avg_load = avg_load;
> >>> idlest = group;
> >>> }
> >>>
> >>> @@ -5470,13 +5495,16 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
> >>> goto no_spare;
> >>>
> >>> if (this_spare > task_util(p) / 2 &&
> >>> - imbalance*this_spare > 100*most_spare)
> >>> + imbalance_scale*this_spare > 100*most_spare)
> >>> return NULL;
> >>> else if (most_spare > task_util(p) / 2)
> >>> return most_spare_sg;
> >>>
> >>> no_spare:
> >>> - if (!idlest || 100*this_load < imbalance*min_load)
> >>> + if (!idlest ||
> >>> + (min_runnable_load > (this_runnable_load + imbalance)) ||
> >>> + ((this_runnable_load < (min_runnable_load + imbalance)) &&
> >>> + (100*min_avg_load > imbalance_scale*this_avg_load)))
> >>
> >> I don't get why you have imbalance_scale applied to this_avg_load and
> >> not min_avg_load. IIUC, you end up preferring non-local groups?
> >
> > In fact, I have keep the same condition that is used when looping the group.
> > You're right that we should prefer local rq if avg_load are close and
> > test the condition
> > (100*this_avg_load > imbalance_scale*min_avg_load) instead
>
> Of course the correct condition is
> (100*this_avg_load < imbalance_scale*min_avg_load)

Agreed, I should have read the entire thread before replying :-)