Re: [PATCH v3 09/10] sched/fair: use load instead of runnable load in wakeup path

From: Vincent Guittot
Date: Mon Oct 07 2019 - 11:27:25 EST


On Mon, 7 Oct 2019 at 17:14, Rik van Riel <riel@xxxxxxxxxxx> wrote:
>
> On Thu, 2019-09-19 at 09:33 +0200, Vincent Guittot wrote:
> > runnable load has been introduced to take into account the case where
> > blocked load biases the wake up path, which may end up selecting an
> > overloaded
> > CPU with a large number of runnable tasks instead of an underutilized
> > CPU with a huge blocked load.
> >
> > The wake up path now starts to look for idle CPUs before comparing
> > runnable load and it's worth aligning the wake up path with the
> > load_balance.
> >
> > Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
>
> On a single socket system, patches 9 & 10 have the
> result of driving a woken up task (when wake_wide is
> true) to the CPU core with the lowest blocked load,
> even when there is an idle core the task could run on
> right now.
>
> With the whole series applied, I see a 1-2% regression
> in CPU use due to that issue.
>
> With only patches 1-8 applied, I see a 1% improvement in
> CPU use for that same workload.

Thanks for testing.
Patches 8-9 have just replaced runnable load with blocked load and then
removed the duplicated metrics in find_idlest_group().
I'm preparing an additional patch that reworks find_idlest_group() to
behave similarly to find_busiest_group(). It gathers statistics, which
it already does, then classifies the groups and finally selects the
idlest one. This should fix the problem that you mentioned above, where
it selects the group with the lowest blocked load even though there are
idle CPUs in another group with a high blocked load.
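
As a rough illustration of the gather/classify/select flow described
above, here is a standalone sketch. It is not the actual patch and not
kernel code: struct group_stats, enum group_type, classify_group() and
pick_idlest_group() are hypothetical names, and the classification
rules and tie-breaks are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical classification, ordered from best to worst wakeup target. */
enum group_type {
	GROUP_HAS_IDLE_CPU = 0,	/* at least one CPU is idle right now */
	GROUP_FULLY_BUSY,	/* no idle CPU, but no more tasks than CPUs */
	GROUP_OVERLOADED,	/* more runnable tasks than CPUs */
};

/* Hypothetical per-group statistics gathered in the first step. */
struct group_stats {
	unsigned int nr_cpus;
	unsigned int idle_cpus;
	unsigned int nr_running;
	unsigned long load;	/* assumed to include blocked load */
};

/* Step 2: classify a group from its statistics. */
static enum group_type classify_group(const struct group_stats *gs)
{
	if (gs->idle_cpus > 0)
		return GROUP_HAS_IDLE_CPU;
	if (gs->nr_running <= gs->nr_cpus)
		return GROUP_FULLY_BUSY;
	return GROUP_OVERLOADED;
}

/*
 * Step 3: select the idlest group. The classification is compared first,
 * so load is only used to break ties between groups of the same type.
 * Returns the index of the chosen group, or -1 if there are no groups.
 */
static int pick_idlest_group(const struct group_stats *groups, size_t n)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		enum group_type type, best_type;

		if (best < 0) {
			best = (int)i;
			continue;
		}

		type = classify_group(&groups[i]);
		best_type = classify_group(&groups[best]);

		if (type < best_type) {
			best = (int)i;
		} else if (type == best_type) {
			if (type == GROUP_HAS_IDLE_CPU) {
				/* Among groups with idle CPUs, prefer more idle CPUs. */
				if (groups[i].idle_cpus > groups[best].idle_cpus)
					best = (int)i;
			} else if (groups[i].load < groups[best].load) {
				/* Otherwise prefer the lower load. */
				best = (int)i;
			}
		}
	}

	return best;
}
```

With a group that has no idle CPU and a low load next to a group that
has idle CPUs and a high blocked load, this sketch picks the second
group, because the group type is compared before the load.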

>
> Given that it looks like select_idle_sibling and
> find_idlest_group_cpu do roughly the same thing, I
> wonder if it is enough to simply add an additional
> test to find_idlest_group to have it return the
> LLC sg, if it is called on the LLC sd on a single
> socket system.

That makes sense to me.

>
> That way find_idlest_group_cpu can still find an
> idle core like it does today.
>
> Does that seem like a reasonable thing?

That's worth testing

>
> I can run tests with that :)
>
> --
> All Rights Reversed.