Re: [PATCH] sched/fair: check for idle core

From: Julia Lawall
Date: Wed Oct 21 2020 - 11:33:50 EST




On Wed, 21 Oct 2020, Vincent Guittot wrote:

> On Wed, 21 Oct 2020 at 17:18, Julia Lawall <julia.lawall@xxxxxxxx> wrote:
> >
> >
> >
> > On Wed, 21 Oct 2020, Mel Gorman wrote:
> >
> > > On Wed, Oct 21, 2020 at 03:24:48PM +0200, Julia Lawall wrote:
> > > > > I worry it's overkill because prev is always used if it is idle even
> > > > > if it is on a node remote to the waker. It cuts off the option of a
> > > > > wakee moving to a CPU local to the waker which is not equivalent to the
> > > > > original behaviour.
> > > >
> > > > But it is equal to the original behavior in the idle prev case if you go
> > > > back to the runnable load average days...
> > > >
> > >
> > > It is similar but it misses the sync treatment and sd->imbalance_pct part of
> > > wake_affine_weight which has unpredictable consequences. The data
> > > available is only on the fully utilised case.
> >
> > OK, what if my patch were:
> >
> > @@ -5800,6 +5800,9 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
> >  	if (sync && cpu_rq(this_cpu)->nr_running == 1)
> >  		return this_cpu;
> >
> > +	if (!sync && available_idle_cpu(prev_cpu))
> > +		return prev_cpu;
> > +
>
> this is not useful because when prev_cpu is idle, its runnable_avg was
> null so the only way for this_cpu to be selected by wake_affine_weight
> is to be null too which is not really possible when sync is set because
> sync is used to say, the running task on this cpu is about to sleep

OK, I agree. Previously prev_eff_load was 0 when prev was idle, and
whether the sync code is executed in wake_affine_weight or not, it will
not be the case that this_eff_load < prev_eff_load, so this_cpu will not
be selected.
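
A rough sketch of that comparison, as a toy model only. This is not the
kernel's wake_affine_weight(): the sync adjustments and the capacity
scaling are left out, loads are plain integers, and the names
toy_wake_affine_weight and NO_CPU are made up for illustration (NO_CPU
stands in for nr_cpumask_bits).

```c
#include <stdint.h>

/* Hypothetical stand-in for nr_cpumask_bits ("no preference"). */
#define NO_CPU (-1)

/*
 * Toy model of the final comparison discussed in this thread.
 * Assumes loads are non-negative and ignores sync and CPU capacity.
 */
static int toy_wake_affine_weight(int64_t this_load, int64_t prev_load,
				  int imbalance_pct, int this_cpu)
{
	int64_t this_eff_load = this_load * 100;
	/* The imbalance_pct bias multiplies the prev load; 0 stays 0. */
	int64_t prev_eff_load = prev_load * (100 + (imbalance_pct - 100) / 2);

	/* Strict comparison: equal loads do not select this_cpu. */
	return this_eff_load < prev_eff_load ? this_cpu : NO_CPU;
}
```

With prev_load == 0, prev_eff_load is 0 whatever imbalance_pct is, so no
non-negative this_load can satisfy the strict comparison and the model
always returns NO_CPU.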

julia


>
> >  	return nr_cpumask_bits;
> >  }
> >
> > The sd->imbalance_pct part would have previously been a multiplication by
> > 0, so it doesn't need to be taken into account.
> >
> > julia
> >
> > >
> > > > The problem seems impossible to solve, because there is no way to know by
> > > > looking only at prev and this whether the thread would prefer to stay
> > > > where it was or go to the waker.
> > > >
> > >
> > > Yes, this is definitely true. Looking at prev_cpu and this_cpu is a
> > > crude approximation and the path is heavily limited in terms of how
> > > clever it can be.
> > >
> > > --
> > > Mel Gorman
> > > SUSE Labs
> > >
>