Re: [PATCH V2] sched: fair: Use the earliest break even

From: Morten Rasmussen
Date: Wed Mar 18 2020 - 04:33:31 EST


On Wed, Mar 18, 2020 at 08:51:04AM +0100, Vincent Guittot wrote:
> On Tue, 17 Mar 2020 at 15:30, Morten Rasmussen <morten.rasmussen@xxxxxxx> wrote:
> > On Tue, Mar 17, 2020 at 02:48:51PM +0100, Daniel Lezcano wrote:
> > > On 17/03/2020 08:56, Morten Rasmussen wrote:
> > > > On Thu, Mar 12, 2020 at 11:04:19AM +0100, Daniel Lezcano wrote:
> > > >> On 12/03/2020 09:36, Vincent Guittot wrote:
> > > >>> On Wed, 11 Mar 2020 at 21:28, Daniel Lezcano <daniel.lezcano@xxxxxxxxxx> wrote:
> > > >>>>
> > > >>>> In the idle CPU selection process occuring in the slow path via the
> > > >>>> find_idlest_group_cpu() function, we pick up in priority an idle CPU
> > > >>>> with the shallowest idle state otherwise we fall back to the least
> > > >>>> loaded CPU.
> > > >>>
> > > >>> The idea makes sense but this path is only used by fork and exec so
> > > >>> I'm not sure about the real impact
> > > >>
> > > >> I agree the fork / exec path is called much less often than the wake
> > > >> path but it makes more sense for the decision.
> > > >
> > > > Looking at the flow in find_idlest_cpu(), AFAICT,
> > > > find_idlest_group_cpu() is not actually making the final choice of CPU,
> > > > so going through a lot of trouble there looking at idle states is
> > > > pointless. Is there something I don't see?
> > > >
> > > > We fellow sd->child until groups == CPUs which which means that
> > > > find_idlest_group() actually makes the final choice as the final group
> > > > passed to find_idlest_group_cpu() is single-CPU group. The flow has been
> > > > like that for years. Even before you added the initial idle-state
> > > > awareness.
> > > >
> > > > I agree with Vincent, if this should really make a difference it should
> > > > include wake-ups existing tasks too. Although I'm aware it would be a
> > > > more invasive change. As said from the beginning, the idea is fine, but
> > > > the current implementation should not make any measurable difference?
> > >
> > > I'm seeing the wake-ups path so sensitive, I'm not comfortable to do any
> > > changes in it. That is the reason why the patch only changes the slow path.
> >
> > Right. I'm not against being cautious at all. It would be interesting to
> > evaluate how bad it really is. The extra time-stamping business cost is
> > the same, so it really down how much we dare to use the information in
> > the fast-path and change the CPU selection policy. And of course, how
> > much can be gained by the change.
>
> Hmm... the fast path not even parses all idle CPUs right now so it
> will be difficult to make any comparison. Regarding wakeup path, the
> only place that could make sense IMO is the EAS slow wakeup path
> find_energy_efficient_cpu() which mitigates performance vs energy
> consumption

Agreed, it is not really compatible with the current fast-path policies.
I would start with just hacking it in to evaluate the potential
improvements to have an idea about what is on the table.

As you say, it could live somewhere slightly out the way like the EAS
path, but I'm not sure if actually fits well under EAS. EAS is disabled
for symmetric CPU capacities, which is probably the systems that will
benefit the most. Also, as mentioned already in this thread, I don't see
how the idea brings any energy savings?

Morten