Re: [PATCH v5 2/2] sched/fair: Check a task has a fitting cpu when updating misfit

From: Qais Yousef
Date: Tue Feb 20 2024 - 10:59:32 EST


On 02/12/24 18:27, Vincent Guittot wrote:

> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index b803030c3a03..8b8035f5c8f6 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5092,24 +5092,36 @@ static inline int task_fits_cpu(struct task_struct *p, int cpu)
> >
> > static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
> > {
> > + unsigned long cpu_cap;
> > + int cpu = cpu_of(rq);
> > +
> > if (!sched_asym_cpucap_active())
> > return;
> >
> > - if (!p || p->nr_cpus_allowed == 1) {
> > - rq->misfit_task_load = 0;
> > - return;
> > - }
> > + if (!p || p->nr_cpus_allowed == 1)
> > + goto out;
> >
> > - if (task_fits_cpu(p, cpu_of(rq))) {
> > - rq->misfit_task_load = 0;
> > - return;
> > - }
> > + cpu_cap = arch_scale_cpu_capacity(cpu);
> > +
> > + /* If we can't fit the biggest CPU, that's the best we can ever get. */
> > + if (cpu_cap == rq->rd->max_cpu_capacity)
>
> Isn't the condition above also covered by the condition below and
> becomes now useless ?

Yes, you're right. If the task is allowed to run on a CPU with
rd->max_cpu_capacity, then the check below covers it. If it is not allowed,
then it won't be running there in the first place.

I'll drop it.
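
To spell out the reasoning, here is a rough userspace sketch, not kernel code.
It assumes the later check compares the CPU's capacity against the highest
capacity the task's affinity allows; that check is not quoted above, so treat
it as an assumption. The helper names, the bitmask affinity and the capacity
values are all made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative capacities for a 4-CPU asymmetric system (made-up values). */
static const unsigned long example_caps[] = { 256, 256, 512, 1024 };

/* Largest capacity of any CPU: stands in for rd->max_cpu_capacity. */
static unsigned long rd_max_capacity(const unsigned long *caps, size_t nr)
{
	unsigned long max = 0;

	for (size_t i = 0; i < nr; i++)
		if (caps[i] > max)
			max = caps[i];
	return max;
}

/* Largest capacity among the CPUs set in @mask (hypothetical helper). */
static unsigned long task_max_allowed_capacity(const unsigned long *caps,
					       size_t nr, unsigned long mask)
{
	unsigned long max = 0;

	for (size_t i = 0; i < nr; i++)
		if ((mask & (1UL << i)) && caps[i] > max)
			max = caps[i];
	return max;
}

/*
 * For every CPU the task may run on: if the rd-wide check would trigger
 * (cap == rd max), the per-task check (cap == task max) triggers as well.
 * Returns false only if a counterexample exists.
 */
static bool rd_check_is_subsumed(const unsigned long *caps, size_t nr,
				 unsigned long mask)
{
	unsigned long rd_max = rd_max_capacity(caps, nr);
	unsigned long task_max = task_max_allowed_capacity(caps, nr, mask);

	for (size_t i = 0; i < nr; i++) {
		if (!(mask & (1UL << i)))
			continue;	/* the task can never be on this CPU */
		if (caps[i] == rd_max && caps[i] != task_max)
			return false;
	}
	return true;
}
```

For any affinity mask, rd_check_is_subsumed() returns true: a task that can
sit on the biggest CPU has that CPU's capacity as its own maximum, and a task
that cannot sit there never reaches the rd-wide comparison.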

> > -/*
> > - * Check whether a rq has a misfit task and if it looks like we can actually
> > - * help that task: we can migrate the task to a CPU of higher capacity, or
> > - * the task's current CPU is heavily pressured.
> > - */
> > -static inline int check_misfit_status(struct rq *rq, struct sched_domain *sd)
> > +/* Check if the rq has a misfit task */
> > +static inline bool check_misfit_status(struct rq *rq, struct sched_domain *sd)
> > {
> > - return rq->misfit_task_load &&
> > - (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity ||
> > - check_cpu_capacity(rq, sd));
> > + if (!rq->misfit_task_load)
> > + return false;
>
> I think that only the above is enough ...
>
> > +
> > + /* Can we migrate to a CPU with higher capacity? */
> > + if (arch_scale_cpu_capacity(rq->cpu) < rq->rd->max_cpu_capacity)
>
> because rq->misfit_task_load is set to 0 if
> arch_scale_cpu_capacity(rq->cpu) == rq->rd->max_cpu_capacity
>
> That would also mean that we don't need to keep and set
> rd->max_cpu_capacity anymore as we remove the 2 uses of it

+1

I'll drop max_cpu_capacity in a new patch on top.

>
> > + return true;
> > +
> > + /* Is the task's CPU being heavily pressured? */
> > + return check_cpu_capacity(rq, sd);
>
> and this one has already been tested in nohz_balancer_kick() before
> calling check_misfit_status()

Yes, removed.
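
With both extra checks gone, the helper reduces to a test of misfit_task_load.
A userspace model of that end state might look roughly like this; struct
rq_model and the function name are stand-ins for illustration, not the actual
kernel definitions, and the final patch may differ.

```c
#include <stdbool.h>

/* Stand-in for the one struct rq field that matters here. */
struct rq_model {
	unsigned long misfit_task_load;
};

/*
 * The capacity comparison is implied by update_misfit_status() clearing
 * misfit_task_load, and the pressure check is already done by the caller,
 * so a non-zero load is the only condition left.
 */
static inline bool check_misfit_status_model(const struct rq_model *rq)
{
	return rq->misfit_task_load != 0;
}
```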

I realized that I also want to add a new patch that stops doubling
balance_interval on misfit failures. I think you indicated that this seems
like the right thing to do?


Thanks

--
Qais Yousef