Re: [PATCH] sched/fair: Remove the cost of a redundant cpumask_next_wrap in select_idle_cpu

From: Barry Song
Date: Tue Nov 23 2021 - 19:07:54 EST


On Wed, Nov 24, 2021 at 10:07 AM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Tue, Nov 23, 2021 at 07:22:29PM +0800, Barry Song wrote:
> > From: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
> >
> > This patch keeps the same scanning amount, but drops a redundant loop
> > of cpumask_next_wrap.
> > The original code did for_each_cpu_wrap(cpu, cpus, target + 1), then
> > checked --nr; this patch does --nr before doing the next loop, thus,
> > it can remove a cpumask_next_wrap() which costs a little bit.
> >
> > Signed-off-by: Barry Song <song.bao.hua@xxxxxxxxxxxxx>
> > ---
> > kernel/sched/fair.c | 8 ++++----
> > 1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index ff69f24..e2fb3e0 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -6298,9 +6298,9 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> >
> > span_avg = sd->span_weight * avg_idle;
> > if (span_avg > 4*avg_cost)
> > - nr = div_u64(span_avg, avg_cost);
> > + nr = div_u64(span_avg, avg_cost) - 1;
> > else
> > - nr = 4;
> > + nr = 3;
> >
> > time = cpu_clock(this);
> > }
> > @@ -6312,11 +6312,11 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, bool
> > return i;
> >
> > } else {
> > - if (!--nr)
> > - return -1;
> > idle_cpu = __select_idle_cpu(cpu, p);
> > if ((unsigned int)idle_cpu < nr_cpumask_bits)
> > break;
> > + if (!--nr)
> > + return -1;
> > }
> > }
>
> That's just confusing code. Isn't it much clearer to write the whole
> thing like so ?
>
> nr--;

this is fine to avoid the code of setting 4 to 3 and setting
div_u64(span_avg, avg_cost) to
div_u64(span_avg, avg_cost) - 1;

> for_each_cpu_wrap(cpu, cpus, target+1) {
> ...
> if (!nr--)

I guess you mean if(!--nr).

For example, if nr=4, the original code will only check 3 cpus for
__select_idle_cpu.
the new code "nr--" will check 4 cpus for __select_idle_cpu. to keep the amount
untouched, the code should be --nr.

> return -1;
> }
>

Thanks
Barry