Re: [PATCH 2/2] sched/fair: Improve the for loop in select_idle_core()

From: Peter Zijlstra
Date: Mon Feb 11 2019 - 05:44:33 EST


On Mon, Feb 11, 2019 at 03:56:59PM +0530, Viresh Kumar wrote:
> On 11-02-19, 10:30, Peter Zijlstra wrote:
> > On Thu, Feb 07, 2019 at 04:16:06PM +0530, Viresh Kumar wrote:
> > > @@ -6081,10 +6082,14 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> > > for_each_cpu_wrap(core, cpus, target) {
> > > bool idle = true;
> > >
> > > - for_each_cpu(cpu, cpu_smt_mask(core)) {
> > > - cpumask_clear_cpu(cpu, cpus);
> > > - if (!available_idle_cpu(cpu))
> > > + smt = cpu_smt_mask(core);
> > > + cpumask_andnot(cpus, cpus, smt);
> >
> > So where the previous code was like 1-2 stores, you just added 16.
>
> Is the max number of possible threads per core just 2? That's what I
> read just now and I wasn't aware of that earlier. This commit doesn't
> improve anything then. Sorry for the noise.

We've got up to SMT8 in the tree (Sparc64, Power8 and some MIPS IIRC),
but that's still less than having to touch the entire bitmap.

Also, Power9 went back to SMT4 and I think the majority of SMT deployments
is that or less.
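
To make the store-count argument above concrete, here is a minimal userspace
sketch, not the kernel code. It assumes a 1024-bit mask (a hypothetical
DEMO_NR_CPUS, which would give 16 words with 64-bit longs and so matches the
"16" quoted above only under that assumption). All demo_* names are invented
for illustration; they stand in for, and do not reproduce, the kernel's
cpumask helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: a 1024-CPU mask; the real size depends on the kernel config. */
#define DEMO_NR_CPUS 1024u
#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_WORDS \
	((DEMO_NR_CPUS + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD)

/* Hypothetical stand-in for a cpumask: a plain array of words. */
struct demo_mask {
	unsigned long bits[DEMO_WORDS];
};

static void demo_set_cpu(struct demo_mask *m, unsigned int cpu)
{
	assert(cpu < DEMO_NR_CPUS);
	m->bits[cpu / DEMO_BITS_PER_WORD] |= 1UL << (cpu % DEMO_BITS_PER_WORD);
}

static int demo_test_cpu(const struct demo_mask *m, unsigned int cpu)
{
	assert(cpu < DEMO_NR_CPUS);
	return (m->bits[cpu / DEMO_BITS_PER_WORD] >>
		(cpu % DEMO_BITS_PER_WORD)) & 1UL;
}

/*
 * Old approach: clear each SMT sibling individually.
 * Writes one word per sibling; returns the number of word stores.
 */
static size_t demo_clear_siblings(struct demo_mask *cpus,
				  const struct demo_mask *smt)
{
	size_t stores = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
		if (!demo_test_cpu(smt, cpu))
			continue;
		cpus->bits[cpu / DEMO_BITS_PER_WORD] &=
			~(1UL << (cpu % DEMO_BITS_PER_WORD));
		stores++;
	}
	return stores;
}

/*
 * Proposed approach: dst = src & ~smt over the whole bitmap.
 * Writes every word of the mask regardless of how many siblings exist.
 */
static size_t demo_andnot(struct demo_mask *dst, const struct demo_mask *src,
			  const struct demo_mask *smt)
{
	size_t stores = 0;
	size_t i;

	for (i = 0; i < DEMO_WORDS; i++) {
		dst->bits[i] = src->bits[i] & ~smt->bits[i];
		stores++;
	}
	return stores;
}
```

Both helpers leave the same bits in the mask. The difference is that
demo_clear_siblings() scales with the number of threads per core (2 for
SMT2, at most 8 for SMT8), while demo_andnot() always scales with the size
of the bitmap.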