Re: [PATCH v4] sched/fair: Consider cpu affinity when allowing NUMA imbalance in find_idlest_group

From: K Prateek Nayak
Date: Thu Feb 17 2022 - 06:24:17 EST


Hello Mel,


Thank you for looking into the patch.

On 2/17/2022 3:35 PM, Mel Gorman wrote:
> Thanks Prateek,
>
> On Thu, Feb 17, 2022 at 11:24:08AM +0530, K Prateek Nayak wrote:
>> [..snip..]
>>
>> Eg: numactl -C 0,16,32,48,64,80,96,112 ./stream8
>>
> In this case the stream threads can use any CPU of the subset, presumably
> this is parallelised with OpenMP without specifying spread or bind
> directives.
Yes, it is parallelized using OpenMP without specifying any spread or bind
directive.
> [..snip..]
> One concern I have is that we incur a cpumask setup and cpumask_weight
> cost on every clone whether a restricted CPU mask is used or not. Peter,
> is it acceptable to avoid the cpumask check if there are no restrictions
> on allowed cpus like this?
>
> 	imb = sd->imb_numa_nr;
> 	if (p->nr_cpus_allowed != num_online_cpus()) {
> 		struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>
> 		cpumask_and(cpus, sched_group_span(local), p->cpus_ptr);
> 		imb = min(cpumask_weight(cpus), imb);
> 	}
Can we optimize this further as:

	imb = sd->imb_numa_nr;
	if (unlikely(p->nr_cpus_allowed != num_online_cpus())) {
		struct cpumask *cpus = this_cpu_cpumask_var_ptr(select_idle_mask);

		cpumask_and(cpus, sched_group_span(local), p->cpus_ptr);
		imb = min(cpumask_weight(cpus), imb);
	}

For the most part, p->nr_cpus_allowed will be equal to num_online_cpus()
unless the user has specifically pinned the task, so the branch should
rarely be taken.
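
To make the intended behaviour concrete, here is a rough userspace sketch of
the same clamp. It is only a model, not kernel code: cpumask64_t,
mask_weight() and numa_imbalance_limit() are hypothetical names, the mask is
limited to 64 CPUs, and unlikely() is redefined on top of the GCC/Clang
builtin since the kernel macro is not available in userspace.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's unlikely() hint (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical model of a cpumask: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpumask64_t;

/* Number of CPUs set in the mask (models cpumask_weight()). */
static int mask_weight(cpumask64_t m)
{
	return __builtin_popcountll(m);
}

/*
 * Model of the proposed check: start from the domain's allowed NUMA
 * imbalance and, only when the task is restricted to a subset of the
 * online CPUs, clamp it to the number of allowed CPUs in the local group.
 */
static int numa_imbalance_limit(int imb_numa_nr, cpumask64_t group_span,
				cpumask64_t allowed, int nr_online)
{
	int imb = imb_numa_nr;

	assert(nr_online > 0 && nr_online <= 64);

	/* Common case: task is not pinned, skip the mask work entirely. */
	if (unlikely(mask_weight(allowed) != nr_online)) {
		int weight = mask_weight(group_span & allowed);

		if (weight < imb)
			imb = weight;
	}

	return imb;
}
```

With 16 online CPUs, a local group spanning CPUs 0-7 and an allowed
imbalance of 4, an unpinned task keeps the limit of 4, while a task pinned to
CPUs 0 and 8 has only one allowed CPU in the local group and gets a limit
of 1.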

--
Thanks and Regards,
Prateek