Re: [PATCH] sched: Consider capacity for certain load balancing decisions

From: Peter Zijlstra
Date: Fri Feb 03 2023 - 04:51:29 EST


On Tue, Jan 31, 2023 at 05:20:32PM -0800, Xi Wang wrote:
> After load balancing was split into different scenarios, CPU capacity
> is ignored for the "migrate_task" case, which means a thread can stay
> on a softirq heavy cpu for an extended amount of time.
>
> By comparing nr_running/capacity instead of just nr_running we can add
> CPU capacity back into "migrate_task" decisions. This benefits
> workloads running on machines with heavy network traffic. The change
> is unlikely to cause serious problems for other workloads but maybe
> some corner cases still need to be considered.
>
> Signed-off-by: Xi Wang <xii@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 0f8736991427..aad14bc04544 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -10368,8 +10368,9 @@ static struct rq *find_busiest_queue(struct lb_env *env,
>  			break;
>
>  		case migrate_task:
> -			if (busiest_nr < nr_running) {
> +			if (busiest_nr * capacity < nr_running * busiest_capacity) {
>  				busiest_nr = nr_running;
> +				busiest_capacity = capacity;
>  				busiest = rq;
>  			}
>  			break;

I don't think this is correct. The migrate_task case is work-conserving,
and I think your change can severely break that.
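
To make the concern concrete, here is a toy userspace model of the two
selection rules. It is a sketch only, not kernel code: struct toy_rq and
both pick_* helpers are made up for illustration, and the initial values
(busiest_nr = 0, busiest_capacity = 1) are an assumption about what the
surrounding function would use. It shows one possible case where the
patched comparison prefers a runqueue with a single task over one that
has a task waiting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two per-runqueue values the hunk compares. */
struct toy_rq {
	unsigned int nr_running;	/* runnable tasks on this queue */
	unsigned long capacity;		/* remaining CPU capacity, 1024 = full */
};

/* Original rule: the queue with the most runnable tasks wins. */
static int pick_by_nr(const struct toy_rq *rqs, size_t n)
{
	unsigned int busiest_nr = 0;
	int busiest = -1;

	assert(rqs != NULL);
	for (size_t i = 0; i < n; i++) {
		if (busiest_nr < rqs[i].nr_running) {
			busiest_nr = rqs[i].nr_running;
			busiest = (int)i;
		}
	}
	return busiest;
}

/*
 * Patched rule: compare nr_running / capacity, cross-multiplied to
 * avoid a division. The starting busiest_capacity of 1 is assumed.
 */
static int pick_by_nr_per_capacity(const struct toy_rq *rqs, size_t n)
{
	unsigned long busiest_nr = 0, busiest_capacity = 1;
	int busiest = -1;

	assert(rqs != NULL);
	for (size_t i = 0; i < n; i++) {
		if (busiest_nr * rqs[i].capacity <
		    rqs[i].nr_running * busiest_capacity) {
			busiest_nr = rqs[i].nr_running;
			busiest_capacity = rqs[i].capacity;
			busiest = (int)i;
		}
	}
	return busiest;
}
```

With rq 0 = { nr_running 1, capacity 256 } and rq 1 = { nr_running 2,
capacity 1024 }, the original rule picks rq 1 and the patched rule picks
rq 0 (1/256 > 2/1024). In this toy setup an idle CPU pulling from rq 0
only relocates a task that was already running, while the second task on
rq 1 keeps waiting, so the number of busy CPUs does not go up.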