Re: [PATCH v6 4/4] sched/fair: Don't double balance_interval for migrate_misfit

From: Vincent Guittot
Date: Fri Feb 23 2024 - 04:32:31 EST


On Tue, 20 Feb 2024 at 23:56, Qais Yousef <qyousef@xxxxxxxxxxx> wrote:
>
> A misfit migration is not necessarily an indication that the system is
> busy and that the load balancer needs to back off. Pushing
> balance_interval high could instead delay other misfit migrations or
> other types of imbalance.
>
> Also don't pollute nr_balance_failed because of misfit failures. The
> value is used for enabling cache hot migration and in the
> migrate_util/load types, none of which should be skewed by misfit failures.
>
> Signed-off-by: Qais Yousef <qyousef@xxxxxxxxxxx>

Reviewed-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>

> ---
> kernel/sched/fair.c | 13 +++++++++++--
> 1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 20006fcf7df2..4c1235a5dd60 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -11467,8 +11467,12 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> * We do not want newidle balance, which can be very
> * frequent, pollute the failure counter causing
> * excessive cache_hot migrations and active balances.
> + *
> + * Similarly for migrate_misfit, which is not related to
> + * load/util migration, don't pollute nr_balance_failed.
> */
> - if (idle != CPU_NEWLY_IDLE)
> + if (idle != CPU_NEWLY_IDLE &&
> + env.migration_type != migrate_misfit)
> sd->nr_balance_failed++;
>
> if (need_active_balance(&env)) {
> @@ -11551,8 +11555,13 @@ static int load_balance(int this_cpu, struct rq *this_rq,
> * repeatedly reach this code, which would lead to balance_interval
> * skyrocketing in a short amount of time. Skip the balance_interval
> * increase logic to avoid that.
> + *
> + * Similarly, a misfit migration is not necessarily an indication
> + * that the system is busy and that lb needs to back off to let it
> + * settle down.
> */
> - if (env.idle == CPU_NEWLY_IDLE)
> + if (env.idle == CPU_NEWLY_IDLE ||
> + env.migration_type == migrate_misfit)
> goto out;
>
> /* tune up the balancing interval */
> --
> 2.34.1
>
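
For readers following the thread, the two conditions changed by the quoted
hunks can be modelled in userspace roughly as below. This is only a sketch:
every toy_* name is hypothetical and stands in for the kernel's types, the two
helpers correspond to two different places in load_balance(), and the doubling
rule is reduced to a single "below max_interval" check, which is an assumption
made for brevity and not the full kernel condition.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel enums; not the real definitions. */
enum toy_idle { TOY_NOT_IDLE, TOY_IDLE, TOY_NEWLY_IDLE };
enum toy_migration {
	TOY_MIGRATE_LOAD,
	TOY_MIGRATE_UTIL,
	TOY_MIGRATE_TASK,
	TOY_MIGRATE_MISFIT,
};

/* Hypothetical, heavily reduced view of a sched domain. */
struct toy_sd {
	unsigned int nr_balance_failed;
	unsigned int balance_interval;
	unsigned int max_interval;
};

/*
 * First hunk: a failed balance only bumps the failure counter when it is
 * neither a newidle balance nor a misfit migration.
 */
static void toy_note_failure(struct toy_sd *sd, enum toy_idle idle,
			     enum toy_migration type)
{
	if (idle != TOY_NEWLY_IDLE && type != TOY_MIGRATE_MISFIT)
		sd->nr_balance_failed++;
}

/*
 * Second hunk: newidle balances and misfit migrations skip the interval
 * backoff. The doubling rule here is simplified (assumption).
 */
static bool toy_tune_interval(struct toy_sd *sd, enum toy_idle idle,
			      enum toy_migration type)
{
	if (idle == TOY_NEWLY_IDLE || type == TOY_MIGRATE_MISFIT)
		return false;	/* corresponds to "goto out" */

	if (sd->balance_interval < sd->max_interval)
		sd->balance_interval *= 2;
	return true;
}
```

With a toy_sd starting at balance_interval 8 and max_interval 32, a misfit
call to toy_tune_interval() leaves the interval at 8, while a load-type call
doubles it to 16. Likewise, toy_note_failure() leaves nr_balance_failed
untouched for misfit and increments it for the load type.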