Re: [Patch v2 1/6] sched/fair: Determine active load balance for SMT sched groups

From: Peter Zijlstra
Date: Mon Jun 12 2023 - 07:37:08 EST


On Thu, Jun 08, 2023 at 03:32:27PM -0700, Tim Chen wrote:

> +/* One group has more than one SMT CPU while the other group does not */
> +static inline bool smt_vs_nonsmt_groups(struct sched_group *sg1,
> + struct sched_group *sg2)
> +{
> + if (!sg1 || !sg2)
> + return false;
> +
> + return (sg1->flags & SD_SHARE_CPUCAPACITY) !=
> + (sg2->flags & SD_SHARE_CPUCAPACITY);
> +}
> +
> +static inline bool smt_balance(struct lb_env *env, struct sg_lb_stats *sgs,
> + struct sched_group *group)
> +{
> + if (env->idle == CPU_NOT_IDLE)
> + return false;
> +
> + /*
> + * For SMT source group, it is better to move a task
> + * to a CPU that doesn't have multiple tasks sharing its CPU capacity.
> + * Note that if a group has a single SMT, SD_SHARE_CPUCAPACITY
> + * will not be on.
> + */
> + if (group->flags & SD_SHARE_CPUCAPACITY &&
> + sgs->sum_h_nr_running > 1)
> + return true;

AFAICT this does the right thing for SMT>2.

> +
> + return false;
> +}
> +
> static inline bool
> sched_reduced_capacity(struct rq *rq, struct sched_domain *sd)
> {

> @@ -9537,6 +9581,18 @@ static bool update_sd_pick_busiest(struct lb_env *env,
> break;
>
> case group_has_spare:
> + /*
> + * Do not pick sg with SMT CPUs over sg with pure CPUs,
> + * as we do not want to pull task off half empty SMT core
> + * and make the core idle.
> + */
> + if (smt_vs_nonsmt_groups(sds->busiest, sg)) {
> + if (sg->flags & SD_SHARE_CPUCAPACITY)
> + return false;
> + else
> + return true;
> + }

However, here I'm not at all sure. Consider SMT-4 with 2 active CPUs: we
would still very much like to pull one task off if we have an idle core
somewhere, no?
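
To make the concern concrete, here is a rough userspace sketch of just the
quoted group_has_spare hunk. Everything prefixed toy_ is made up for
illustration, the flag value is not the kernel's, and the rest of
update_sd_pick_busiest() (including how the group would be classified in the
first place) is not modelled, so take it as a sketch of the tie-break only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the kernel flag; the value is an assumption. */
#define SD_SHARE_CPUCAPACITY 0x1

/* Hypothetical, heavily reduced stand-in for struct sched_group. */
struct toy_group {
	int flags;
	unsigned int nr_cpus;     /* CPUs (SMT siblings) in the group */
	unsigned int nr_running;  /* tasks currently running in the group */
};

/* Same test as smt_vs_nonsmt_groups() in the quoted patch. */
static bool toy_smt_vs_nonsmt(const struct toy_group *sg1,
			      const struct toy_group *sg2)
{
	if (!sg1 || !sg2)
		return false;

	return (sg1->flags & SD_SHARE_CPUCAPACITY) !=
	       (sg2->flags & SD_SHARE_CPUCAPACITY);
}

/*
 * Models only the new hunk under "case group_has_spare":
 *   0  -> sg is rejected as busiest
 *   1  -> sg is picked as busiest
 *  -1  -> the hunk does not decide; the existing checks would run
 */
static int toy_pick_has_spare(const struct toy_group *busiest,
			      const struct toy_group *sg)
{
	if (toy_smt_vs_nonsmt(busiest, sg)) {
		if (sg->flags & SD_SHARE_CPUCAPACITY)
			return 0;
		return 1;
	}

	return -1;
}
```

With an SMT-4 group running 2 tasks as sg and a non-SMT group as the current
busiest, toy_pick_has_spare() returns 0: nr_running is never consulted, so
the SMT group is skipped even though two tasks share the core.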