Re: [PATCH v2 1/7] sched/fair: Generalize asym_packing logic for SMT local sched group

From: Valentin Schneider
Date: Thu Dec 22 2022 - 11:57:11 EST


On 22/11/22 12:35, Ricardo Neri wrote:
> @@ -8926,25 +8924,16 @@ static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> }
>
> - /* @dst_cpu has SMT siblings. */
> -
> - if (sg_is_smt) {
> - int local_busy_cpus = sds->local->group_weight -
> - sds->local_stat.idle_cpus;
> - int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> -
> - if (busy_cpus_delta == 1)
> - return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> -
> - return false;
> - }
> -
> /*
> - * @sg does not have SMT siblings. Ensure that @sds::local does not end
> - * up with more than one busy SMT sibling and only pull tasks if there
> - * are not busy CPUs (i.e., no CPU has running tasks).
> + * @dst_cpu has SMT siblings. Do asym_packing load balancing only if
> + * all its siblings are idle (moving tasks between physical cores in
> + * which some SMT siblings are busy results in the same throughput).
> + *
> + * If the difference in the number of busy CPUs is two or more, let
> + * find_busiest_group() take care of it. We only care if @sg has
> + * exactly one busy CPU. This covers SMT and non-SMT sched groups.
> */
> - if (!sds->local_stat.sum_nr_running)
> + if (sg_busy_cpus == 1 && !sds->local_stat.sum_nr_running)
> return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
>

Some of this is new to me - I had missed the original series introducing
this. However ISTM that this is conflating two concepts: SMT occupancy
balancing, and asym packing.

Take the !local_is_smt :: sg_busy_cpus >= 2 :: return true; path: it does
not involve asym_packing priorities at all, yet it can end up in an
ASYM_PACKING load balance, which per calculate_imbalance() tries to move
*all* tasks to the higher-priority CPU. For SMT balancing we don't want to
empty the core entirely, just its siblings.

Is there an ITMT/big.LITTLE (or however x86 calls it) case that invalidates
the above?
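To make the distinction concrete, here's a standalone sketch of the two
policies as I see them (hypothetical helper names and simplified group
stats, not the kernel's actual structures or API):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative per-group stats, a stand-in for struct sg_lb_stats. */
struct grp_stats {
	int weight;	/* CPUs in the group */
	int idle_cpus;
	int nr_running;
};

static int busy_cpus(const struct grp_stats *g)
{
	return g->weight - g->idle_cpus;
}

/*
 * SMT occupancy balancing: even out sibling occupancy by pulling a
 * single task, leaving the busiest core's remaining sibling alone.
 */
static int smt_occupancy_pull(const struct grp_stats *local_grp,
			      const struct grp_stats *busiest)
{
	int delta = busy_cpus(busiest) - busy_cpus(local_grp);

	return delta >= 2 ? 1 : 0;	/* move one task, not the whole core */
}

/*
 * Asym packing: priorities decide, and calculate_imbalance() moves
 * *all* of the busiest group's tasks to the higher-priority CPU.
 */
static int asym_packing_pull(bool dst_higher_prio,
			     const struct grp_stats *busiest)
{
	return dst_higher_prio ? busiest->nr_running : 0;
}
```

With a fully idle local core and a fully busy remote one, the first policy
moves one task while the second empties the core, which is the behaviour
I'm objecting to above.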

Say, why isn't the below sufficient? AFAICT the only thing that isn't
covered is the sg_busy_cpus >= 2 case, but IMO that's SMT balancing, not
asym packing - if the current calculate_imbalance() doesn't handle it, it
should be fixed to do so. Looking at the

local->group_type == group_has_spare

case, it looks like it should DTRT.
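Roughly, what I expect that path to compute (a simplified model of the
idle-CPU-delta imbalance, not a faithful copy of calculate_imbalance()):

```c
#include <assert.h>

/*
 * Simplified model of the group_has_spare case: the imbalance is half
 * the idle-CPU delta between local and busiest, so a fully idle 2-wide
 * local core pulling from a fully busy 2-wide core moves one task and
 * leaves one sibling running - i.e. it DTRT for SMT balancing.
 */
static int has_spare_imbalance(int local_idle, int busiest_idle)
{
	int delta = (local_idle - busiest_idle) / 2;

	return delta > 0 ? delta : 0;
}
```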

---
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 224107278471f..15eb2d3cff186 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -9176,12 +9176,15 @@ static inline bool
sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs,
struct sched_group *group)
{
- /* Only do SMT checks if either local or candidate have SMT siblings */
- if ((sds->local->flags & SD_SHARE_CPUCAPACITY) ||
- (group->flags & SD_SHARE_CPUCAPACITY))
- return asym_smt_can_pull_tasks(env->dst_cpu, sds, sgs, group);
+ /*
+ * For SMT, env->idle != CPU_NOT_IDLE isn't sufficient, we need to make
+ * sure the whole core is idle.
+ */
+ if (((sds->local->flags & SD_SHARE_CPUCAPACITY) ||
+ (group->flags & SD_SHARE_CPUCAPACITY)) &&
+ !test_idle_cores(env->dst_cpu))
+ return false;

- /* Neither env::dst_cpu nor group::asym_prefer_cpu have SMT siblings. */
return sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu, false);
}