[PATCH v3 8/9] sched/fair: Split select_task_rq_fair want_affine logic

From: Valentin Schneider
Date: Wed Apr 15 2020 - 17:07:37 EST


The domain loop within select_task_rq_fair() depends on a few bits of
input, namely the SD flag we're looking for and whether we want_affine.

For !want_affine, the domain loop will walk up the hierarchy to reach the
highest domain with the requested sd_flag (SD_BALANCE_{WAKE, FORK, EXEC})
set. In other words, that's a call to highest_flag_domain().
Note that this is static information wrt a given SD hierarchy, so we can
cache that - but that comes in a later patch to ease reviewing.
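
As a rough userspace sketch of that walk (not the kernel code itself): the
struct below is a stripped-down stand-in for struct sched_domain, the flag
value is made up, and the walk starts from a base domain pointer rather
than a CPU number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag value, for illustration only. */
#define SD_BALANCE_WAKE 0x1

/* Minimal stand-in for the kernel's struct sched_domain. */
struct sched_domain {
	struct sched_domain *parent;	/* next level up, NULL at the top */
	int flags;
};

/*
 * Walk up from the base domain and return the highest domain that has
 * 'flag' set, stopping at the first level where it is clear.
 * Returns NULL if the base domain itself lacks the flag.
 */
static struct sched_domain *
highest_flag_domain(struct sched_domain *base, int flag)
{
	struct sched_domain *sd, *hsd = NULL;

	assert(flag != 0);

	for (sd = base; sd; sd = sd->parent) {
		if (!(sd->flags & flag))
			break;
		hsd = sd;
	}

	return hsd;
}
```

The result only depends on the flags of the hierarchy, not on the task
being placed, which is why it lends itself to caching.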

For want_affine, we'll walk up the hierarchy to reach the first domain with
SD_LOAD_BALANCE, SD_WAKE_AFFINE, and that spans the task's prev_cpu. We
still save a pointer to the last visited domain that had the requested
sd_flag set, which means that if we fail to go through the affine
condition (e.g. no domain had SD_WAKE_AFFINE) we'll use the same SD as we
would have found if we had !want_affine.

Split the domain loop into !want_affine and want_affine paths. As it is,
this leads to two domain walks instead of a single one, but stay tuned for
the next patch.
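
For illustration, here is a simplified userspace model of the single walk
and of the split walk. It is a sketch under assumptions, not the kernel
implementation: the struct, the flag values, the bitmask span and the
pick_sd_*() helpers are hypothetical, the wake_affine() call is left out,
and a NULL return stands for "affine fast path taken (sd = NULL)".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values, for illustration only. */
#define SD_BALANCE_WAKE 0x1
#define SD_WAKE_AFFINE  0x2

/* Minimal stand-in for struct sched_domain; span is a CPU bitmask. */
struct sched_domain {
	struct sched_domain *parent;
	int flags;
	unsigned long span;
};

static int spans_cpu(const struct sched_domain *sd, int cpu)
{
	assert(cpu >= 0 && cpu < (int)(8 * sizeof(sd->span)));
	return (sd->span >> cpu) & 1UL;
}

/* Model of the loop before this patch: one walk handles both cases. */
static struct sched_domain *
pick_sd_single_walk(struct sched_domain *base, int prev_cpu,
		    int sd_flag, int want_affine)
{
	struct sched_domain *tmp, *sd = NULL;

	for (tmp = base; tmp; tmp = tmp->parent) {
		if (want_affine && (tmp->flags & SD_WAKE_AFFINE) &&
		    spans_cpu(tmp, prev_cpu))
			return NULL;	/* prefer the affine fast path */

		if (tmp->flags & sd_flag)
			sd = tmp;
		else if (!want_affine)
			break;
	}

	return sd;
}

static struct sched_domain *
highest_flag_domain(struct sched_domain *base, int flag)
{
	struct sched_domain *sd, *hsd = NULL;

	for (sd = base; sd; sd = sd->parent) {
		if (!(sd->flags & flag))
			break;
		hsd = sd;
	}

	return hsd;
}

/* Model of the loop after this patch: flag lookup, then affine walk. */
static struct sched_domain *
pick_sd_split_walk(struct sched_domain *base, int prev_cpu,
		   int sd_flag, int want_affine)
{
	struct sched_domain *tmp, *sd = highest_flag_domain(base, sd_flag);

	if (!want_affine)
		return sd;

	for (tmp = base; tmp; tmp = tmp->parent) {
		if ((tmp->flags & SD_WAKE_AFFINE) && spans_cpu(tmp, prev_cpu))
			return NULL;	/* prefer the affine fast path */
	}

	return sd;
}
```

In this model, the two variants pick the same domain as long as sd_flag is
set contiguously from the base domain upwards; hierarchies where that does
not hold are outside what the sketch tries to show.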

Signed-off-by: Valentin Schneider <valentin.schneider@xxxxxxx>
---
kernel/sched/fair.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index f20e5cd6515c..6f8cdb99f4a0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6656,26 +6656,33 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int wake_flags)
 	}

 	rcu_read_lock();
+
+	sd = highest_flag_domain(cpu, sd_flag);
+
+	/*
+	 * If !want_affine, we just look for the highest domain where
+	 * sd_flag is set.
+	 */
+	if (!want_affine)
+		goto scan;
+
+	/*
+	 * Otherwise we look for the lowest domain with SD_WAKE_AFFINE and that
+	 * spans both 'cpu' and 'prev_cpu'.
+	 */
 	for_each_domain(cpu, tmp) {
-		/*
-		 * If both 'cpu' and 'prev_cpu' are part of this domain,
-		 * cpu is a valid SD_WAKE_AFFINE target.
-		 */
-		if (want_affine && (tmp->flags & SD_WAKE_AFFINE) &&
+		if ((tmp->flags & SD_WAKE_AFFINE) &&
 		    cpumask_test_cpu(prev_cpu, sched_domain_span(tmp))) {
 			if (cpu != prev_cpu)
 				new_cpu = wake_affine(tmp, p, cpu, prev_cpu, sync);

-			sd = NULL; /* Prefer wake_affine over balance flags */
+			/* Prefer wake_affine over SD lookup */
+			sd = NULL;
 			break;
 		}
-
-		if (tmp->flags & sd_flag)
-			sd = tmp;
-		else if (!want_affine)
-			break;
 	}

+scan:
 	if (unlikely(sd)) {
 		/* Slow path */
 		new_cpu = find_idlest_cpu(sd, p, cpu, prev_cpu, sd_flag);
--
2.24.0