[PATCH 0/4] Throttle select_idle_sibling when a target domain is overloaded

From: Mel Gorman
Date: Fri Mar 20 2020 - 11:13:00 EST


This is a follow-on from the CPU/NUMA load balancer reconciliation,
written after I noticed that select_idle_sibling() was doing excessive work. It
was originally part of a larger series that merged select_idle_core(),
select_idle_sibling() and select_idle_cpu() into a single pass. Unfortunately,
fixes have invalidated the tests multiple times so this series covers
only one part for now as the tests are extremely time-consuming.

tip/sched/core as of March 13th was used as the baseline, with "sched/fair:
fix condition of avg_load calculation" applied on top. That patch had just
been picked up by tip at the time of writing.

Patches 1-2 add schedstats to track the efficiency of
select_idle_sibling(). Ordinarily they are disabled and are only really
of use to a kernel developer. However, I find them more practical to work
with than perf.
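
As a rough illustration of the kind of accounting such stats allow, here
is a minimal userspace sketch. The struct, field and function names are
hypothetical and are not taken from the patches; it only shows how a
scan-cost figure could be derived from a few counters.

```c
#include <stdbool.h>

/* Hypothetical counters; the names are illustrative, not the kernel's. */
struct toy_sis_stats {
	unsigned long searches;	/* calls that had to search for a CPU */
	unsigned long scanned;	/* runqueues examined across all searches */
	unsigned long failed;	/* searches that found no idle CPU */
};

/* Record the outcome of one search. */
static void toy_sis_account(struct toy_sis_stats *s,
			    unsigned long nr_scanned, bool found)
{
	s->searches++;
	s->scanned += nr_scanned;
	if (!found)
		s->failed++;
}

/*
 * Average runqueues scanned per search, scaled by 100 so the result
 * stays in integer arithmetic. Returns 0 if nothing was recorded.
 */
static unsigned long toy_sis_scan_cost_x100(const struct toy_sis_stats *s)
{
	return s->searches ? (s->scanned * 100) / s->searches : 0;
}
```

A falling scan cost with an unchanged failure count would suggest that
fewer runqueues are being examined without missing idle CPUs.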

Patch 3 is a trivial micro-optimisation that avoids clearing part of
a cpumask if a core has been found.
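
The idea can be sketched in userspace as follows. This is a toy model
under stated assumptions, not the patch itself: CPUs are bits in a
64-bit word, each core has two SMT siblings with adjacent CPU numbers,
and every toy_* name is hypothetical.

```c
#include <stdint.h>

#define TOY_SMT_WIDTH 2	/* assumed siblings per core */

/* Mask of all SMT siblings sharing a core with @cpu (toy topology). */
static uint64_t toy_smt_mask(int cpu)
{
	int first = cpu - (cpu % TOY_SMT_WIDTH);

	return ((UINT64_C(1) << TOY_SMT_WIDTH) - 1) << first;
}

/*
 * Return a CPU on a fully idle core, or -1 if there is none.
 * @idle:         bit set means the CPU is idle
 * @candidates:   CPUs still to be searched
 * @nr_cpus:      at most 64, a multiple of TOY_SMT_WIDTH
 * @mask_updates: incremented each time siblings are cleared
 */
static int toy_select_idle_core(uint64_t idle, uint64_t candidates,
				int nr_cpus, unsigned int *mask_updates)
{
	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		uint64_t siblings;

		if (!(candidates & (UINT64_C(1) << cpu)))
			continue;

		siblings = toy_smt_mask(cpu);

		/* Idle core found: return before touching the mask. */
		if ((idle & siblings) == siblings)
			return cpu;

		/* Core is busy: drop its siblings from the search. */
		candidates &= ~siblings;
		(*mask_updates)++;
	}

	return -1;
}
```

The mask is only updated for cores that turn out to be busy, so a
successful search skips one update that would never be used.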

Patch 4 tracks whether a domain appeared to be overloaded during
select_idle_cpu() so that future scans can abort early if necessary.
This reduces the number of runqueues that are scanned uselessly when
a domain is overloaded.
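
A minimal userspace sketch of the throttling idea is below. It is not
the kernel implementation: the toy_* names are hypothetical, and the
rule used here for clearing the hint (a CPU going idle) is an assumption
made only to keep the example complete.

```c
#include <stdbool.h>

#define TOY_NR_CPUS 8

/* Hypothetical stand-in for per-domain state; not a kernel struct. */
struct toy_domain {
	unsigned int nr_running[TOY_NR_CPUS];	/* runnable tasks per CPU */
	bool overloaded;	/* hint: the last full scan found nothing */
	unsigned long scanned;	/* runqueues examined so far */
};

/* Return an idle CPU, or -1 if none was found or the scan was skipped. */
static int toy_select_idle_cpu(struct toy_domain *d)
{
	/* The domain looked overloaded last time: abort early. */
	if (d->overloaded)
		return -1;

	for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		d->scanned++;
		if (d->nr_running[cpu] == 0)
			return cpu;
	}

	/* Every runqueue was busy: remember that for future scans. */
	d->overloaded = true;
	return -1;
}

/* Assumed clearing rule: a CPU going idle makes scanning useful again. */
static void toy_cpu_went_idle(struct toy_domain *d, int cpu)
{
	d->nr_running[cpu] = 0;
	d->overloaded = false;
}
```

In this model, a second search on a saturated domain examines no
runqueues at all, which is the saving the patch is after.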

include/linux/sched/topology.h | 1 +
kernel/sched/debug.c | 6 +++
kernel/sched/fair.c | 103 +++++++++++++++++++++++++++++++++++------
kernel/sched/features.h | 3 ++
kernel/sched/sched.h | 8 ++++
kernel/sched/stats.c | 9 ++--
6 files changed, 113 insertions(+), 17 deletions(-)

--
2.16.4