Re: [RESEND PATCH] sched/fair: consider RT/IRQ pressure in select_idle_sibling

From: Rohit Jain
Date: Wed Jan 31 2018 - 12:46:55 EST


On 01/30/2018 05:57 PM, Joel Fernandes wrote:

<snip>
Signed-off-by: Rohit Jain <rohit.k.jain@xxxxxxxxxx>
---
kernel/sched/fair.c | 38 ++++++++++++++++++++++++++++----------
1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 26a71eb..ce5ccf8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5625,6 +5625,11 @@ static unsigned long capacity_orig_of(int cpu)
 	return cpu_rq(cpu)->cpu_capacity_orig;
 }
 
+static inline bool full_capacity(int cpu)
+{
+	return capacity_of(cpu) >= (capacity_orig_of(cpu)*3)/4;
+}
+
 static unsigned long cpu_avg_load_per_task(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
@@ -6081,7 +6086,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
 
 		for_each_cpu(cpu, cpu_smt_mask(core)) {
 			cpumask_clear_cpu(cpu, cpus);
-			if (!idle_cpu(cpu))
+			if (!idle_cpu(cpu) || !full_capacity(cpu))
 				idle = false;
 		}

There's some difference in logic between select_idle_core and
select_idle_cpu as far as the full_capacity stuff you're adding goes.
In select_idle_core, if all CPUs are !full_capacity, you're returning
-1. But in select_idle_cpu you're returning the idle CPU with the most
capacity among the !full_capacity ones. Why is there this difference
in logic? Did I miss something?


<snip>

Dude :) That is hardly an answer to the question I asked. Hint:
*difference in logic*.

Let me re-try :)

select_idle_core searches for a core that is both fully idle and at
full capacity. If that search fails, select_idle_cpu is the fail-safe,
because it re-scans the CPUs. The notion is to select an idle CPU no
matter what, because being on an idle CPU is better than waiting on a
non-idle one. Since select_idle_cpu is there as the fail-safe,
select_idle_core can afford to be slightly picky about the core it
accepts. I measured the performance impact of choosing the "best among
low cap" vs the code changes I have (for select_idle_core) and could
not find a statistically significant difference, hence I went with the
simpler code changes.
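
To make the two-stage search concrete, here is a minimal userspace
sketch of the idea, not the kernel code. The struct cpu_state type and
the pick_idle_core/pick_idle_cpu/pick_cpu helpers are hypothetical
names made up for illustration, and it assumes SMT siblings are
contiguous CPU numbers. Only the 3/4 threshold is taken from the patch;
the lenient pass follows the behaviour described in this thread (prefer
an idle full-capacity CPU, else the idle CPU with the most capacity).

```c
#include <stdbool.h>

/* Hypothetical, simplified per-CPU state; not the kernel's struct rq. */
struct cpu_state {
	bool idle;
	unsigned long capacity;      /* what is left after RT/IRQ pressure */
	unsigned long capacity_orig; /* capacity with no pressure at all */
};

/* Same 3/4 threshold as full_capacity() in the patch. */
static bool full_capacity(const struct cpu_state *c)
{
	return c->capacity >= (c->capacity_orig * 3) / 4;
}

/*
 * Strict pass: return the first CPU of a core whose siblings are all
 * idle and at full capacity, or -1 if no such core exists.
 * Assumes each core is smt_width contiguous CPUs.
 */
static int pick_idle_core(const struct cpu_state *cpus, int nr_cpus,
			  int smt_width)
{
	for (int core = 0; core + smt_width <= nr_cpus; core += smt_width) {
		bool ok = true;

		for (int cpu = core; cpu < core + smt_width; cpu++) {
			if (!cpus[cpu].idle || !full_capacity(&cpus[cpu]))
				ok = false;
		}
		if (ok)
			return core;
	}
	return -1;
}

/*
 * Lenient pass: the first idle full-capacity CPU wins; otherwise fall
 * back to the idle CPU with the most capacity. Returns -1 only when
 * no CPU is idle.
 */
static int pick_idle_cpu(const struct cpu_state *cpus, int nr_cpus)
{
	int best = -1;
	unsigned long best_cap = 0;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (!cpus[cpu].idle)
			continue;
		if (full_capacity(&cpus[cpu]))
			return cpu;
		if (best < 0 || cpus[cpu].capacity > best_cap) {
			best = cpu;
			best_cap = cpus[cpu].capacity;
		}
	}
	return best;
}

/* Picky core search first, lenient per-CPU re-scan as the fail-safe. */
static int pick_cpu(const struct cpu_state *cpus, int nr_cpus, int smt_width)
{
	int cpu = pick_idle_core(cpus, nr_cpus, smt_width);

	return cpu >= 0 ? cpu : pick_idle_cpu(cpus, nr_cpus);
}
```

With capacity_orig at 1024 the threshold works out to 768, so an idle
CPU left with 700 fails the strict pass but can still be picked by the
re-scan if nothing better is idle.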

Hope I answered your question.

Thanks,
Rohit