[tip: sched/core] sched: Fix hotplug vs CPU bandwidth control

From: tip-bot2 for Peter Zijlstra
Date: Wed Nov 11 2020 - 03:24:20 EST


The following commit has been merged into the sched/core branch of tip:

Commit-ID: 120455c514f7321981c907a01c543b05aff3f254
Gitweb: https://git.kernel.org/tip/120455c514f7321981c907a01c543b05aff3f254
Author: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
AuthorDate: Fri, 25 Sep 2020 16:42:31 +02:00
Committer: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
CommitterDate: Tue, 10 Nov 2020 18:38:59 +01:00

sched: Fix hotplug vs CPU bandwidth control

Since we now migrate tasks away before DYING, we should also move
bandwidth unthrottle, otherwise we can gain tasks from unthrottle
after we expect all tasks to be gone already.

Also, it looks like the RT balancers don't respect cpu_active() and
instead rely in part on rq->online. Complete that reliance by checking
rq->online in the pull paths. This too requires that we call
set_rq_offline() earlier, to match the cpu_active() semantics.
(The bigger patch is to convert RT to cpu_active() entirely.)

Since set_rq_online() is called from sched_cpu_activate(), place
set_rq_offline() in sched_cpu_deactivate().
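
As a rough illustration of the ordering argument above, here is a small
userspace sketch in C. It is not kernel code: every toy_* name is
hypothetical, "unthrottle" is reduced to moving a counter, and migration
is reduced to leaving a single task on the runqueue. It only assumes what
the text above says, namely that taking the rq offline can release
throttled tasks, and that exactly one task is expected to remain by the
time the DYING step runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; none of these names exist in the kernel. */
struct toy_rq {
	bool active;       /* stands in for cpu_active(cpu) */
	bool online;       /* stands in for rq->online */
	int nr_running;    /* runnable tasks on this rq */
	int nr_throttled;  /* tasks parked behind a throttled group */
};

/* Models set_rq_offline(): going offline releases throttled tasks. */
static void toy_set_rq_offline(struct toy_rq *rq)
{
	if (!rq->online)
		return;
	rq->nr_running += rq->nr_throttled;
	rq->nr_throttled = 0;
	rq->online = false;
}

/* Models the migration that now happens before DYING: one task stays. */
static void toy_migrate_away(struct toy_rq *rq)
{
	if (rq->nr_running > 1)
		rq->nr_running = 1;
}

/* Models the rq->online gate added to the RT and DL pull checks. */
static bool toy_need_pull(const struct toy_rq *rq, bool prio_lowered)
{
	return rq->online && prio_lowered;
}

/*
 * Runs one hotplug-down sequence and returns nr_running as seen by the
 * final check in the DYING step (the kernel expects exactly 1 there).
 * offline_early == true models this patch, false models the old order.
 */
static int toy_hotplug(bool offline_early)
{
	struct toy_rq rq = {
		.active = true, .online = true,
		.nr_running = 3, .nr_throttled = 2,
	};

	/* deactivate step */
	rq.active = false;
	if (offline_early) {
		toy_set_rq_offline(&rq);
		/* an offline rq must not try to pull work */
		assert(!toy_need_pull(&rq, true));
	}

	/* tasks are pushed away before DYING */
	toy_migrate_away(&rq);

	/* dying step: the old order only went offline here */
	if (!offline_early)
		toy_set_rq_offline(&rq);

	return rq.nr_running;
}
```

In this model toy_hotplug(true) ends with one runnable task, while
toy_hotplug(false) ends with three, because the throttled tasks are
released only after migration has already run.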

Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Reviewed-by: Valentin Schneider <valentin.schneider@xxxxxxx>
Reviewed-by: Daniel Bristot de Oliveira <bristot@xxxxxxxxxx>
Link: https://lkml.kernel.org/r/20201023102346.639538965@xxxxxxxxxxxxx
---
 kernel/sched/core.c     | 14 ++++++++++----
 kernel/sched/deadline.c |  2 +-
 kernel/sched/rt.c       |  2 +-
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 6c89806..dcb88a0 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6977,6 +6977,8 @@ int sched_cpu_activate(unsigned int cpu)

 int sched_cpu_deactivate(unsigned int cpu)
 {
+	struct rq *rq = cpu_rq(cpu);
+	struct rq_flags rf;
 	int ret;

 	set_cpu_active(cpu, false);
@@ -6991,6 +6993,14 @@ int sched_cpu_deactivate(unsigned int cpu)

 	balance_push_set(cpu, true);

+	rq_lock_irqsave(rq, &rf);
+	if (rq->rd) {
+		update_rq_clock(rq);
+		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
+		set_rq_offline(rq);
+	}
+	rq_unlock_irqrestore(rq, &rf);
+
 #ifdef CONFIG_SCHED_SMT
 	/*
 	 * When going down, decrement the number of cores with SMT present.
@@ -7072,10 +7082,6 @@ int sched_cpu_dying(unsigned int cpu)
 	sched_tick_stop(cpu);

 	rq_lock_irqsave(rq, &rf);
-	if (rq->rd) {
-		BUG_ON(!cpumask_test_cpu(cpu, rq->rd->span));
-		set_rq_offline(rq);
-	}
 	BUG_ON(rq->nr_running != 1);
 	rq_unlock_irqrestore(rq, &rf);

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index f232305..77880fe 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -543,7 +543,7 @@ static int push_dl_task(struct rq *rq);

 static inline bool need_pull_dl_task(struct rq *rq, struct task_struct *prev)
 {
-	return dl_task(prev);
+	return rq->online && dl_task(prev);
 }

 static DEFINE_PER_CPU(struct callback_head, dl_push_head);
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 49ec096..40a4663 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -265,7 +265,7 @@ static void pull_rt_task(struct rq *this_rq);
 static inline bool need_pull_rt_task(struct rq *rq, struct task_struct *prev)
 {
 	/* Try to pull RT tasks here if we lower this rq's prio */
-	return rq->rt.highest_prio.curr > prev->prio;
+	return rq->online && rq->rt.highest_prio.curr > prev->prio;
 }

 static inline int rt_overloaded(struct rq *rq)