[tip:sched/urgent] sched/core: Correct off by one bug in load migration calculation

From: tip-bot for Thomas Gleixner
Date: Wed Jul 13 2016 - 09:41:54 EST


Commit-ID: d60585c5766e9620d5d83e2b25dc042c7bdada2c
Gitweb: http://git.kernel.org/tip/d60585c5766e9620d5d83e2b25dc042c7bdada2c
Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
AuthorDate: Tue, 12 Jul 2016 18:33:56 +0200
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Wed, 13 Jul 2016 14:58:20 +0200

sched/core: Correct off by one bug in load migration calculation

The move of calc_load_migrate() from CPU_DEAD to CPU_DYING did not take into
account that the function is now called from a thread running on the outgoing
CPU. As a result a cpu unplug leakes a load of 1 into the global load
accounting mechanism.

Fix it by adjusting for the currently running thread which calls
calc_load_migrate().

Reported-by: Anton Blanchard <anton@xxxxxxxxx>
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Acked-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Michael Ellerman <mpe@xxxxxxxxxxxxxx>
Cc: Vaidyanathan Srinivasan <svaidy@xxxxxxxxxxxxxxxxxx>
Cc: rt@xxxxxxxxxxxxx
Cc: shreyas@xxxxxxxxxxxxxxxxxx
Fixes: e9cd8fa4fcfd: ("sched/migration: Move calc_load_migrate() into CPU_DYING")
Link: alpine.DEB.2.11.1607121744350.4083@nanos">http://lkml.kernel.org/r/alpine.DEB.2.11.1607121744350.4083@nanos
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/core.c | 6 ++++--
kernel/sched/loadavg.c | 8 ++++----
kernel/sched/sched.h | 2 +-
3 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 51d7105..97ee9ac 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5394,13 +5394,15 @@ void idle_task_exit(void)
/*
* Since this CPU is going 'away' for a while, fold any nr_active delta
* we might have. Assumes we're called after migrate_tasks() so that the
- * nr_active count is stable.
+ * nr_active count is stable. We need to take the teardown thread which
+ * is calling this into account, so we hand in adjust = 1 to the load
+ * calculation.
*
* Also see the comment "Global load-average calculations".
*/
static void calc_load_migrate(struct rq *rq)
{
- long delta = calc_load_fold_active(rq);
+ long delta = calc_load_fold_active(rq, 1);
if (delta)
atomic_long_add(delta, &calc_load_tasks);
}
diff --git a/kernel/sched/loadavg.c b/kernel/sched/loadavg.c
index b0b93fd..a2d6eb7 100644
--- a/kernel/sched/loadavg.c
+++ b/kernel/sched/loadavg.c
@@ -78,11 +78,11 @@ void get_avenrun(unsigned long *loads, unsigned long offset, int shift)
loads[2] = (avenrun[2] + offset) << shift;
}

-long calc_load_fold_active(struct rq *this_rq)
+long calc_load_fold_active(struct rq *this_rq, long adjust)
{
long nr_active, delta = 0;

- nr_active = this_rq->nr_running;
+ nr_active = this_rq->nr_running - adjust;
nr_active += (long)this_rq->nr_uninterruptible;

if (nr_active != this_rq->calc_load_active) {
@@ -188,7 +188,7 @@ void calc_load_enter_idle(void)
* We're going into NOHZ mode, if there's any pending delta, fold it
* into the pending idle delta.
*/
- delta = calc_load_fold_active(this_rq);
+ delta = calc_load_fold_active(this_rq, 0);
if (delta) {
int idx = calc_load_write_idx();

@@ -389,7 +389,7 @@ void calc_global_load_tick(struct rq *this_rq)
if (time_before(jiffies, this_rq->calc_load_update))
return;

- delta = calc_load_fold_active(this_rq);
+ delta = calc_load_fold_active(this_rq, 0);
if (delta)
atomic_long_add(delta, &calc_load_tasks);

diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 7cbeb92..898c0d2 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -28,7 +28,7 @@ extern unsigned long calc_load_update;
extern atomic_long_t calc_load_tasks;

extern void calc_global_load_tick(struct rq *this_rq);
-extern long calc_load_fold_active(struct rq *this_rq);
+extern long calc_load_fold_active(struct rq *this_rq, long adjust);

#ifdef CONFIG_SMP
extern void cpu_load_update_active(struct rq *this_rq);