[tip:sched/core] sched/fair: Fix switched_to_fair()' s per entity load tracking

From: tip-bot for Byungchul Park
Date: Sun Sep 13 2015 - 07:00:26 EST


Commit-ID: 6efdb105d392da3ad5cb4ef951aed373cd049813
Gitweb: http://git.kernel.org/tip/6efdb105d392da3ad5cb4ef951aed373cd049813
Author: Byungchul Park <byungchul.park@xxxxxxx>
AuthorDate: Thu, 20 Aug 2015 20:21:59 +0900
Committer: Ingo Molnar <mingo@xxxxxxxxxx>
CommitDate: Sun, 13 Sep 2015 09:52:48 +0200

sched/fair: Fix switched_to_fair()'s per entity load tracking

Where switched_from_fair() will remove the entity's load from the
runqueue, switched_to_fair() does not currently add it back. This
means that when a task leaves the fair class for a short duration; say
because of PI; we loose its load contribution.

This can ripple forward and disturb the load tracking because other
operations (enqueue, dequeue) assume its factored in. Only once the
runqueue empties will the load tracking recover.

When we add it back in, age the per entity average to match up with
the runqueue age. This has the obvious problem that if the task leaves
the fair class for a significant time, the load will age to 0.

Employ the normal migration rule for inter-runqueue moves in
task_move_group_fair(). Again, there is the obvious problem of the
task migrating while not in the fair class.

The alternative solution would be to to omit the chunk in
attach_entity_load_avg(), which would effectively reset the timestamp
and use whatever avg there was.

Signed-off-by: Byungchul Park <byungchul.park@xxxxxxx>
[ Rewrote the changelog and comments. ]
Signed-off-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Mike Galbraith <efault@xxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: yuyang.du@xxxxxxxxx
Link: http://lkml.kernel.org/r/1440069720-27038-5-git-send-email-byungchul.park@xxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
---
kernel/sched/fair.c | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1e1fe7f..5143ea0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2712,6 +2712,20 @@ static inline void update_load_avg(struct sched_entity *se, int update_tg)

static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *se)
{
+ /*
+ * If we got migrated (either between CPUs or between cgroups) we'll
+ * have aged the average right before clearing @last_update_time.
+ */
+ if (se->avg.last_update_time) {
+ __update_load_avg(cfs_rq->avg.last_update_time, cpu_of(rq_of(cfs_rq)),
+ &se->avg, 0, 0, NULL);
+
+ /*
+ * XXX: we could have just aged the entire load away if we've been
+ * absent from the fair class for too long.
+ */
+ }
+
se->avg.last_update_time = cfs_rq->avg.last_update_time;
cfs_rq->avg.load_avg += se->avg.load_avg;
cfs_rq->avg.load_sum += se->avg.load_sum;
@@ -7945,6 +7959,9 @@ static void switched_to_fair(struct rq *rq, struct task_struct *p)
se->depth = se->parent ? se->parent->depth + 1 : 0;
#endif

+ /* Synchronize task with its cfs_rq */
+ attach_entity_load_avg(cfs_rq_of(&p->se), &p->se);
+
if (!task_on_rq_queued(p)) {

/*
@@ -8044,6 +8061,12 @@ static void task_move_group_fair(struct task_struct *p, int queued)
/* Synchronize task with its prev cfs_rq */
detach_entity_load_avg(cfs_rq, se);
set_task_rq(p, task_cpu(p));
+
+#ifdef CONFIG_SMP
+ /* Tell se's cfs_rq has been changed -- migrated */
+ p->se.avg.last_update_time = 0;
+#endif
+
se->depth = se->parent ? se->parent->depth + 1 : 0;
cfs_rq = cfs_rq_of(se);
if (!queued)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/