[tip:sched/core] sched: Fix cgroup movement of forking process

From: tip-bot for Daisuke Nishimura
Date: Wed Dec 21 2011 - 06:44:45 EST


Commit-ID: 4fc420c91f53e0a9f95665c6b14a1983716081e7
Gitweb: http://git.kernel.org/tip/4fc420c91f53e0a9f95665c6b14a1983716081e7
Author: Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx>
AuthorDate: Thu, 15 Dec 2011 14:36:55 +0900
Committer: Ingo Molnar <mingo@xxxxxxx>
CommitDate: Wed, 21 Dec 2011 10:34:49 +0100

sched: Fix cgroup movement of forking process

There is a small race between task_fork_fair() and a concurrent
sched_move_task() that is moving the forking parent to another cgroup.

task_fork_fair()                | sched_move_task()
--------------------------------+---------------------------------
cfs_rq = task_cfs_rq(current)   |
  -> cfs_rq is the "old" one.   |
curr = cfs_rq->curr             |
  -> curr is set to the parent. |
                                | task_rq_lock()
                                | dequeue_task()
                                |   ->parent.se.vruntime -= (old)cfs_rq->min_vruntime
                                | enqueue_task()
                                |   ->parent.se.vruntime += (new)cfs_rq->min_vruntime
                                | task_rq_unlock()
raw_spin_lock_irqsave(rq->lock) |
se->vruntime = curr->vruntime   |
  -> vruntime of the child is set to that of the parent
     which has already been updated by sched_move_task().
se->vruntime -= (old)cfs_rq->min_vruntime.
raw_spin_unlock_irqrestore(rq->lock)

As a result, vruntime of the child becomes far bigger than expected,
if (new)cfs_rq->min_vruntime >> (old)cfs_rq->min_vruntime.

This patch fixes the problem by setting "cfs_rq" and "curr" only after
rq->lock is held.

Signed-off-by: Daisuke Nishimura <nishimura@xxxxxxxxxxxxxxxxx>
Acked-by: Paul Turner <pjt@xxxxxxxxxx>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Link: http://lkml.kernel.org/r/20111215143655.662676b0.nishimura@xxxxxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
kernel/sched/fair.c | 7 +++++--
1 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cea2fa8..525d69e 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5190,8 +5190,8 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
  */
 static void task_fork_fair(struct task_struct *p)
 {
-	struct cfs_rq *cfs_rq = task_cfs_rq(current);
-	struct sched_entity *se = &p->se, *curr = cfs_rq->curr;
+	struct cfs_rq *cfs_rq;
+	struct sched_entity *se = &p->se, *curr;
 	int this_cpu = smp_processor_id();
 	struct rq *rq = this_rq();
 	unsigned long flags;
@@ -5200,6 +5200,9 @@ static void task_fork_fair(struct task_struct *p)
 
 	update_rq_clock(rq);
 
+	cfs_rq = task_cfs_rq(current);
+	curr = cfs_rq->curr;
+
 	if (unlikely(task_cpu(p) != this_cpu)) {
 		rcu_read_lock();
 		__set_task_cpu(p, this_cpu);