Re: [PATCH v2] sched: fix first task of a task group is attached twice

From: Dietmar Eggemann
Date: Fri May 27 2016 - 11:48:57 EST


On 25/05/16 16:01, Vincent Guittot wrote:
> The cfs_rq->avg.last_update_time is initialized to 0, with the main effect
> that the 1st sched_entity that is attached will keep its
> last_update_time set to 0 and will be attached once again during the
> enqueue.
> Initialize cfs_rq->avg.last_update_time to 1 instead.
>
> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> ---
>
> v2:
> - rq_clock_task(rq_of(cfs_rq)) can't be used because lock is not held
>
> kernel/sched/fair.c | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 218f8e8..3724656 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8586,6 +8586,14 @@ void init_tg_cfs_entry(struct task_group *tg, struct cfs_rq *cfs_rq,
>  		se->depth = parent->depth + 1;
>  	}
>
> +	/*
> +	 * Set last_update_time to something different from 0 to make
> +	 * sure the 1st sched_entity will not be attached twice: once
> +	 * when attaching the task to the group and one more time when
> +	 * enqueueing the task.
> +	 */
> +	tg->cfs_rq[cpu]->avg.last_update_time = 1;
> +
>  	se->my_q = cfs_rq;
>  	/* guarantee group entities always have weight */
>  	update_load_set(&se->load, NICE_0_LOAD);

So why not set the last_update_time value for those cfs_rq's at a point
where we hold the lock? E.g. in task_move_group_fair() or attach_task_cfs_rq().

@@ -8490,12 +8493,20 @@ void init_cfs_rq(struct cfs_rq *cfs_rq)
 #ifdef CONFIG_FAIR_GROUP_SCHED
 static void task_move_group_fair(struct task_struct *p)
 {
+#ifdef CONFIG_SMP
+	struct cfs_rq *cfs_rq = NULL;
+#endif
+
 	detach_task_cfs_rq(p);
 	set_task_rq(p, task_cpu(p));

 #ifdef CONFIG_SMP
 	/* Tell se's cfs_rq has been changed -- migrated */
 	p->se.avg.last_update_time = 0;
+
+	cfs_rq = cfs_rq_of(&p->se);
+	if (!cfs_rq->avg.last_update_time)
+		cfs_rq->avg.last_update_time = rq_clock_task(rq_of(cfs_rq));
 #endif

or

@@ -8423,6 +8423,9 @@ static void attach_task_cfs_rq(struct task_struct *p)
 	se->depth = se->parent ? se->parent->depth + 1 : 0;
 #endif

+	if (!cfs_rq->avg.last_update_time)
+		cfs_rq->avg.last_update_time = rq_clock_task(rq_of(cfs_rq));
+
 	/* Synchronize task with its cfs_rq */
 	attach_entity_load_avg(cfs_rq, se);
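
To make the double attach easier to see, here is a rough userspace model of the sequence, not kernel code. All model_* names and first_task_load() are made up for illustration. It assumes, as the changelog describes, that attaching copies the cfs_rq's last_update_time into the entity, and that the enqueue path attaches any entity whose last_update_time is still 0. The load value 1024 is arbitrary.

```c
/* Simplified stand-ins for sched_avg, cfs_rq and sched_entity (hypothetical). */
struct model_avg {
	unsigned long long last_update_time;
	unsigned long load_avg;
};

struct model_cfs_rq {
	struct model_avg avg;
};

struct model_se {
	struct model_avg avg;
};

/* Attach: sync the entity's timestamp with the cfs_rq and add its load. */
static void model_attach(struct model_cfs_rq *cfs_rq, struct model_se *se)
{
	se->avg.last_update_time = cfs_rq->avg.last_update_time;
	cfs_rq->avg.load_avg += se->avg.load_avg;
}

/* Enqueue: an entity with last_update_time == 0 is taken as migrated. */
static void model_enqueue(struct model_cfs_rq *cfs_rq, struct model_se *se)
{
	if (!se->avg.last_update_time)
		model_attach(cfs_rq, se);
}

/*
 * Attach and then enqueue the first task on a fresh cfs_rq, returning
 * the resulting cfs_rq load.
 *
 * initial_time: value the cfs_rq's last_update_time starts with.
 * now:          if non-zero, models the alternative above: set a
 *               still-zero cfs_rq timestamp from the clock before the
 *               attach, at a point where the lock is held.
 */
unsigned long first_task_load(unsigned long long initial_time,
			      unsigned long long now)
{
	struct model_cfs_rq cfs_rq = { { initial_time, 0 } };
	struct model_se se = { { 0, 1024 } };

	if (now && !cfs_rq.avg.last_update_time)
		cfs_rq.avg.last_update_time = now;

	model_attach(&cfs_rq, &se);
	model_enqueue(&cfs_rq, &se);

	return cfs_rq.avg.load_avg;
}
```

In this model, first_task_load(0, 0) returns 2048 (the load is counted twice), while first_task_load(1, 0) and first_task_load(0, 1000) both return 1024. The last case corresponds to initializing the timestamp from the clock in the attach path rather than hardcoding 1 at init time.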