Re: [PATCH RFC] sched/fair: let cpu's cfs_rq to reflect task migration

From: Morten Rasmussen
Date: Wed Apr 06 2016 - 04:34:34 EST


On Tue, Apr 05, 2016 at 06:00:40PM +0100, Dietmar Eggemann wrote:
> @@ -2893,8 +2906,12 @@ static void attach_entity_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
> se->avg.last_update_time = cfs_rq->avg.last_update_time;
> cfs_rq->avg.load_avg += se->avg.load_avg;
> cfs_rq->avg.load_sum += se->avg.load_sum;
> - cfs_rq->avg.util_avg += se->avg.util_avg;
> - cfs_rq->avg.util_sum += se->avg.util_sum;
> +
> + if (!entity_is_task(se))
> + return;
> +
> + rq_of(cfs_rq)->cfs.avg.util_avg += se->avg.util_avg;
> + rq_of(cfs_rq)->cfs.avg.util_sum += se->avg.util_sum;

To me it seems that you cannot be sure that the rq_of(cfs_rq)->cfs.avg
time stamp is aligned with the se->avg time stamp, which is necessary
before you can add/subtract two geometric series without introducing an
error.
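
Purely to make that explicit, the invariant that would have to hold right
before the += above is something like this (illustrative only, not a
suggestion to add the check as-is):

	WARN_ON_ONCE(rq_of(cfs_rq)->cfs.avg.last_update_time !=
		     se->avg.last_update_time);

and since the patch has just set se->avg.last_update_time from the
attaching cfs_rq, that boils down to the root cfs_rq being in sync with
the cfs_rq the task is attached to, which I don't think is guaranteed
here.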

attach_entity_load_avg() is called (through a couple of other functions)
from the for_each_sched_entity() loop in enqueue_task_fair(), which works
its way towards the root cfs_rq, i.e. rq_of(cfs_rq)->cfs. So in the loop
iteration where you attach the task sched_entity, we haven't yet visited
and updated rq_of(cfs_rq)->cfs.avg.
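
For reference, this is roughly the path I mean (heavily simplified from
enqueue_task_fair(), so treat it as a sketch rather than the exact code):

	for_each_sched_entity(se) {
		if (se->on_rq)
			break;
		cfs_rq = cfs_rq_of(se);
		/*
		 * enqueue_entity() -> enqueue_entity_load_avg() ->
		 * attach_entity_load_avg() for a migrated task, i.e. with
		 * the patch the task util is added to rq_of(cfs_rq)->cfs
		 * here, before the walk has reached and updated the root
		 * cfs_rq's own sched_avg.
		 */
		enqueue_entity(cfs_rq, se, flags);
	}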

If you just add the task contribution and later discover a time delta
when you update rq_of(cfs_rq)->cfs.avg, you end up decaying the task
contribution even though it was already up-to-date, so its util
contribution to rq_of(cfs_rq)->cfs.avg ends up smaller than it should
be.
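
To put a rough number on it (userspace back-of-the-envelope, and the 300
for util_avg is just a made-up figure): with the PELT decay factor y,
where y^32 = 0.5 per ~1024us period,

	#include <stdio.h>
	#include <math.h>

	int main(void)
	{
		double y = pow(0.5, 1.0 / 32.0);	/* ~0.97857 */
		unsigned int task_util = 300;		/* made-up util_avg */

		/* decay the freshly added contribution over 1..4 periods */
		for (unsigned int periods = 1; periods <= 4; periods++)
			printf("%u periods behind: %u -> %.1f\n",
			       periods, task_util,
			       task_util * pow(y, periods));
		return 0;
	}

so even a couple of ms of misalignment shaves a few percent off the
task's util contribution at the root.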

Am I missing something?