Re: [PATCH] sched/pelt: sync util/runnable_sum with PELT window when propagating

From: Dietmar Eggemann
Date: Fri Apr 24 2020 - 08:07:45 EST


On 23/04/2020 18:17, Vincent Guittot wrote:
> On Thu, 23 Apr 2020 at 16:30, Dietmar Eggemann <dietmar.eggemann@xxxxxxx> wrote:
>>
>> On 22/04/2020 17:14, Vincent Guittot wrote:

[..]

>> gcfs --> tg --> gse
>> ________________|
>> |
>> V
>>
>> cfs ---> tg (root)
>>
>> |
>> V
>>
>> rq
>>
>
> child cfs_rq aka gcfs_rq
> |
> gse: group entity that represents child cfs_rq in parent cfs_rq
> |
> v
> parent cfs_rq aka cfs_rq

OK, I see. Maybe it's clearer to refer to child cfs_rq as gcfs_rq in
this context.

[...]

>>> /* Set new sched_entity's utilization */
>>> se->avg.util_avg = gcfs_rq->avg.util_avg;
>>> - se->avg.util_sum = se->avg.util_avg * LOAD_AVG_MAX;
>>> + se->avg.util_sum = se->avg.util_avg * divider;
>>
>> divider uses cfs_rq but we sync se->avg.util_avg with gcfs_rq here.
>
> we sync the util_avg of gse with the new util_avg of gcfs_rq but gse
> is attached to cfs_rq and as a result we have to use cfs_rq's
> period_contrib

I agree.

But the decay windows (avg.last_update_time, avg.period_contrib) of
cfs_rq and gcfs_rq are not always aligned, I guess?

I see they are not after the online_fair_sched_group() ->
attach_entity_cfs_rq() but later the are in sync as well.

I ran a couple of different rt-app taskgroup tests and try to

BUG_ON(se->avg.period_contrib != gcfs_rq->avg.period_contrib);
BUG_ON(se->avg.last_update_time != gcfs_rq->avg.last_update_time)

in update_tg_cfs_util() but they didn't trigger so far.

Both, cfs_rq and gcfs_rq are in sync in update_tg_cfs_util() before and
during a task runs in gcfs_rq.

Are there cases where this wouldn't necessary happen in
update_tg_cfs_util(), maybe a more complicated testcase?

>> But since avg.period_contrib of cfs_rq and gcfs_rq are the same this
>> should work.
>>
>>> /* Update parent cfs_rq utilization */
>>> add_positive(&cfs_rq->avg.util_avg, delta);
>>> - cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * LOAD_AVG_MAX;
>>> + cfs_rq->avg.util_sum = cfs_rq->avg.util_avg * divider;
>>
>> Looks like that avg.last_update_time of se (group entity), it's gcfs_rq
>> and cfs_rq is always the same in update_tg_cfs_[util\|runnable].
>>
>> So that means the PELT windows are aligned for cfs_rqs and group se's?
>
> We want to align util_avg with util_sum and period_contrib otherwise
> we might have some unalignment. It's quite similarly to what is done
> in attach_entity_load_avg()

I agree.

>> And if we want to enforce this for cfs_rq and task, we have
>> sync_entity_load_avg().
>
> It's not a matter of syncing the last_update_time

I agree, this is not what you want to achieve here.
But syncing 'last_update_time' and 'period_contrib' is what I understand
as aligning the decay window (like in attach_entity_load_avg()).