Re: [PATCH] sched/fair: Fix the wrong throttled clock time for cfs_rq_clock_task()

From: Xunlei Pang
Date: Mon May 30 2016 - 05:37:52 EST


Ping

On 2016/05/12 at 11:36, Xunlei Pang wrote:
> On 2016/05/11 at 14:49, Peter Zijlstra wrote:
>> On Tue, May 10, 2016 at 11:19:44AM -0700, bsegall@xxxxxxxxxx wrote:
>>> Xunlei Pang <xlpang@xxxxxxxxxx> writes:
>>>
>>>> Two minor fixes for cfs_rq_clock_task().
>>>> 1) If cfs_rq is currently being throttled, we need to subtract the cfs
>>>> throttled clock time.
>>>>
>>>> 2) Make "throttled_clock_task_time" update SMP unrelated. Now UP cases
>>>> need it as well.
>>>>
>>>> Signed-off-by: Xunlei Pang <xlpang@xxxxxxxxxx>
>>>> ---
>>>> kernel/sched/fair.c | 4 +---
>>>> 1 file changed, 1 insertion(+), 3 deletions(-)
>>>>
>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>> index 1708729e..fb80a12 100644
>>>> --- a/kernel/sched/fair.c
>>>> +++ b/kernel/sched/fair.c
>>>> @@ -3655,7 +3655,7 @@ static inline struct cfs_bandwidth *tg_cfs_bandwidth(struct task_group *tg)
>>>> static inline u64 cfs_rq_clock_task(struct cfs_rq *cfs_rq)
>>>> {
>>>> if (unlikely(cfs_rq->throttle_count))
>>>> - return cfs_rq->throttled_clock_task;
>>>> + return cfs_rq->throttled_clock_task - cfs_rq->throttled_clock_task_time;
>>>>
>>>> return rq_clock_task(rq_of(cfs_rq)) - cfs_rq->throttled_clock_task_time;
>>>> }
>> The alternative is obviously to do the subtraction in
>> tg_throttle_down(), were we set ->throttled_clock_task.
> It is possible, but throttled_clock_task is a timestamp, I think doing it here is semantically better.
>
>>>> @@ -3793,13 +3793,11 @@ static int tg_unthrottle_up(struct task_group *tg, void *data)
>>>> struct cfs_rq *cfs_rq = tg->cfs_rq[cpu_of(rq)];
>>>>
>>>> cfs_rq->throttle_count--;
>>>> -#ifdef CONFIG_SMP
>>>> if (!cfs_rq->throttle_count) {
>>>> /* adjust cfs_rq_clock_task() */
>>>> cfs_rq->throttled_clock_task_time += rq_clock_task(rq) -
>>>> cfs_rq->throttled_clock_task;
>>>> }
>>>> -#endif
>>>>
>>>> return 0;
>>>> }
>>> [Cc: pjt@xxxxxxxxxx]
>>>
>>> This looks reasonable to me (at least the first part; I'm not
>>> certain why the CONFIG_SMP ifdef was put in place).
>> 64660c864f46 ("sched: Prevent interactions with throttled entities")
>>
>> Introduced it, because at that time it was about updating shares, which
>> is only present on SMP. Then:
>>
>> f1b17280efbd ("sched: Maintain runnable averages across throttled periods")
>>
>> Added the clock thing inside it, and:
>>
>> 82958366cfea ("sched: Replace update_shares weight distribution with per-entity computation")
>>
>> took out the shares update and left the clock update, resulting in the
>> current code.
>>
>>
> Thanks,
> Xunlei
>