Re: [PATCH 2/2 v3] sched: Rewrite per entity runnable load average tracking

From: bsegall
Date: Wed Jul 16 2014 - 14:53:34 EST


Morten Rasmussen <morten.rasmussen@xxxxxxx> writes:

> On Wed, Jul 16, 2014 at 02:50:47AM +0100, Yuyang Du wrote:
>
> [...]
>
>> +/*
>> + * Update load_avg of the cfs_rq along with its own se. They should get
>> + * synchronized: group se's load_avg is used for task_h_load calc, and
>> + * group cfs_rq's load_avg is used for task_h_load (and update_cfs_share
>> + * calc).
>> + */
>> +static inline int update_cfs_rq_load_avg(u64 now, struct cfs_rq *cfs_rq)
>> {
>> - long old_contrib = se->avg.load_avg_contrib;
>> + int decayed;
>>
>> - if (entity_is_task(se)) {
>> - __update_task_entity_contrib(se);
>> - } else {
>> - __update_tg_runnable_avg(&se->avg, group_cfs_rq(se));
>> - __update_group_entity_contrib(se);
>> + if (atomic_long_read(&cfs_rq->removed_load_avg)) {
>> + long r = atomic_long_xchg(&cfs_rq->removed_load_avg, 0);
>> + cfs_rq->avg.load_avg = subtract_until_zero(cfs_rq->avg.load_avg, r);
>> + r *= LOAD_AVG_MAX;
>> + cfs_rq->avg.load_sum = subtract_until_zero(cfs_rq->avg.load_sum, r);
>> }
>>
>> - return se->avg.load_avg_contrib - old_contrib;
>> -}
>> + decayed = __update_load_avg(now, &cfs_rq->avg, cfs_rq->load.weight);
>> +#ifndef CONFIG_64BIT
>> + if (cfs_rq->avg.last_update_time != cfs_rq->load_last_update_time_copy)
>> + sa_q->last_update_time_copy = sa_q->last_update_time;
>
> This doesn't build on 32 bit. You need:
>
> - sa_q->last_update_time_copy = sa_q->last_update_time;
> + cfs_rq->load_last_update_time_copy = cfs_rq->avg.last_update_time;
>
> to make it build. But I'm not convinced that this synchronization is
> right.
>
> First let me say that I'm not an expert on synchronization. It seems to
> me that there is nothing preventing reordering of the writes in
> __update_load_avg() which sets cfs_rq->avg.last_update_time and the
> update of cfs_rq->load_last_update_time_copy.

You're correct, this needs to be if (...) { smp_wmb(); copy = time; },
the same as in update_min_vruntime().
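
To spell the pattern out, here is a rough userspace sketch, not the actual
patch. It assumes C11 fences as stand-ins for smp_wmb()/smp_rmb(), and the
struct and function names (load_clock, load_clock_publish, load_clock_read)
are hypothetical, made up for illustration only. The reader side mirrors the
retry loop that 32-bit readers of min_vruntime use.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the two 64-bit fields discussed above. */
struct load_clock {
	_Atomic uint64_t last_update_time;
	_Atomic uint64_t last_update_time_copy;
};

/* Writer: store the time, then publish the copy behind a write barrier. */
static void load_clock_publish(struct load_clock *c, uint64_t now)
{
	atomic_store_explicit(&c->last_update_time, now, memory_order_relaxed);

	if (atomic_load_explicit(&c->last_update_time_copy,
				 memory_order_relaxed) != now) {
		/* Assumption: release fence plays the role of smp_wmb(). */
		atomic_thread_fence(memory_order_release);
		atomic_store_explicit(&c->last_update_time_copy, now,
				      memory_order_relaxed);
	}
}

/* Reader: retry until the time and its copy agree. */
static uint64_t load_clock_read(struct load_clock *c)
{
	uint64_t copy, time;

	do {
		copy = atomic_load_explicit(&c->last_update_time_copy,
					    memory_order_relaxed);
		/* Assumption: acquire fence plays the role of smp_rmb(). */
		atomic_thread_fence(memory_order_acquire);
		time = atomic_load_explicit(&c->last_update_time,
					    memory_order_relaxed);
	} while (time != copy);

	return time;
}
```

Without the barrier on the writer side, the copy could become visible before
the time itself, and the reader's comparison would no longer prove that it
saw a complete 64-bit value.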