Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig

From: bsegall
Date: Thu Sep 10 2015 - 13:25:55 EST


Morten Rasmussen <morten.rasmussen@xxxxxxx> writes:

> On Wed, Sep 09, 2015 at 03:23:43PM -0700, bsegall@xxxxxxxxxx wrote:
>> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
>>
>> > On Tue, Sep 08, 2015 at 03:31:58PM +0100, Morten Rasmussen wrote:
>> >> On Tue, Sep 08, 2015 at 02:52:05PM +0200, Peter Zijlstra wrote:
>> >> > > Tricky that, LOAD_AVG_MAX very much relies on the unit being 1<<10.
>> >>
>> >> I don't get why LOAD_AVG_MAX relies on the util_avg shifting being
>> >> 1<<10, it is just the sum of the geometric series and the upper bound of
>> >> util_sum?
>> >
>> > It needs a 1024, it might just have been the 1024 ns we use a period
>> > instead of the scale unit though.
>> >
>> > The LOAD_AVG_MAX is the number where adding a next element to the series
>> > doesn't change the result anymore, so scaling it up will allow more
>> > significant elements to the series before we bottom out, which is the _N
>> > thing.
>> >
>>
>> Yes, as the comments say, the 1024ns unit is arbitrary (and is an
>> average of not-quite-microseconds instead of just nanoseconds to allow
>> more bits to load.weight when we multiply load.weight by this number).
>> In fact there are two arbitrary 1024 units here, which are technically
>> unrelated and are both unrelated to SCHED_LOAD_RESOLUTION/etc - we
>> operate on units of almost-microseconds and we also do decays every
>> almost-millisecond.
>>
>> There appears to be a bunch of confusion in the current code around
>> util_sum/util_avg which appears to using SCHED_LOAD_SCALE
>> for a fixed-point percentage or something, which is at least reasonable,
>> but is initializing it as scale_load_down(SCHED_LOAD_SCALE), which
>> results in either initializing as 100% or .1% depending on RESOLUTION.
>> This'll get clobbered on first update, but if it needs to be
>> initialized, it should either get initialized to something sane or at
>> least consistent.
>
> This is what I thought too. The whole geometric series math is completely
> independent of the scale used for priority in load_avg and the fixed
> point shifting used for util_avg.
>
>> load_sum/load_avg appear to be scale_load_down()ed properly, and appear
>> to be used as such at a quick glance.
>
> I don't think shifting by SCHED_LOAD_SHIFT in __update_load_avg() is
> right:
>
> sa->util_avg = (sa->util_sum << SCHED_LOAD_SHIFT) / LOAD_AVG_MAX;
>
> util_avg is initialized to low resolution (>> SCHED_LOAD_RESOLUTION):
>
> sa->util_avg = scale_load_down(SCHED_LOAD_SCALE);
>
> so it appear to be intended to be using low resolution like load_avg
> (weight is scaled down before it is passed into __update_load_avg()),
> but util_avg is shifted up to high resolution. It should be:
>
> sa->util_avg = (sa->util_sum << (SCHED_LOAD_SHIFT -
> SCHED_LOAD_SHIFT)) / LOAD_AVG_MAX;
>
> to be consistent.

Yeah, util_avg was/is screwed up in terms of either the initialization
or which shift to use there. The load ones however appear to be fine.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/