Re: [PATCH 5/6] sched/fair: Get rid of scaling utilization by capacity_orig

From: Morten Rasmussen
Date: Thu Sep 10 2015 - 08:06:32 EST


On Thu, Sep 10, 2015 at 01:11:01PM +0200, Vincent Guittot wrote:
> On 10 September 2015 at 13:06, Morten Rasmussen
> <morten.rasmussen@xxxxxxx> wrote:
> > On Wed, Sep 09, 2015 at 03:23:43PM -0700, bsegall@xxxxxxxxxx wrote:
> >> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
> >>
> >> > On Tue, Sep 08, 2015 at 03:31:58PM +0100, Morten Rasmussen wrote:
> >> >> On Tue, Sep 08, 2015 at 02:52:05PM +0200, Peter Zijlstra wrote:
> >> >> > > Tricky that, LOAD_AVG_MAX very much relies on the unit being 1<<10.
> >> >>
> >> >> I don't get why LOAD_AVG_MAX relies on the util_avg shifting being
> >> >> 1<<10, it is just the sum of the geometric series and the upper bound of
> >> >> util_sum?
> >> >
> >> > It needs a 1024, it might just have been the 1024 ns we use as a period
> >> > instead of the scale unit though.
> >> >
> >> > The LOAD_AVG_MAX is the number where adding a next element to the series
> >> > doesn't change the result anymore, so scaling it up will allow more
> >> > significant elements to the series before we bottom out, which is the _N
> >> > thing.
> >> >
> >>
> >> Yes, as the comments say, the 1024ns unit is arbitrary (and is an
> >> average of not-quite-microseconds instead of just nanoseconds to allow
> >> more bits to load.weight when we multiply load.weight by this number).
> >> In fact there are two arbitrary 1024 units here, which are technically
> >> unrelated and are both unrelated to SCHED_LOAD_RESOLUTION/etc - we
> >> operate on units of almost-microseconds and we also do decays every
> >> almost-millisecond.
> >>
> >> There appears to be a bunch of confusion in the current code around
> >> util_sum/util_avg which appears to be using SCHED_LOAD_SCALE
> >> for a fixed-point percentage or something, which is at least reasonable,
> >> but is initializing it as scale_load_down(SCHED_LOAD_SCALE), which
> >> results in either initializing as 100% or .1% depending on RESOLUTION.
> >> This'll get clobbered on first update, but if it needs to be
> >> initialized, it should either get initialized to something sane or at
> >> least consistent.
> >
> > This is what I thought too. The whole geometric series math is completely
> > independent of the scale used for priority in load_avg and the fixed
> > point shifting used for util_avg.
> >
> >> load_sum/load_avg appear to be scale_load_down()ed properly, and appear
> >> to be used as such at a quick glance.
> >
> > I don't think shifting by SCHED_LOAD_SHIFT in __update_load_avg() is
> > right:
> >
> > sa->util_avg = (sa->util_sum << SCHED_LOAD_SHIFT) / LOAD_AVG_MAX;
> >
> > util_avg is initialized to low resolution (>> SCHED_LOAD_RESOLUTION):
> >
> > sa->util_avg = scale_load_down(SCHED_LOAD_SCALE);
> >
> > so it appears to be intended to be using low resolution like load_avg
> > (weight is scaled down before it is passed into __update_load_avg()),
> > but util_avg is shifted up to high resolution. It should be:
> >
> > sa->util_avg = (sa->util_sum << (SCHED_LOAD_SHIFT -
> > SCHED_LOAD_SHIFT)) / LOAD_AVG_MAX;
>
> you probably mean (SCHED_LOAD_SHIFT - SCHED_LOAD_RESOLUTION)

Yes. Thanks for providing the right expression. There seems to be enough
confusion in this thread already :)
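
To spell the mismatch out, here is a minimal userspace sketch (not kernel
code). The helper names and the 'res' parameter are made up for
illustration; 'res' stands in for SCHED_LOAD_RESOLUTION (0 in the low
resolution build, 10 in the high resolution one), and LOAD_AVG_MAX is
assumed to be 47742 as in the current tree:

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_AVG_MAX 47742	/* upper bound of the geometric series */

/* Hypothetical stand-in for SCHED_LOAD_SHIFT = 10 + SCHED_LOAD_RESOLUTION */
static unsigned int load_shift(unsigned int res)
{
	return 10 + res;
}

/* scale_load_down(SCHED_LOAD_SCALE): always low resolution, i.e. 1024 */
static uint64_t util_avg_init(unsigned int res)
{
	return (UINT64_C(1) << load_shift(res)) >> res;
}

/* Current update: shifts up by the full SCHED_LOAD_SHIFT */
static uint64_t util_avg_current(uint64_t util_sum, unsigned int res)
{
	return (util_sum << load_shift(res)) / LOAD_AVG_MAX;
}

/* Corrected update: shift by (SCHED_LOAD_SHIFT - SCHED_LOAD_RESOLUTION) */
static uint64_t util_avg_fixed(uint64_t util_sum, unsigned int res)
{
	return (util_sum << (load_shift(res) - res)) / LOAD_AVG_MAX;
}

/* Compare the initial value with the value of a fully utilized entity */
static void self_check(void)
{
	/* Low resolution: both expressions agree with the initial value */
	assert(util_avg_current(LOAD_AVG_MAX, 0) == util_avg_init(0));
	assert(util_avg_fixed(LOAD_AVG_MAX, 0) == util_avg_init(0));
	/* High resolution: only the corrected expression agrees */
	assert(util_avg_fixed(LOAD_AVG_MAX, 10) == util_avg_init(10));
	assert(util_avg_current(LOAD_AVG_MAX, 10) != util_avg_init(10));
}
```

With res = 10 the initial value and the first update differ by a factor of
1024, which is the inconsistency discussed above.
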

> The goal of this patchset is to be able to scale util_avg in the range
> of cpu capacity so why don't we directly initialize it with
> sa->util_avg = SCHED_CAPACITY_SCALE;
>
> and then use
>
> sa->util_avg = (sa->util_sum << SCHED_CAPACITY_SHIFT) / LOAD_AVG_MAX;
>
> so we don't have to take care of high and low load resolution

That works for me, except that the left shift has been removed by PeterZ's
optimization patch posted earlier in this thread. That patch changes
util_sum to be scaled by capacity instead of being the pure geometric
series, which is what requires the left shift at the end when we divide by
LOAD_AVG_MAX. So it should be equivalent to what you are proposing if we
change the initialization to your proposal too.
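
A rough userspace sketch of why the two forms should be equivalent (not
kernel code). The function names are made up, LOAD_AVG_MAX is assumed to
be 47742, and the pre-scaled variant assumes every contribution is
multiplied by SCHED_CAPACITY_SCALE, i.e. a full-capacity cpu with no
further scaling:

```c
#include <assert.h>
#include <stdint.h>

#define LOAD_AVG_MAX		47742
#define SCHED_CAPACITY_SHIFT	10
#define SCHED_CAPACITY_SCALE	(UINT64_C(1) << SCHED_CAPACITY_SHIFT)

/* util_sum is the pure geometric series: shift when computing util_avg */
static uint64_t util_avg_shift_at_end(uint64_t util_sum)
{
	return (util_sum << SCHED_CAPACITY_SHIFT) / LOAD_AVG_MAX;
}

/* util_sum was already scaled by capacity while accumulating: no shift */
static uint64_t util_avg_prescaled(uint64_t scaled_util_sum)
{
	return scaled_util_sum / LOAD_AVG_MAX;
}

/* Walk the whole range of the series and check that both forms agree */
static void self_check(void)
{
	uint64_t sum;

	for (sum = 0; sum <= LOAD_AVG_MAX; sum++)
		assert(util_avg_shift_at_end(sum) ==
		       util_avg_prescaled(sum * SCHED_CAPACITY_SCALE));
}
```

Either way a fully utilized entity ends up at SCHED_CAPACITY_SCALE, which
matches initializing util_avg to SCHED_CAPACITY_SCALE.
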