Re: [PATCH] sched/fair: Make PELT signal more accurate

From: Vincent Guittot
Date: Mon Aug 07 2017 - 09:24:43 EST


Hi Joel,

On 4 August 2017 at 17:40, Joel Fernandes <joelaf@xxxxxxxxxx> wrote:
> The PELT signal (sa->load_avg and sa->util_avg) are not updated if the amount
> accumulated during a single update doesn't cross a period boundary. This is
> fine in cases where the amount accrued is much smaller than the size of a
> single PELT window (1ms) however if the amount accrued is high then the
> relative error (calculated against what the actual signal would be had we
> updated the averages) can be quite high - as much 3-6% in my testing. On
> plotting signals, I found that there are errors especially high when we update
> just before the period boundary is hit. These errors can be significantly
> reduced if we update the averages more often.
>
> Inorder to fix this, this patch does the average update by also checking how
> much time has elapsed since the last update and update the averages if it has
> been long enough (as a threshold I chose 128us).

Why 128us and not 512us as an example ?

128us threshold means that util/load_avg can be computed 8 times more
often and this means up to 16 times more call to div_u64

>
> In order to compare the signals with/without the patch I created a synthetic
> test (20ms runtime, 100ms period) and analyzed the signals and created a report
> on the analysis data/plots both with and without the fix:
> http://www.linuxinternals.org/misc/pelt-error.pdf

The glitch described in page 2 shows a decrease of the util_avg which
is not linked to accuracy of the calculation but due to the use of the
wrong range when computing util_avg.
commit 625ed2bf049d "sched/cfs: Make util/load_avg more stable" fixes
this glitch.
And the lower peak value in page 3 is probably linked to the inaccuracy
I agree that there is an inaccuracy (the max absolute value of 22) but
that's in favor of less overhead. Have you seen wrong behavior because
of this inaccuracy ?

>
> With the patch, the error in the signal is significantly reduced, and is
> non-existent beyond a small negligible amount.
>
> Cc: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Juri Lelli <juri.lelli@xxxxxxx>
> Cc: Brendan Jackman <brendan.jackman@xxxxxxx>
> Cc: Dietmar Eggeman <dietmar.eggemann@xxxxxxx>
> Signed-off-by: Joel Fernandes <joelaf@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4f1825d60937..1347643737f3 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2882,6 +2882,7 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> unsigned long weight, int running, struct cfs_rq *cfs_rq)
> {
> u64 delta;
> + int periods;
>
> delta = now - sa->last_update_time;
> /*
> @@ -2908,9 +2909,12 @@ ___update_load_avg(u64 now, int cpu, struct sched_avg *sa,
> * accrues by two steps:
> *
> * Step 1: accumulate *_sum since last_update_time. If we haven't
> - * crossed period boundaries, finish.
> + * crossed period boundaries and the time since last update is small
> + * enough, we're done.
> */
> - if (!accumulate_sum(delta, cpu, sa, weight, running, cfs_rq))
> + periods = accumulate_sum(delta, cpu, sa, weight, running, cfs_rq);
> +
> + if (!periods && delta < 128)
> return 0;
>
> /*
> --
> 2.14.0.rc1.383.gd1ce394fe2-goog
>