[PATCH v10 05/11] sched: make scale_rt invariant with frequency

From: Vincent Guittot
Date: Fri Feb 27 2015 - 10:55:22 EST


The average running time of RT tasks is used to estimate the remaining compute
capacity for CFS tasks. This remaining capacity is the original capacity scaled
down by a factor (aka scale_rt_capacity). This estimation of available capacity
must also be invariant with frequency scaling.

A frequency scaling factor is applied on the running time of the RT tasks for
computing scale_rt_capacity.

In sched_rt_avg_update, we now scale the RT execution time like below:
rq->rt_avg += rt_delta * arch_scale_freq_capacity() >> SCHED_CAPACITY_SHIFT

Then, scale_rt_capacity can be summarized by:
scale_rt_capacity = SCHED_CAPACITY_SCALE * available / total
with available = total - rq->rt_avg

This has been been optimized in current code by
scale_rt_capacity = available / (total >> SCHED_CAPACITY_SHIFT)

But we can also developed the equation like below
scale_rt_capacity = SCHED_CAPACITY_SCALE -
((rq->rt_avg << SCHED_CAPACITY_SHIFT) / total)

and we can optimize the equation by removing SCHED_CAPACITY_SHIFT shift in
the computation of rq->rt_avg and scale_rt_capacity

so rq->rt_avg += rt_delta * arch_scale_freq_capacity()
and
scale_rt_capacity = SCHED_CAPACITY_SCALE - (rq->rt_avg / total)

arch_scale_frequency_capacity will be called in the hot path of the scheduler
which implies to have a short and efficient function.
As an example, arch_scale_frequency_capacity should return a cached value that
is updated periodically outside of the hot path.

Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
Acked-by: Morten Rasmussen <morten.rasmussen@xxxxxxx>
---
kernel/sched/fair.c | 17 +++++------------
kernel/sched/sched.h | 4 +++-
2 files changed, 8 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7f031e4..dc7c693 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6004,7 +6004,7 @@ unsigned long __weak arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
static unsigned long scale_rt_capacity(int cpu)
{
struct rq *rq = cpu_rq(cpu);
- u64 total, available, age_stamp, avg;
+ u64 total, used, age_stamp, avg;
s64 delta;

/*
@@ -6020,19 +6020,12 @@ static unsigned long scale_rt_capacity(int cpu)

total = sched_avg_period() + delta;

- if (unlikely(total < avg)) {
- /* Ensures that capacity won't end up being negative */
- available = 0;
- } else {
- available = total - avg;
- }
+ used = div_u64(avg, total);

- if (unlikely((s64)total < SCHED_CAPACITY_SCALE))
- total = SCHED_CAPACITY_SCALE;
+ if (likely(used < SCHED_CAPACITY_SCALE))
+ return SCHED_CAPACITY_SCALE - used;

- total >>= SCHED_CAPACITY_SHIFT;
-
- return div_u64(available, total);
+ return 1;
}

static void update_cpu_capacity(struct sched_domain *sd, int cpu)
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 65fa7b5..23c6dd7 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1374,9 +1374,11 @@ static inline int hrtick_enabled(struct rq *rq)

#ifdef CONFIG_SMP
extern void sched_avg_update(struct rq *rq);
+extern unsigned long arch_scale_freq_capacity(struct sched_domain *sd, int cpu);
+
static inline void sched_rt_avg_update(struct rq *rq, u64 rt_delta)
{
- rq->rt_avg += rt_delta;
+ rq->rt_avg += rt_delta * arch_scale_freq_capacity(NULL, cpu_of(rq));
sched_avg_update(rq);
}
#else
--
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/