Re: [PATCH] sched: avg_overlap decay

From: Mike Galbraith
Date: Wed Mar 11 2009 - 00:09:49 EST


On Tue, 2009-03-10 at 19:18 +0100, Peter Zijlstra wrote:
> Mike, are you good with this patch as it stands?

Yes, works for me.

-Mike

> ---
> Subject: sched: avg_overlap decay
> From: Mike Galbraith <efault@xxxxxx>
> Date: Tue Mar 10 19:08:11 CET 2009
>
> avg_overlap is used to measure the runtime overlap of the waker and wakee.
>
> However, when a process changes behaviour, e.g. a pipe becomes un-congested
> and we don't need to go to sleep after a wakeup for a while, the avg_overlap
> value grows stale.
>
> When running, we use the average runtime between preemptions as a measure for
> avg_overlap, since the amount of runtime can be correlated to cache footprint.
>
> The longer we run, the less likely it is that we want to be migrated to
> another CPU.
>
> Signed-off-by: Mike Galbraith <efault@xxxxxx>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> ---
> kernel/sched.c | 24 +++++++++++++++++++++++-
> 1 file changed, 23 insertions(+), 1 deletion(-)
>
> Index: linux-2.6/kernel/sched.c
> ===================================================================
> --- linux-2.6.orig/kernel/sched.c
> +++ linux-2.6/kernel/sched.c
> @@ -4692,6 +4692,28 @@ static inline void schedule_debug(struct
> #endif
> }
>
> +static void put_prev_task(struct rq *rq, struct task_struct *prev)
> +{
> +	if (prev->state == TASK_RUNNING) {
> +		u64 runtime = prev->se.sum_exec_runtime;
> +
> +		runtime -= prev->se.prev_sum_exec_runtime;
> +		runtime = min_t(u64, runtime, 2*sysctl_sched_migration_cost);
> +
> +		/*
> +		 * In order to avoid avg_overlap growing stale when we are
> +		 * indeed overlapping and hence not getting put to sleep, grow
> +		 * the avg_overlap on preemption.
> +		 *
> +		 * We use the average preemption runtime because that
> +		 * correlates to the amount of cache footprint a task can
> +		 * build up.
> +		 */
> +		update_avg(&prev->se.avg_overlap, runtime);
> +	}
> +	prev->sched_class->put_prev_task(rq, prev);
> +}
> +
> /*
> * Pick up the highest-prio task:
> */
> @@ -4768,7 +4790,7 @@ need_resched_nonpreemptible:
> 	if (unlikely(!rq->nr_running))
> 		idle_balance(cpu, rq);
>
> -	prev->sched_class->put_prev_task(rq, prev);
> +	put_prev_task(rq, prev);
> 	next = pick_next_task(rq);
>
> 	if (likely(prev != next)) {
>
