Re: [PATCH RFC] sched,numa: decay wakee_flips instead of zeroing

From: Peter Zijlstra
Date: Fri May 16 2014 - 09:22:27 EST


On Fri, May 16, 2014 at 12:13:32AM -0400, Rik van Riel wrote:
> Affine wakeups have the potential to interfere with NUMA placement.
> If a task wakes up too many other tasks, affine wakeups will get
> disabled.
>
> However, regardless of how many other tasks it wakes up, it gets
> re-enabled once a second, potentially interfering with NUMA
> placement of other tasks.
>
> By halving wakee_flips instead of zeroing it, we can avoid
> that problem for some workloads.

See https://lkml.org/lkml/2013/7/2/110 and the messages that follow.

> Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 4f01e2f1..0381b11 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4009,7 +4009,7 @@ static void record_wakee(struct task_struct *p)
>  	 * about the loss.
>  	 */
>  	if (jiffies > current->wakee_flip_decay_ts + HZ) {
> -		current->wakee_flips = 0;
> +		current->wakee_flips >>= 1;
>  		current->wakee_flip_decay_ts = jiffies;
>  	}

Would it make sense to do something like:

	now = jiffies;
	while (current->wakee_flips && now > current->wakee_flip_decay_ts + HZ) {
		current->wakee_flips >>= 1;
		current->wakee_flip_decay_ts += HZ;
	}
	if (unlikely(now > current->wakee_flip_decay_ts + HZ))
		current->wakee_flip_decay_ts = now;

Or is that over engineering things?
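
To make the difference concrete, here is a rough user-space model of the
two variants: the single halving from the patch and the catch-up loop
above. It is only a sketch. struct task_model, decay_once(),
decay_catch_up() and the HZ value of 1000 are made up for illustration
and are not kernel code; jiffies wraparound is ignored.

```c
/* Assumed tick rate for this model; the real HZ is a build-time setting. */
#define HZ 1000UL

/* Hypothetical stand-in for the two task_struct fields discussed here. */
struct task_model {
	unsigned long wakee_flips;
	unsigned long wakee_flip_decay_ts;
};

/* Single halving, as in the posted patch: at most one shift per call. */
static void decay_once(struct task_model *t, unsigned long now)
{
	if (now > t->wakee_flip_decay_ts + HZ) {
		t->wakee_flips >>= 1;
		t->wakee_flip_decay_ts = now;
	}
}

/*
 * Catch-up halving: one shift per elapsed HZ period, stopping early
 * once the counter reaches zero.
 *
 * Plain '>' comparisons do not handle jiffies wraparound; kernel code
 * would normally use time_after() for that.
 */
static void decay_catch_up(struct task_model *t, unsigned long now)
{
	while (t->wakee_flips && now > t->wakee_flip_decay_ts + HZ) {
		t->wakee_flips >>= 1;
		t->wakee_flip_decay_ts += HZ;
	}
	/* Counter hit zero with periods still pending: resync the stamp. */
	if (now > t->wakee_flip_decay_ts + HZ)
		t->wakee_flip_decay_ts = now;
}
```

Starting from wakee_flips = 64 with a stamp of 0 and now = 3 * HZ + 500,
decay_once() leaves 32 and moves the stamp to now, while
decay_catch_up() leaves 8 and moves the stamp to 3 * HZ. So a task that
has not woken anything for several seconds loses correspondingly more
of its flip history with the loop.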

