Re: [PATCH] sched,numa: update migrate_improves/degrades_locality

From: Peter Zijlstra
Date: Fri May 16 2014 - 09:46:48 EST


On Thu, May 15, 2014 at 01:03:06PM -0400, Rik van Riel wrote:
> Update the migrate_improves/degrades_locality functions with
> knowledge of pseudo-interleaving.
>
> Do not consider moving tasks around within the set of group's active
> nodes as improving or degrading locality. Instead, leave the load
> balancer free to balance the load between a numa_group's active nodes.
>
> Also, switch from the group/task_weight functions to the group/task_fault
> functions. The "weight" functions involve a division, but both calls use
> the same divisor, so there's no point in doing that from these functions.
>
> On a 4 node (x10 core) system, performance of SPECjbb2005 seems
> unaffected, though the number of migrations with 2 8-warehouse wide
> instances seems to have almost halved, due to the scheduler running
> each instance on a single node.
>
> Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
> ---
> kernel/sched/fair.c | 42 +++++++++++++++++++++++++++++-------------
> 1 file changed, 29 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 6504015..4f01e2f1 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4971,6 +4971,7 @@ task_hot(struct task_struct *p, u64 now, struct sched_domain *sd)
> /* Returns true if the destination node has incurred more faults */
> static bool migrate_improves_locality(struct task_struct *p, struct lb_env *env)
> {
> + struct numa_group *numa_group = ACCESS_ONCE(p->numa_group);

That wants to be rcu_dereference() to match the rcu_assign_pointer() we
use to set it.

Same goes for the wake_numa patch.
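
To illustrate the pairing, here is a minimal userspace sketch, not the kernel code itself. It models the writer side (rcu_assign_pointer()) as a C11 release store and the reader side (rcu_dereference()) as an acquire load. The struct layouts and the helpers publish_group(), read_group() and group_active_nodes() are hypothetical stand-ins, and in the kernel the reader would additionally have to run inside an rcu_read_lock() section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct numa_group {
	int active_nodes;
};

struct task {
	_Atomic(struct numa_group *) numa_group;
};

/*
 * Writer side: a release store, so the group's fields are initialised
 * before the pointer becomes visible. This is the role
 * rcu_assign_pointer() plays in the kernel.
 */
static void publish_group(struct task *p, struct numa_group *grp)
{
	atomic_store_explicit(&p->numa_group, grp, memory_order_release);
}

/*
 * Reader side: an ordered load that pairs with the release store above.
 * This is the role rcu_dereference() plays; acquire is used here as a
 * conservative userspace approximation of its dependency ordering.
 */
static struct numa_group *read_group(struct task *p)
{
	return atomic_load_explicit(&p->numa_group, memory_order_acquire);
}

/* Load the pointer once, then only use the local copy. */
static int group_active_nodes(struct task *p)
{
	struct numa_group *grp = read_group(p);

	return grp ? grp->active_nodes : -1;
}
```

The point of the sketch is only that the load and the store form a matched pair; a bare ACCESS_ONCE() read gives a single access but does not document or guarantee that pairing.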

> int src_nid, dst_nid;
>
> if (!sched_feat(NUMA_FAVOUR_HIGHER) || !p->numa_faults_memory ||
