[PATCH 41/63] sched: numa: fix placement of workloads spread across multiple nodes

From: Mel Gorman
Date: Fri Sep 27 2013 - 09:37:08 EST


From: Rik van Riel <riel@xxxxxxxxxx>

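The load balancer spreads workloads across multiple NUMA nodes in
order to balance the load on the system. As a result, a task's
preferred node can have available capacity and yet moving the task
there will not succeed, because that would create too large an
imbalance.

In that case, other NUMA nodes need to be considered. Until now,
task_numa_migrate() looked at other nodes only when the preferred node
had no capacity. Look at them whenever no CPU was found on the
preferred node (env.best_cpu == -1).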

Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
Signed-off-by: Mel Gorman <mgorman@xxxxxxx>
---
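The control flow described in the changelog can be sketched in user space
as follows. This is only an illustration under stated assumptions, not the
kernel code: struct node_info, find_cpu() and pick_cpu() are hypothetical
names, the can_place flag stands in for the imbalance check done by
task_numa_find_cpu(), and ranking fallback nodes by fault count is an
assumption, since the hunk below does not show the loop body.

```c
#include <stdio.h>

/* Hypothetical per-node summary; NOT the kernel's struct numa_stats. */
struct node_info {
	int has_capacity;	/* node has spare capacity */
	int can_place;		/* assumed: move keeps the imbalance acceptable */
	long faults;		/* assumed: task's NUMA faults on this node */
	int cpu;		/* a candidate CPU on this node */
};

/* Stand-in for task_numa_find_cpu(): returns a CPU, or -1 on failure. */
static int find_cpu(const struct node_info *n)
{
	return n->can_place ? n->cpu : -1;
}

/* Pick a destination CPU for a task running on node src. */
static int pick_cpu(const struct node_info *nodes, int nr_nodes,
		    int src, int preferred)
{
	int best_cpu = -1;
	long best_imp = 0;
	int nid;

	/* If the preferred node has capacity, try to use it. */
	if (nodes[preferred].has_capacity)
		best_cpu = find_cpu(&nodes[preferred]);

	/* Nothing found on the preferred node. Look elsewhere. */
	if (best_cpu == -1) {
		for (nid = 0; nid < nr_nodes; nid++) {
			long imp;
			int cpu;

			if (nid == src || nid == preferred)
				continue;

			/* Assumed scoring: skip nodes that do not beat the best so far. */
			imp = nodes[nid].faults - nodes[src].faults;
			if (imp <= best_imp)
				continue;

			cpu = find_cpu(&nodes[nid]);
			if (cpu != -1) {
				best_cpu = cpu;
				best_imp = imp;
			}
		}
	}

	return best_cpu;
}
```

With the old if/else structure, a preferred node that reports capacity but
rejects the task would leave best_cpu at -1 and the other nodes would never
be examined. In this sketch the second test is made on the result of the
first attempt, so the search continues in that case.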
kernel/sched/fair.c | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 99b6711..8ebed0a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1104,13 +1104,12 @@ static int task_numa_migrate(struct task_struct *p)
 	imp = task_faults(env.p, env.dst_nid) - faults;
 	update_numa_stats(&env.dst_stats, env.dst_nid);
 
-	/*
-	 * If the preferred nid has capacity then use it. Otherwise find an
-	 * alternative node with relatively better statistics.
-	 */
-	if (env.dst_stats.has_capacity) {
+	/* If the preferred nid has capacity, try to use it. */
+	if (env.dst_stats.has_capacity)
 		task_numa_find_cpu(&env, imp);
-	} else {
+
+	/* No space available on the preferred nid. Look elsewhere. */
+	if (env.best_cpu == -1) {
 		for_each_online_node(nid) {
 			if (nid == env.src_nid || nid == p->numa_preferred_nid)
 				continue;
--
1.8.1.4
