[PATCH 0/7] sched,numa: improve NUMA convergence times

From: riel
Date: Mon Jun 23 2014 - 11:43:24 EST

Running benchmarks like the one below exposed a number of situations
in which the current NUMA code has extremely slow task convergence,
and even some situations in which tasks do not converge at all.

###
# 160 tasks will execute (on 4 nodes, 80 CPUs):
# -1x 0MB global shared mem operations
# -1x 1000MB process shared mem operations
# -1x 0MB thread local mem operations
###

###
#
# 0.0% [0.2 mins] 0/0 1/1 36/2 0/0 [36/3 ] l: 0-0 ( 0) {0-2}
# 0.0% [0.3 mins] 43/3 37/2 39/2 41/3 [ 6/10] l: 0-1 ( 1) {1-2}
# 0.0% [0.4 mins] 42/3 38/2 40/2 40/2 [ 4/9 ] l: 1-2 ( 1) [50.0%] {1-2}
# 0.0% [0.6 mins] 41/3 39/2 40/2 40/2 [ 2/9 ] l: 2-4 ( 2) [50.0%] {1-2}
# 0.0% [0.7 mins] 40/2 40/2 40/2 40/2 [ 0/8 ] l: 3-5 ( 2) [40.0%] ( 41.8s converged)

In this example, convergence requires that a task be moved from node
0 to node 1. Before this patch series, the load balancer would have
to perform that task move, because the NUMA code would only consider
a task swap when all the CPUs on a target node are busy...
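
To make the imbalance in the quoted output easier to see, here is a
minimal Python sketch that reads one row of it and reports which nodes
sit above or below an even split. It assumes that the first number of
each per-node "a/b" pair is the task count on that node; that reading
is an assumption, though it is consistent with the 160-task total in
the header. The helper names per_node_tasks and imbalance are
hypothetical and are not part of perf or of the patches.

```python
import re

# One row of the benchmark output quoted above, copied verbatim.
ROW = "# 0.0% [0.6 mins] 41/3 39/2 40/2 40/2 [ 2/9 ] l: 2-4 ( 2) [50.0%] {1-2}"


def per_node_tasks(row):
    """Return the first number of each per-node 'a/b' pair, in node order.

    Assumption: that first number is the task count on the node.
    """
    # Cut the row at the bracketed summary pair (e.g. "[ 2/9 ]") so that
    # only the per-node pairs are left in front of it.
    head = re.split(r"\[\s*\d+/\d+\s*\]", row)[0]
    return [int(a) for a, _ in re.findall(r"(\d+)/(\d+)", head)]


def imbalance(tasks):
    """Map node index -> surplus (+) or deficit (-) versus an even split."""
    target = sum(tasks) // len(tasks)
    return {node: t - target for node, t in enumerate(tasks) if t != target}


tasks = per_node_tasks(ROW)
print(tasks, imbalance(tasks))
```

For the 0.6 minute row this reports one task too many on node 0 and
one too few on node 1, which is the single move described above. In
the quoted rows, the spread max(tasks) - min(tasks) also equals the
first number of the bracketed pair, and it reaches 0 on the row
marked as converged.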

Several related issues have been fixed as well, and task convergence
times are now much lower across a range of process and thread counts
in "perf bench numa mem -m -0 -P 1000 -p X -t Y" runs.

Before the patch series, convergence sometimes did not happen at all,
or randomly got delayed by many minutes.

With the patch series, convergence generally happens in 10-20 seconds,
with a few spikes up to 30-40 seconds, and very rare instances where
things take a few minutes.
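
For anyone who wants to repeat the "perf bench numa mem" runs over
several process and thread counts, a minimal Python sketch that only
builds the command lines follows. The flags are exactly the ones
quoted in this message; the (X, Y) pairs are hypothetical examples
and are not the combinations that were actually measured. The helper
name numa_bench_cmd is likewise made up for illustration.

```python
import shlex


def numa_bench_cmd(processes, threads):
    """Build the argv for one run, using only the flags quoted above."""
    return [
        "perf", "bench", "numa", "mem",
        "-m", "-0", "-P", "1000",
        "-p", str(processes),
        "-t", str(threads),
    ]


# Hypothetical (X, Y) pairs; pick values that suit the machine under test.
PAIRS = [(1, 160), (4, 40), (160, 1)]

for x, y in PAIRS:
    # Only print the command; pass the list to subprocess.run() to execute it.
    print(shlex.join(numa_bench_cmd(x, y)))
```

Printing rather than executing keeps the sketch harmless on machines
without perf installed, and makes it easy to check the command lines
before starting a long run.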
