Re: NUMA, migrate/N, and tuned-adm

From: Rik van Riel
Date: Tue Dec 17 2013 - 15:41:50 EST


On 12/17/2013 01:10 PM, David Timothy Strauss wrote:

> System specs:
> * Fedora 19 with the 3.11.10-200.fc19.x86_64 kernel (just the stock RPM)
> * Bare-metal servers with 128GB RAM split between two NUMA regions,
> each region with one hex-core processor
> * More than 700 processes, a couple hundred of which are active
> fairly frequently. The systems were at 7000 processes, but we've
> reduced the count while we investigate this issue.
> * Many of the processes are short-lived. The long-lived ones
> experience spikes in CPU and memory usage while processing requests.
>
> Here's what we've tried, to no avail:
> * tuned-adm with the latency-performance and virtual-host profiles; this
> places the system on the deadline I/O scheduler, but the problem also
> occurred on the default one
> * kernel.sched_migration_cost_ns=5000000 (which tuned will do for
> those profiles in v3.3/Fedora 20)
> * numad to balance between regions
> * Global use of sched_relax_domain_level=1 and sched_relax_domain_level=2
> * Splitting the system with cpuset into management tasks (6 virtual
> cores) and workload tasks (18 virtual cores) with
> sched_relax_domain_level=2. This is based on recommendations for NUMA
> systems in the cpuset man page.

Just for a quick sanity check, can you try disabling the
automatic numa balancing code?

# echo NO_NUMA > /sys/kernel/debug/sched_features
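
To make the check reversible, something like the sketch below may help.
It assumes debugfs is mounted at /sys/kernel/debug and that you run it as
root. The function names and the SCHED_FEATURES variable are made up for
illustration; only the sched_features file and the NUMA / NO_NUMA tokens
come from the command above.

```shell
# Assumption: debugfs is mounted at /sys/kernel/debug and we are root.
# SCHED_FEATURES is a hypothetical override so the path can be changed.
SCHED_FEATURES="${SCHED_FEATURES:-/sys/kernel/debug/sched_features}"

# Print the current state of the feature: "NUMA" when automatic NUMA
# balancing is enabled, "NO_NUMA" when it is disabled. The file is a
# space-separated list of feature names, disabled ones prefixed "NO_".
numa_feature_state() {
    tr ' ' '\n' < "$SCHED_FEATURES" | grep -E '^(NO_)?NUMA$'
}

# Write a feature token to the file: "NO_NUMA" disables the feature,
# "NUMA" turns it back on.
set_numa_feature() {
    echo "$1" > "$SCHED_FEATURES"
}
```

Calling numa_feature_state before and after "set_numa_feature NO_NUMA"
confirms that the write took effect, and "set_numa_feature NUMA" restores
the previous behaviour once the test is done.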

> Here's what we've used for analysis:
> * powertop
> * top/htop
> * perf record -a -g

Does "perf report -g" show where the calls to the
migration code are coming from? Something must be
migrating tasks around, and it will be good to know
what it is...
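
If the full report is too long to read, a rough filter along these lines
might narrow it down. It is only a sketch: migration_chains is a
hypothetical helper, and matching on the substring "migrat" is a guess at
what the relevant symbols will contain, not a list of actual function
names.

```shell
# Hypothetical helper: keep only the parts of a text report that mention
# migration. Reads "perf report -g --stdio" output on stdin. The optional
# argument is the number of context lines kept around each match, so the
# neighbouring call-chain frames stay visible (default: 8).
migration_chains() {
    context="${1:-8}"
    grep -i -C "$context" 'migrat'
}

# Example use, assuming perf.data from "perf record -a -g" is in the
# current directory:
#   perf report -g --stdio | migration_chains 8
```

Raising the context count shows more of each call chain, at the cost of
more unrelated output.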

--
All rights reversed