Re: [PATCH 6/9] numa,sched: normalize faults_cpu stats and weigh by CPU use

From: Mel Gorman
Date: Tue Jan 28 2014 - 05:01:42 EST

Next message: Michal Simek: "[GIT PULL] arch/microblaze changes for 3.14"
Previous message: Mel Gorman: "Re: [PATCH 5/9] numa,sched,mm: use active_nodes nodemask to limit numa migrations"
In reply to: riel: "[PATCH 6/9] numa,sched: normalize faults_cpu stats and weigh by CPU use"
Next in thread: tip-bot for Rik van Riel: "[tip:sched/numa] sched/numa: Normalize faults_cpu stats and weigh by CPU use"

On Mon, Jan 27, 2014 at 05:03:45PM -0500, riel@xxxxxxxxxx wrote:
> From: Rik van Riel <riel@xxxxxxxxxx>
>
> Tracing the code that decides the active nodes has made it abundantly clear
> that the naive implementation of the faults_from code has issues.
>
> Specifically, the garbage collector in some workloads will access orders
> of magnitude more memory than the threads that do all the active work.
> This resulted in the node with the garbage collector being marked the only
> active node in the group.
>
> This issue is avoided if we weigh the statistics by CPU use of each task in
> the numa group, instead of by how many faults each thread has incurred.
>
> To achieve this, we normalize the number of faults to the fraction of faults
> that occurred on each node, and then multiply that fraction by the fraction
> of CPU time the task has used since the last time task_numa_placement was
> invoked.
>
> This way the nodes in the active node mask will be the ones where the tasks
> from the numa group are most actively running, and the influence of e.g. the
> garbage collector and other do-little threads is properly minimized.
>
> On a 4 node system, using CPU use statistics calculated over a longer interval
> results in about 1% fewer page migrations with two 32-warehouse specjbb runs,
> and about 5% fewer page migrations, as well as 1% better throughput, with two
> 8-warehouse specjbb runs, as compared with the shorter term statistics kept by
> the scheduler.
>
> Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Cc: Mel Gorman <mgorman@xxxxxxx>
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: Chegu Vinod <chegu_vinod@xxxxxx>
> Signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
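
For readers following along, the weighting described in the changelog above
can be sketched in userspace C roughly as below. This is an illustration
only, not the code from the patch: the names weigh_faults, NR_NODES and
WEIGHT_SHIFT, and the choice of a 16-bit fixed-point scale, are assumptions
made for the example. It normalizes a task's per-node faults to a fraction
of its total faults, then scales that fraction by the share of CPU time the
task used in the last interval.

```c
#include <stdint.h>

#define NR_NODES     4   /* hypothetical node count for the example */
#define WEIGHT_SHIFT 16  /* hypothetical fixed-point scale: 1.0 == 1 << 16 */

/*
 * Weigh one task's per-node fault counts by its share of CPU time.
 *
 * faults[]  faults the task incurred on each node during the interval
 * runtime   CPU time the task used during the interval
 * period    wall-clock length of the interval (same unit as runtime)
 * out[]     resulting per-node weights, in fixed point
 */
static void weigh_faults(const uint64_t faults[NR_NODES],
                         uint64_t runtime, uint64_t period,
                         uint64_t out[NR_NODES])
{
    uint64_t total = 0;

    for (int i = 0; i < NR_NODES; i++)
        total += faults[i];

    /* Fraction of CPU time used; the +1 avoids a division by zero. */
    uint64_t cpu_frac = (runtime << WEIGHT_SHIFT) / (period + 1);

    /*
     * cpu_frac is small when runtime <= period (at most 1 << 16), so
     * multiplying by a fault count before dividing keeps precision
     * without overflowing 64 bits for realistic fault counts.
     */
    for (int i = 0; i < NR_NODES; i++)
        out[i] = cpu_frac * faults[i] / (total + 1);
}
```

With this scheme a thread that faults heavily but barely runs (the garbage
collector case) ends up with a much smaller weight than a worker thread
that faults less but uses most of its CPU time.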

The major changes relative to v4 are that the weight calculations now avoid
overflow and that the average runtime is calculated over a longer interval.
Both seem sane, so

Acked-by: Mel Gorman <mgorman@suse>

--
Mel Gorman
SUSE Labs
