Re: [PATCH -mm] do_migrate_pages() calls migrate_to_node() even if task is already on a correct node

From: Larry Woodman
Date: Thu Mar 22 2012 - 15:07:13 EST


On 03/22/2012 02:51 PM, Christoph Lameter wrote:
On Thu, 22 Mar 2012, KOSAKI Motohiro wrote:

CC to Christoph.

While moving tasks between cpusets I noticed some strange behavior.
Specifically, if the nodes of the destination cpuset are a subset of the
nodes of the source cpuset, do_migrate_pages() will move pages that are
already on a node in the destination cpuset. The reason is that
do_migrate_pages() does not check whether each node in the source
nodemask is in the destination nodemask before calling migrate_to_node().
If we simply do this check and skip source nodes that are already in the
destination, we won't move pages that don't need to be moved.
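
As a rough illustration of the proposed check, here is a minimal userspace
sketch. It is not the kernel code: the `nodemask` typedef and both helper
names are hypothetical stand-ins for nodemask_t and the kernel's nodemask
helpers, and the model only answers "which source nodes would still be
migrated from if nodes already in the destination were skipped?".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: bit n set means node n is present. */
typedef uint64_t nodemask;

/* True if pages on `node` would still be migrated under the proposed check:
 * the node is in the source mask and is NOT already in the destination. */
static bool should_migrate_from(int node, nodemask from, nodemask to)
{
    if (node < 0 || node >= 64)
        return false;
    return ((from >> node) & 1) && !((to >> node) & 1);
}

/* Mask of all source nodes that would still be migrated from. */
static nodemask nodes_needing_migration(nodemask from, nodemask to)
{
    return from & ~to;
}
```

For a source of nodes 0-7 (0xFF) and a destination of nodes 3-4 (0x18),
this model leaves nodes 0, 1, 2, 5, 6 and 7 to migrate and skips 3 and 4.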

Adding a little debug printk to migrate_to_node():

Without this change, when migrating tasks from a cpuset containing nodes
0-7 to a cpuset containing nodes 3-4, we migrate from ALL the nodes, even
those that are in both the source and destination nodesets:

Migrating 7 to 4
Migrating 6 to 3
Migrating 5 to 4
Migrating 4 to 3
Migrating 1 to 4
Migrating 3 to 4
Migrating 0 to 3
Migrating 2 to 3
Wait.

This may be non-optimal for cpusets, but it may be optimal for
migrate_pages, especially when the use case is HPC. I guess this is
intended behavior. I think we need to hear Christoph's intention.

But, I'm not against this if he has no objection.
The use case for this is if you have an app running on nodes 3,4,5 on your
machine and now you want to shift it to 4,5,6. The expectation is that the
location of the pages relative to the first node stays the same.
Applications may manage their locality given a range of nodes, and each of
the x .. x+n nodes has its particular purpose.
So to be clear on this, in that case the intention would be to move 3 to 4,
4 to 5 and 5 to 6 to keep the node ordering the same?

Larry
If you just copy 3 to 6 then the app may get confused when doing
additional allocations since different types of information are now stored
on the "first" node (which is now 4).
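
The order-preserving shift discussed above (3,4,5 to 4,5,6) can be sketched
as an ordinal remap: the n-th node of the source set maps to the n-th node
of the destination set, wrapping around when the destination is smaller.
This is a userspace sketch only; the `nodemask` typedef and the function
names are hypothetical, and it is assumed rather than confirmed here that
the kernel's remapping follows the same rule. The model does reproduce the
"Migrating X to Y" pairs in the debug output quoted earlier.

```c
#include <stdint.h>

/* Hypothetical stand-in for nodemask_t: bit n set means node n is present. */
typedef uint64_t nodemask;

/* Number of nodes (set bits) in the mask. */
static int mask_weight(nodemask mask)
{
    int w = 0;
    for (; mask; mask &= mask - 1)
        w++;
    return w;
}

/* Map `node` to the destination node with the same ordinal position.
 * Nodes outside `from`, or an empty `to`, are returned unchanged. */
static int remap_node(int node, nodemask from, nodemask to)
{
    int w = mask_weight(to);

    if (node < 0 || node >= 64 || !((from >> node) & 1) || w == 0)
        return node;

    /* Ordinal of `node` within `from`: count the set bits below it. */
    int target = mask_weight(from & (((nodemask)1 << node) - 1)) % w;

    /* Walk `to` and return the position of its target-th set bit. */
    for (int pos = 0; pos < 64; pos++) {
        if ((to >> pos) & 1) {
            if (target == 0)
                return pos;
            target--;
        }
    }
    return node; /* not reached when w > 0 */
}
```

With from = {3,4,5} (0x38) and to = {4,5,6} (0x70) this gives 3 to 4,
4 to 5 and 5 to 6, so the "first" node keeps holding the same kind of data.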



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@xxxxxxxxxx For more info on Linux MM,
see: http://www.linux-mm.org/ .
