Re: [PATCH -mm] do_migrate_pages() calls migrate_to_node() even if task is already on a correct node

From: Christoph Lameter
Date: Thu Mar 22 2012 - 14:51:49 EST


On Thu, 22 Mar 2012, KOSAKI Motohiro wrote:

> CC to Christoph.
>
> > While moving tasks between cpusets I noticed some strange behavior.
> > Specifically, if the nodes of the destination cpuset are a subset of the
> > nodes of the source cpuset, do_migrate_pages() will move pages that are
> > already on a node in the destination cpuset. The reason is that
> > do_migrate_pages() does not check whether each node in the source nodemask
> > is in the destination nodemask before calling migrate_to_node(). If we
> > simply do this check and skip the nodes that are in both the source and
> > the destination, we won't move pages that don't need to be moved.
> >
> > Adding a little debug printk to migrate_to_node():
> >
> > Without this change, when migrating tasks from a cpuset containing nodes
> > 0-7 to a cpuset containing nodes 3-4, we migrate from ALL the nodes even
> > if they are in both the source and destination nodesets:
> >
> > Migrating 7 to 4
> > Migrating 6 to 3
> > Migrating 5 to 4
> > Migrating 4 to 3
> > Migrating 1 to 4
> > Migrating 3 to 4
> > Migrating 0 to 3
> > Migrating 2 to 3
>
> Wait.
>
> This may be non-optimal for cpusets, but it may be optimal for
> migrate_pages, especially when the use case is HPC. I guess this is
> intended behavior. I think we need to hear Christoph's intention.
>
> But, I'm not against this if he has no objection.

The use case for this is if you have an app running on nodes 3,4,5 on your
machine and now you want to shift it to 4,5,6. The expectation is that the
location of the pages relative to the first node stays the same.
Applications may manage their locality given a range of nodes, and each of
the x .. x+n nodes has its particular purpose.

If you just copy 3 to 6 then the app may get confused when doing
additional allocations, since different types of information are now stored
on the "first" node (which is now 4).
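
To make the "relative to the first node" idea concrete, here is a small
userspace sketch. It is not the kernel code: the helper names
(mask_weight, mask_node_to_ord, mask_ord_to_node, relative_target) are
made up for illustration, and node sets are assumed to fit in a single
unsigned long. Each source node is mapped to the node with the same rank
in the destination set, wrapping around when the destination is smaller.

```c
#include <assert.h>

#define MASK_BITS ((int)(8 * sizeof(unsigned long)))

/* Hypothetical helper: number of nodes (set bits) in a node mask. */
static int mask_weight(unsigned long mask)
{
	int count = 0;

	while (mask) {
		mask &= mask - 1;	/* clear the lowest set bit */
		count++;
	}
	return count;
}

/* Hypothetical helper: 0-based rank of node within mask, -1 if absent. */
static int mask_node_to_ord(unsigned long mask, int node)
{
	assert(node >= 0 && node < MASK_BITS);
	if (!(mask & (1UL << node)))
		return -1;
	/* Count the member nodes that sit below this one. */
	return mask_weight(mask & ((1UL << node) - 1));
}

/* Hypothetical helper: node holding the ord-th (0-based) set bit, or -1. */
static int mask_ord_to_node(unsigned long mask, int ord)
{
	int node;

	for (node = 0; node < MASK_BITS; node++) {
		if (!(mask & (1UL << node)))
			continue;
		if (ord == 0)
			return node;
		ord--;
	}
	return -1;
}

/*
 * Map src to the node with the same rank in "to" that src has in "from".
 * Nodes outside "from", or an empty "to", are left where they are.
 */
static int relative_target(unsigned long from, unsigned long to, int src)
{
	int ord = mask_node_to_ord(from, src);
	int weight = mask_weight(to);

	if (ord < 0 || weight == 0)
		return src;
	return mask_ord_to_node(to, ord % weight);
}
```

With from = 0x38 (nodes 3,4,5) and to = 0x70 (nodes 4,5,6) this gives
3 -> 4, 4 -> 5 and 5 -> 6, so the "first" node stays the first node. With
from = 0xff (nodes 0-7) and to = 0x18 (nodes 3-4) the same rule yields the
pairs in the quoted debug output, e.g. 0 -> 3, 3 -> 4 and 4 -> 3.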


