Re: [2.6.24-rc8-mm1][regression?] numactl --interleave=all doesn't works on memoryless node.

From: David Rientjes
Date: Tue Feb 05 2008 - 15:08:34 EST


On Tue, 5 Feb 2008, Paul Jackson wrote:

> Since any of those future patches only add optional modes
> with new flags, while preserving current behaviour if you
> don't use one of the new flags, therefore the current behavior
> has to work as best it can.
>

There's a subtlety to this issue that allows it to be fixed and easily
extended for two upcoming changes:

- Paul Jackson's mempolicy and cpuset interactions change that will
probably allow set_mempolicy() callers to specify with an MPOL_*
flag whether they are referring to "dynamic" or "static" nodemasks[*],
and

- node hotplug (both add and remove) that will change the state of a
node with an identical id.

Paul, with his patch, will need to preserve the "intent" of the mempolicy,
i.e. the nodemask that was passed by the user, and attempt on every
subsequent rebind to accommodate that intent as much as possible.

So at the time of rebind it is quite simple to intersect the set of system
nodes that have memory with the intent of the mempolicy to yield the
effective nodemask. This nodemask is saved in the mempolicy (pol->v.nodes
in this case for interleave), and the interleave then steps only through
the set of nodes that can allow interleaved allocations.
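
To make the intersection concrete, here is a minimal userspace sketch. It
is not kernel code: a 64-bit integer stands in for nodemask_t, and the
toy_* names are made up purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for nodemask_t: bit n set means node n is in the set. */
typedef uint64_t toy_nodemask_t;

/*
 * Hypothetical helper: the nodes the policy can use right now, i.e. the
 * user's intent intersected with the nodes that currently have memory.
 */
static toy_nodemask_t toy_effective_nodes(toy_nodemask_t intent,
                                          toy_nodemask_t nodes_with_memory)
{
    return intent & nodes_with_memory;
}

/*
 * Hypothetical helper: pick the next interleave node after 'prev',
 * wrapping around and visiting only bits set in 'effective'.
 * Pass prev = -1 to start from the lowest node.
 * Returns -1 if the effective mask is empty.
 */
static int toy_next_interleave_node(toy_nodemask_t effective, int prev)
{
    assert(prev >= -1 && prev < 64);
    if (effective == 0)
        return -1;
    for (int i = 1; i <= 64; i++) {
        int nid = (prev + i) % 64;
        if (effective & (UINT64_C(1) << nid))
            return nid;
    }
    return -1; /* not reached: effective is non-zero */
}
```

With an intent of nodes 0-3 (0xF) and node 2 memoryless (0xB has memory),
the effective mask is 0xB and the interleave visits 0, 1, 3, 0, ... without
ever selecting node 2.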

When the set of available nodes changes, either by cpuset change or node
hotplug, the rebind is quite simple when the intent is preserved. So we're
going to need an additional nodemask_t added to struct mempolicy that saves
this intent, and to modify contextualize_policy() to allow it. This will
basically make any set_mempolicy() call succeed even if the application
does not have access to any of the mempolicy nodes, because it is possible
that they will become accessible in the future. In that case the mempolicy
is effectively MPOL_DEFAULT until the desired nodes become available and
it takes effect.
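
Roughly, the behaviour described above could look like the following
userspace sketch. Again this is only an illustration under my own
assumptions: the struct layout, the sketch_* names and the 64-bit mask are
all hypothetical and do not match the real struct mempolicy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for nodemask_t: bit n set means node n is in the set. */
typedef uint64_t sketch_nodemask_t;

enum sketch_mode { SKETCH_DEFAULT, SKETCH_INTERLEAVE };

/* Hypothetical policy that keeps the user's intent next to the live mask. */
struct sketch_mempolicy {
    enum sketch_mode requested; /* mode the caller asked for */
    sketch_nodemask_t intent;   /* nodemask exactly as the user passed it */
    sketch_nodemask_t nodes;    /* effective nodes, recomputed on rebind */
};

/* Recompute the effective nodes; the saved intent is never modified. */
static void sketch_rebind(struct sketch_mempolicy *pol,
                          sketch_nodemask_t allowed_with_memory)
{
    assert(pol != NULL);
    pol->nodes = pol->intent & allowed_with_memory;
}

/*
 * Always succeeds, even if none of the requested nodes is usable now,
 * because they may become accessible later.
 */
static void sketch_set_policy(struct sketch_mempolicy *pol,
                              enum sketch_mode mode,
                              sketch_nodemask_t intent,
                              sketch_nodemask_t allowed_with_memory)
{
    assert(pol != NULL);
    pol->requested = mode;
    pol->intent = intent;
    sketch_rebind(pol, allowed_with_memory);
}

/* With no usable nodes the policy behaves like the default policy. */
static enum sketch_mode sketch_effective_mode(const struct sketch_mempolicy *pol)
{
    if (pol->requested == SKETCH_INTERLEAVE && pol->nodes == 0)
        return SKETCH_DEFAULT;
    return pol->requested;
}
```

A task that asks to interleave over nodes 2-3 while only nodes 0-1 are
available gets default behaviour at first; once node 2 shows up, a rebind
makes the interleave take effect on node 2 without the task having to call
set_mempolicy() again.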

David