Re: [patch 2/2] cpusets: add interleave_over_allowed option

From: David Rientjes
Date: Thu Oct 25 2007 - 22:11:49 EST


On Thu, 25 Oct 2007, Paul Jackson wrote:

> David - could you describe the real world situation in which you
> are finding that this new 'interleave_over_allowed' option, aka
> 'memory_spread_user', is useful? I'm not always opposed to special
> case solutions; but they do usually require special case needs to
> justify them ;).
>

Yes, when a task with MPOL_INTERLEAVE has its cpuset mems_allowed expanded
to include more memory, the task itself can't interleave over all of that
memory with the memory policy of its choice.

Since the cpuset has changed the task's mems_allowed without its
knowledge, the application would need to run a constant get_mempolicy() and
set_mempolicy() loop to catch these changes. That's obviously not in the
best interest of anyone.

So my change allows tasks that have already expressed the desire to
interleave their memory with MPOL_INTERLEAVE to always use the full range
of available memory, even as it dynamically changes beneath them as a
result of cpusets. Keep in mind that it is still possible to request an
interleave over only a subset of the allowed mems: you simply create the
interleaved mempolicy after the task has been attached to the cpuset.
set_mempolicy() changes are always honored.

The only other way to support such a feature is through a modification to
mempolicies themselves, which Lee has already proposed. The problem with
that approach is that it requires mempolicy support for cpuset-specific
cases and a modification to the set_mempolicy() API. My solution presents
a cpuset fix for a cpuset problem.

> I suspect that the general case solution would require having the user
> pass in two nodemasks, call them ALL and SUBSET, requesting that
> relative to the ALL nodes, interleave be done on the SUBSET nodes.
> That way, even if say the task happened to be running in a cpuset with
> a -single- allowed memory node at the moment, it could express its user
> memory interleave memory needs for the general case of any number of
> nodes. Then for whatever nodes were currently allowed by the cpuset
> to that task at any point, the nodes_remap() logic could be done to
> derive from the ALL and SUBSET masks, and the current allowed mask,
> what nodes to interleave that tasks user allocations over.
>

I find it hard to believe that a single cpuset with a single
memory_spread_user boolean is going to include multiple tasks that request
interleaved mempolicies over differing nodes within the cpuset's
mems_allowed. That, to me, is the special case.

David