Re: [patch 2/2] cpusets: add interleave_over_allowed option

From: Christoph Lameter
Date: Thu Oct 25 2007 - 20:28:20 EST


On Thu, 25 Oct 2007, David Rientjes wrote:

> The problem occurs when you add cpusets into the mix and permit the
> allowed nodes to change without knowledge to the application. Right now,
> a simple remap is done so if the cardinality of the set of nodes
> decreases, you're interleaving over a smaller number of nodes. If the
> cardinality increases, your interleaved nodemask isn't expanded. That's
> the problem that we're facing. The remap itself is troublesome because it
> doesn't take into account the user's desire for a custom nodemask to be
> used anyway; it could remap an interleaved policy over several nodes that
> will already be contended with one another.

Right. So I think we are fine if the application cannot set up boundaries
for interleave.
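
To make the remap behaviour described above concrete, here is a rough
userspace sketch. It is not the kernel code: the 64-bit nodemask type and
the name remap_nodes() are made up for illustration, and it only assumes
the relative, ordinal-based mapping discussed in this thread (the n-th
node of the old allowed set maps to the n-th node of the new set, wrapping
around when the new set is smaller).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-node mask for illustration; not the kernel's nodemask_t. */
typedef uint64_t nodemask;

/* Count the set bits (nodes) in a mask. */
static int weight(nodemask m)
{
    int n = 0;

    for (; m; m &= m - 1)
        n++;
    return n;
}

/* Number of set bits in m strictly below bit position pos. */
static int ordinal_of(nodemask m, int pos)
{
    assert(pos >= 0 && pos < 64);
    return weight(m & ((UINT64_C(1) << pos) - 1));
}

/* Position of the ord-th (0-based) set bit in m, or -1 if there is none. */
static int nth_set_bit(nodemask m, int ord)
{
    for (int pos = 0; pos < 64; pos++) {
        if ((m & (UINT64_C(1) << pos)) && ord-- == 0)
            return pos;
    }
    return -1;
}

/*
 * Remap a policy nodemask when the allowed set changes from old_allowed
 * to new_allowed. Nodes outside old_allowed are left where they are.
 */
static nodemask remap_nodes(nodemask policy, nodemask old_allowed,
                            nodemask new_allowed)
{
    int w = weight(new_allowed);
    nodemask out = 0;

    if (w == 0)
        return policy;          /* nothing to map onto */

    for (int pos = 0; pos < 64; pos++) {
        nodemask bit = UINT64_C(1) << pos;

        if (!(policy & bit))
            continue;
        if (!(old_allowed & bit)) {
            out |= bit;         /* not part of the old set: keep as is */
            continue;
        }
        /* Wrap the ordinal so a smaller new set folds nodes together. */
        int ord = ordinal_of(old_allowed, pos) % w;
        out |= UINT64_C(1) << nth_set_bit(new_allowed, ord);
    }
    return out;
}
```

With this sketch, a policy over nodes 0-3 remapped from allowed 0-3 to
allowed 0-1 folds down to nodes 0-1, while a policy over nodes 0-1 remapped
from allowed 0-1 to allowed 0-3 stays at nodes 0-1. That is the asymmetry
being described: the mask shrinks but never expands.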


> Normally, MPOL_INTERLEAVE is used to reduce bus contention to improve the
> throughput of the application. If you remap the number of nodes to
> interleave over, which is currently how it's done when mems_allowed
> changes, you could actually be increasing latency because you're
> interleaving over the same bus.

Well, you may hit some nodes more than others, so there would be a slight
performance degradation.

> This isn't a memory policy problem because all it does is effect a
> specific policy over a set of nodes. With my change, cpusets are required
> to update the interleaved nodemask if the user specified that they desire
> the feature with interleave_over_allowed. Cpusets are, after all, the
> ones that changed the mems_allowed in the first place and invalidated our
> custom interleave policy. We simply can't make inferences about what we
> should do, so we allow the creator of the cpuset to specify it for us. So
> the proper place to modify an interleaved policy is in cpusets and not
> mempolicy itself.

With that, MPOL_INTERLEAVE would be context dependent and would no longer
need translation. Lee had similar ideas. Lee: Could we make
MPOL_INTERLEAVE generally cpuset context dependent?
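
As a rough illustration of what "context dependent" could mean here, the
sketch below picks the next interleave node directly from whatever the
allowed set is at allocation time, so no stored nodemask has to be
translated when the cpuset changes. This is only a userspace sketch under
that assumption: the nodemask type and next_interleave_node() are
hypothetical names, not existing kernel interfaces.

```c
#include <stdint.h>

/* Hypothetical 64-node mask for illustration; not the kernel's nodemask_t. */
typedef uint64_t nodemask;

/*
 * Return the next node after prev (round robin, wrapping) that is set in
 * the currently allowed mask, or -1 if the mask is empty. Pass prev = -1
 * for the first allocation.
 */
static int next_interleave_node(nodemask allowed, int prev)
{
    if (allowed == 0)
        return -1;

    for (int step = 1; step <= 64; step++) {
        int pos = (prev + step) % 64;

        if (allowed & (UINT64_C(1) << pos))
            return pos;
    }
    return -1;
}
```

If the allowed set is nodes 0-1, successive calls cycle 0, 1, 0, and so
on. If the cpuset then grows to nodes 0-3, the next call after node 1
returns node 2 without any remap step, because the only state carried
between allocations is the previous node.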
