Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

From: Rick Lindsley
Date: Thu Oct 07 2004 - 05:58:07 EST


> I don't see what non-exclusive cpusets buys us.

One can nest them, overlap them, and duplicate them ;)

For example, we could do the following:

Once you have the exclusive set in your example, wouldn't the existing
functionality of CKRM provide you all the functionality the other
non-exclusive sets require?

Seems to me, we need a way to *restrict use* of certain resources
(exclusive) and a way to *share use* of certain resources (non-exclusive).
CKRM does the latter right now, I believe, but not the former. (Does
CKRM support sharing hierarchies as in the dept/group/individual example
you used?)

What about this model:

* All exclusive sets exist at the "top level" (non-overlapping,
  non-hierarchical) and each is represented by a separate sched_domain
  hierarchy suitable for the hardware used to create the cpuset.
  I can't imagine anything more than an academic use for nested
  exclusive sets.

* All non-exclusive sets are rooted at the "top level" but may
  subdivide their range as needed in a tree fashion (multiple levels
  if desired). Right now I believe this functionality could be
  provided by CKRM.
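
To make the two rules above concrete, here is a rough sketch of the
invariants they imply. It is not taken from the cpuset patch or from CKRM;
the type and function names (cpumask_bits, excl_set, shared_set,
excl_sets_disjoint, shared_set_valid) are made up for illustration, and a
single unsigned long stands in for a real cpumask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types for illustration only; not the cpuset or CKRM API. */
typedef unsigned long cpumask_bits;      /* one bit per CPU */

struct excl_set {
    cpumask_bits cpus;                   /* CPUs owned by this exclusive set */
};

struct shared_set {
    const struct shared_set *parent;     /* NULL when rooted at the top level */
    cpumask_bits cpus;                   /* CPUs this non-exclusive set may use */
};

/* Rule 1: top-level exclusive sets must not overlap one another. */
bool excl_sets_disjoint(const struct excl_set *sets, size_t n)
{
    cpumask_bits seen = 0;

    for (size_t i = 0; i < n; i++) {
        if (sets[i].cpus & seen)
            return false;                /* some CPU is claimed twice */
        seen |= sets[i].cpus;
    }
    return true;
}

/* Rule 2: a non-exclusive set may only subdivide its parent's range. */
bool shared_set_valid(const struct shared_set *s)
{
    for (; s->parent != NULL; s = s->parent) {
        if (s->cpus & ~s->parent->cpus)
            return false;                /* uses a CPU the parent lacks */
    }
    return true;
}
```

With a dept/group/individual tree, shared_set_valid() walks from the
individual up to the root and rejects any level that reaches outside its
parent, while sibling non-exclusive sets remain free to overlap.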

Observations:

* There is no current mechanism to create exclusive sets; cpus_allowed
  alone won't cut it. A combination of Matt's patch plus Paul's
  code could probably resolve this.

* There is no clear policy on how to amiably create an exclusive set.
  The main problem is what to do with the tasks already there.
  I'd suggest they get forcibly moved. If a task's current cpus_allowed
  mask does not allow it to move, then a user process is killed,
  while a system process that cannot be moved stays and gains
  squatter's rights in the newly created exclusive set.

* Interrupts are not under consideration right now. They land where
  they land, and this may affect exclusive sets. If this is a
  problem, for now, you simply lay out your hardware and exclusive
  sets more intelligently.

* Memory allocation has a tendency and preference, but no hard policy
  with regard to where it comes from. A task which starts on one
  part of the system but moves to another may have all its memory
  allocated relatively far away. In unusual cases, it may acquire
  remote memory because that's all that's left. A memory allocation
  policy similar to cpus_allowed might be needed. (Martin?)

* If we provide a means for creating exclusive sets, I haven't heard
  a good reason why CKRM can't manage this.
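
As a rough illustration of the eviction policy suggested in the second
observation, the decision could be written as below. This is only a sketch
under simplifying assumptions: the names (cpu_bits, task_info, evict_action,
evict_policy) are hypothetical, a single unsigned long stands in for a real
cpumask, and "can move" is taken to mean that cpus_allowed includes at least
one CPU outside the new exclusive set (other exclusive sets are ignored).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration only; not kernel code. */
typedef unsigned long cpu_bits;          /* one bit per CPU */

enum evict_action {
    EVICT_MOVE,   /* migrate the task to a CPU outside the new set */
    EVICT_KILL,   /* user process with nowhere else to run */
    EVICT_STAY    /* system process that cannot move: squatter's rights */
};

struct task_info {
    cpu_bits cpus_allowed;               /* CPUs the task may run on */
    bool is_system;                      /* true for a system process */
};

/* Decide what happens to a task found inside a newly created exclusive set. */
enum evict_action evict_policy(const struct task_info *t, cpu_bits new_excl)
{
    /* Any allowed CPU outside the new set means the task can be moved. */
    if (t->cpus_allowed & ~new_excl)
        return EVICT_MOVE;

    /* Otherwise the task is pinned entirely inside the new set. */
    return t->is_system ? EVICT_STAY : EVICT_KILL;
}
```

A task allowed on CPUs 0-7 is simply moved when CPUs 0-3 become exclusive;
a user task pinned to CPUs 0-1 is killed, and a system task pinned to CPU 0
stays put.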

Rick