Re: 2.6.13-rc3-mm1 (ckrm)

From: Mark Hahn
Date: Sun Jul 17 2005 - 14:04:19 EST


> I suspect that the main problem is that this patch is not a mainstream
> kernel feature that will gain multiple uses, but rather provides
> support for a specific vendor middleware product used by that
> vendor and a few closely allied vendors. If it were smaller or
> less intrusive, such as a driver, this would not be a big problem.
> That's not the case.

yes, that's the crux. CKRM is all about resolving conflicting resource
demands in a multi-user, multi-server, multi-purpose machine. this is a
huge undertaking, and I'd argue that it's completely inappropriate for
*most* servers. that is, computers are generally so damn cheap that
the clear trend is towards dedicating a machine to a specific purpose,
rather than running, e.g., shell/MUA/MTA/FS/DB/etc. all on a single machine.

this is *directly* in conflict with certain prominent products, such as
the Altix and various less-prominent Linux-based mainframes. they're all
about partitioning/virtualization - the big-iron aesthetic of splitting up
a single machine. note that it's not just about "big", since cluster-based
approaches can clearly scale far past big-iron, and are in effect statically
partitioned. yes, buying a hideously expensive single box, and then chopping
it into little pieces is more than a little bizarre, and is mainly based
on a couple of assumptions:

- that clusters are hard. really, they aren't. they are not
necessarily higher-maintenance, can be far more robust, and usually
cost less. just about the only bad thing about clusters is
that they tend to take up somewhat more physical space.

- that partitioning actually makes sense. the appeal is that if
you have a partition to yourself, you can only hurt yourself.
but it also follows that burstiness in resource demand cannot be
overlapped without either constantly tuning the partitions or
infringing on the guarantee.
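
the burstiness point can be made concrete with a toy calculation. this is a sketch only: the demand traces and capacities are invented for illustration, and the function names are hypothetical. two tenants burst at different times on a box with 8 units of capacity; a static 4/4 split turns away demand on every tick, while an unpartitioned pool of the same size absorbs all of it (at the cost of there being no per-tenant guarantee).

```c
#include <stddef.h>

/* Demand that goes unserved when each tenant is capped at its own
 * fixed partition (cap_a for tenant a, cap_b for tenant b). */
static unsigned long unmet_partitioned(const unsigned long *a,
                                       const unsigned long *b, size_t n,
                                       unsigned long cap_a,
                                       unsigned long cap_b)
{
    unsigned long unmet = 0;
    for (size_t t = 0; t < n; t++) {
        if (a[t] > cap_a)
            unmet += a[t] - cap_a;
        if (b[t] > cap_b)
            unmet += b[t] - cap_b;
    }
    return unmet;
}

/* Demand that goes unserved when both tenants draw from one pool. */
static unsigned long unmet_shared(const unsigned long *a,
                                  const unsigned long *b, size_t n,
                                  unsigned long total)
{
    unsigned long unmet = 0;
    for (size_t t = 0; t < n; t++) {
        unsigned long sum = a[t] + b[t];
        if (sum > total)
            unmet += sum - total;
    }
    return unmet;
}

/* Made-up traces: the tenants burst on alternating ticks. */
static const unsigned long demand_a[4] = { 7, 1, 1, 7 };
static const unsigned long demand_b[4] = { 1, 7, 7, 1 };
```

with these traces, unmet_partitioned(demand_a, demand_b, 4, 4, 4) is 12 units, and unmet_shared(demand_a, demand_b, 4, 8) is 0.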

CKRM is one of those things that could be done to Linux, and will benefit a
few, but which will almost certainly hurt *most* of the community.

let me say that the CKRM design is actually quite good. the issue is whether
the extensive hooks it requires can be done (at all) in a way which does
not disproportionately hurt maintainability or efficiency.

CKRM requires hooks into every resource-allocation decision fastpath:
- if CKRM is not configured in (CONFIG off), the only overhead is
software maintenance.
- if CKRM is configured in but not loaded, the overhead is a pointer check.
- if CKRM is configured in and loaded, the overhead is a pointer check
and a nontrivial callback.
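
the three cases above can be sketched in plain C as follows. this is only an illustration of the general hook pattern, not code from the CKRM patch: CONFIG_RC_HOOKS, struct rc_ops, rc_charge() and the toy demo controller are all hypothetical names, and a real kernel hook would also need locking or RCU around the pointer.

```c
#include <stddef.h>

/* Hypothetical switch; remove this line to model "not configured in". */
#define CONFIG_RC_HOOKS 1

struct rc_ops {
    /* Returns 0 to allow the request, nonzero to refuse it. */
    int (*charge)(int class_id, unsigned long amount);
};

#ifdef CONFIG_RC_HOOKS
/* NULL until a controller is "loaded". */
static const struct rc_ops *rc_active;

static void rc_register(const struct rc_ops *ops) { rc_active = ops; }
static void rc_unregister(void) { rc_active = NULL; }

/* The hook that would sit in an allocation fastpath. */
static inline int rc_charge(int class_id, unsigned long amount)
{
    const struct rc_ops *ops = rc_active;

    if (!ops)       /* configured but not loaded: one pointer check */
        return 0;
    return ops->charge(class_id, amount);   /* loaded: check + callback */
}
#else
/* Not configured: the hook compiles away to "always allow". */
static inline int rc_charge(int class_id, unsigned long amount)
{
    (void)class_id;
    (void)amount;
    return 0;
}
#endif

/* Toy controller with two classes and made-up limits. */
static unsigned long demo_used[2];
static const unsigned long demo_limit[2] = { 100, 50 };

static int demo_charge(int class_id, unsigned long amount)
{
    if (demo_used[class_id] + amount > demo_limit[class_id])
        return -1;
    demo_used[class_id] += amount;
    return 0;
}

static const struct rc_ops demo_ops = { .charge = demo_charge };
```

before rc_register(&demo_ops) is called, every rc_charge() succeeds after a single NULL test; afterwards each call pays for the indirect call plus whatever accounting the controller does.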

but really, this is only for CKRM-enforced limits. CKRM really wants to
change behavior in a more "weighted" way, not just causing an
allocation/fork/packet to fail. a really meaningful CKRM needs to
be tightly integrated into each resource manager - affecting each scheduler
(process, memory, IO, net). I don't really see how full-on CKRM can be
compiled out, unless these schedulers are made fully pluggable.
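
to illustrate why "weighted" behavior lands inside the scheduler itself, here is a minimal sketch of a weight-aware pick. it is a generic proportional-share selection, not CKRM's algorithm; struct wclass and pick_next() are hypothetical names. the point is that the choice of what to run next depends on per-class state at every decision, so it cannot be bolted on as a pass/fail check at the edge.

```c
#include <stddef.h>

/* Per-class accounting; weights must be nonzero. */
struct wclass {
    unsigned long used;     /* service received so far */
    unsigned int  weight;   /* relative share */
};

/* Return the index of the class furthest below its weighted share,
 * i.e. the one with the smallest used/weight ratio. Ties go to the
 * lowest index. Requires n >= 1. */
static size_t pick_next(const struct wclass *c, size_t n)
{
    size_t best = 0;

    for (size_t i = 1; i < n; i++) {
        /* used_i / weight_i < used_best / weight_best,
         * cross-multiplied to avoid division. */
        if (c[i].used * c[best].weight < c[best].used * c[i].weight)
            best = i;
    }
    return best;
}
```

with two classes weighted 3 and 1, handing out 8 units of service one at a time via pick_next() gives them 6 and 2 units respectively.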

finally, I observe that pluggable, class-based resource _limits_ could
probably be done without callbacks and potentially with low overhead.
but mere limits don't meet CKRM's goal of flexible, widespread resource
partitioning within a large, shared machine.
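
a limits-only scheme of the kind described above might look like the sketch below: an inline counter compare with no function pointer involved. the names (struct rlim_class, class_try_charge, class_uncharge) are hypothetical, and the code assumes a single thread; real kernel code would need atomic operations or a lock around the counter.

```c
/* Per-class usage counter and hard limit; invariant: used <= limit. */
struct rlim_class {
    unsigned long used;
    unsigned long limit;
};

/* Charge `amount` to the class, or refuse if it would exceed the limit.
 * Written as limit - used so the comparison cannot overflow. */
static inline int class_try_charge(struct rlim_class *c, unsigned long amount)
{
    if (amount > c->limit - c->used)
        return -1;
    c->used += amount;
    return 0;
}

/* Give back previously charged resources, clamping at zero. */
static inline void class_uncharge(struct rlim_class *c, unsigned long amount)
{
    c->used -= (amount < c->used) ? amount : c->used;
}
```

this only ever says yes or no to a request; it does nothing to rebalance service between classes.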

regards, mark hahn.
