Re: [Lse-tech] [PATCH] cpusets - big numa cpu and memory placement

From: Paul Jackson
Date: Tue Oct 05 2004 - 21:49:55 EST


Matthew wrote:
>
> I feel that the actual implementation, however, is taking
> a wrong approach, because it attempts to use the cpus_allowed mask to
> override the scheduler in the general case. cpus_allowed, in my
> estimation, is meant to be used as the exception, not the rule.

I agree that big chunks of a large system that are marching to the beat
of two distinctly different drummers would better have their schedulers
organized along the domains that you describe, than by brute force abuse
of the cpus_allowed mask.

I look forward to your RFC, Matthew. Though not being a scheduler guru,
I will mostly have to rely on the textual commentary in order to
understand what it means.

Existing finer grain placement of CPUs (sched_setaffinity) and Memory
(mbind, set_mempolicy) already exists, and is required by parallel
threaded applications such as OpenMP and MPI are commonly used to
develop.

The finer grain use of non-exclusive cpusets, in order to support
such workload managers as PBS and LSF in managing this finer grained
placement on a system (domain) wide basis should not be placing any
significantly further load on the schedulers or resource managers.

The top level cpusets must provide additional isolation properties so
that separate scheduler and resource manager domains can work in
relative isolation. I've tried hard to speculate what these additional
isolation properties might be. I look forward to hearing from the CKRM
and scheduler folks on this. I agree that simple unconstrained (ab)use
of the cpus_allowed and mems_allowed masks, at that scale, places an
undo burden on the schedulers, allocators and resource managers.

--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@xxxxxxx> 1.650.933.1373
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/