Re: [Lse-tech] [patch] sched-domain cleanups, sched-2.6.5-rc2-mm2-A3

From: Erich Focht
Date: Wed Mar 31 2004 - 16:28:38 EST


On Tuesday 30 March 2004 17:01, Martin J. Bligh wrote:
> > I don't think it's worth waiting and hoping that somebody shows up with
> > a magic algorithm which balances every kind of job optimally.
>
> Especially as I don't believe that exists ;-) It's not deterministic.

Right, so let's choose the initial balancing policy on a per-process
basis.
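
To make "per-process policy" concrete, here is a minimal sketch of what
such a choice could look like. It is a toy model in Python, not kernel
code: the Task class, the policy names "local" and "spread", and the
function pick_initial_node are all hypothetical names invented for
illustration and do not correspond to anything in the patch.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    # Hypothetical per-process setting: "local" keeps a new task on its
    # parent's node, "spread" sends it to the least loaded node.
    policy: str = "local"


def pick_initial_node(task, parent_node, node_load):
    """Return the node index on which a newly created task should start.

    node_load is a list with the number of runnable tasks on each node.
    """
    if task.policy == "spread":
        # Least loaded node; ties go to the lowest node index.
        return min(range(len(node_load)), key=lambda n: node_load[n])
    # Default: stay next to the parent (and its memory).
    return parent_node


if __name__ == "__main__":
    load = [3, 1, 2, 1]
    print(pick_initial_node(Task("sh"), 0, load))
    print(pick_initial_node(Task("simulation", policy="spread"), 0, load))
```

With something like this, a shell command would keep the default and an
HPC job would opt in to spreading, so neither workload dictates the
behaviour of the other.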

> > Benchmarks simulating "user work" like SPECsdet, kernel compile, AIM7
> > are not relevant for HPC. In a compute center it actually doesn't
> > matter much whether some shell command returns 10% faster, it just
> > shouldn't disturb my super simulation code for which I bought an
> > expensive NUMA box.
>
> OK, but the scheduler can't know the difference automatically, I don't
> think ... and whether we should tune the scheduler for "user work" or
> HPC is going to be a hotly contested point ;-) We need to try to find
> something that works for both. And suppose you have a 4 node system,
> with 4 HPC apps running? Surely you want each app to have one node to
> itself?

If the machine is 100% full all the time and all apps demand the same
amount of bandwidth, then yes, I want one job per node. If the average
load is below 100% (sometimes only 2-3 jobs are running), I'd prefer to
spread the processes of a job across the machine: each process then
shares its node's memory bandwidth with fewer other processes, so the
average bandwidth per process is higher. Modern NUMA machines have high
bandwidth to neighboring nodes, and the latency penalty for remote
accesses is not too bad.
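
A toy calculation makes the trade-off concrete. All numbers here are
assumptions chosen for illustration, not measurements: 4 nodes, 10 GB/s
of memory bandwidth per node, 4 processes per job, and a factor of 0.8
standing in for the cost of remote accesses when a job is spread. The
model also assumes processes share bandwidth evenly.

```python
NODE_BW = 10.0           # GB/s per node (assumed)
NODES = 4                # nodes in the machine (assumed)
PROCS_PER_JOB = 4        # processes per job (assumed)
REMOTE_EFFICIENCY = 0.8  # assumed slowdown factor for a spread job


def packed_bw_per_process():
    """One job per node: a job's processes share a single node."""
    return NODE_BW / PROCS_PER_JOB


def spread_bw_per_process(jobs):
    """All processes of all jobs spread evenly over all nodes."""
    total_procs = jobs * PROCS_PER_JOB
    return NODE_BW * NODES / total_procs * REMOTE_EFFICIENCY


if __name__ == "__main__":
    for jobs in range(1, NODES + 1):
        print(jobs, packed_bw_per_process(), spread_bw_per_process(jobs))
```

Under these assumptions spreading wins with 1 to 3 jobs, because idle
nodes contribute their bandwidth, and packing wins once all 4 nodes are
busy, because only the remote-access penalty remains.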

Regards,
Erich
