Re: Remote fork() and Parallel programming

Larry McVoy (lm@bitmover.com)
Tue, 16 Jun 1998 13:17:41 -0700


: > Quick - name me one parallel computation that would leave a CPU idle
: > for 60 seconds while the other ones are busy on an SMP. The point?
: > All the parallel computations I've seen statically load balance at
: > initialization time and then never move.
:
: Realworld system load.. (there 1 second flat)..
:
: Think users a b c and d start up apps.. Suppose apps don't migrate.
:
: cpu1 cpu2
: a,b c,d
:
: Then users a and b's apps finish (they just ran ls) while c and d are
: running long standing computation..

Well, first of all, that's an answer to a different question. I asked
for one parallel computation, since that was the topic under discussion,
and you've replied with two unrelated applications.

But anyway, it's a good question. And the answer is still the same:
you leave them where they landed. I challenge you to show me an example
of what you are implying should happen (that c or d should move) that
doesn't just make the system thrash.

For time sharing loads, moving things around dynamically has been shown
to be a lose and I have independently verified it both for synthetic
loads like AIM and MusBus, and for real loads (multiple parallel makes).

I'd love to hear about real world examples that contradict my experience.
I'm sorry to say that I'm completely uninterested in theoretical examples,
which seem to be more plentiful, especially on the migration side of the
fence.

: We're just referring to SMP.. But, still.. On a system with 10000 users why
: not migrate vi.. You advocated remote exec..

Let's be clear about migrate. If by migrate, you mean "start this job up
on a less loaded node" then I'm 100% in agreement with you, that's a great
model and gets you most of the way to a balanced system with the least
amount of work.
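
As a rough sketch of that first meaning (placement at startup, never
moving a job afterwards), something like the following would do. The
names pick_node and place_jobs and the per-job "cost" number are made up
for illustration; a real system would use whatever load metric it has.

```python
def pick_node(loads):
    """Return the least loaded node; ties are broken by node name."""
    return min(sorted(loads), key=lambda node: loads[node])


def place_jobs(jobs, nodes):
    """Assign each (job, cost) pair to a node once, at start time.

    Nothing is ever moved after it has been placed.
    """
    loads = {node: 0 for node in nodes}
    placement = {}
    for job, cost in jobs:
        node = pick_node(loads)
        placement[job] = node
        loads[node] += cost      # hypothetical cost estimate, not measured
    return placement, loads


if __name__ == "__main__":
    jobs = [("a", 1), ("b", 1), ("c", 5), ("d", 5)]
    print(place_jobs(jobs, ["cpu1", "cpu2"]))
```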

If by migrate, you mean "let's take this already running vi job and move it",
I don't think that is a very good idea.

: > So explain to me what application will be nicely balanced for 30
: > minutes and then need to be moved to be balanced? How would one go
: > about designing such an application?
:
: You are thinking single purpose computers.

Yes and no. For parallel computations, the model I want is that if one
process/thread is running, they are all running, and nothing else is
running. When you switch out one process, then you shut down all of
them.

You could say that is single purpose, but throw gang scheduling into the
mix and it is no longer single purpose. Now each parallel computation
is a "gang" of processes, and when one process in a gang is running, they
all are running. You can switch between gangs and have multiple parallel
computations running, time sliced. The Linux on the AP 1000 showed
absolutely linear performance doing this: if you ran N jobs serially and
added up the time, then ran all N in parallel gang scheduled and added
up the time, you got the same number in both cases. However, if you
didn't gang schedule, the system thrashed horribly.
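
A toy model of why this works out, not the AP 1000 measurement itself.
It assumes the extreme case of a tightly coupled computation: a gang
only makes progress in a time slice if every node is running that gang
in that slice. The function names, the work units, and the random
"uncoordinated" scheduler are all assumptions made for illustration.

```python
import random


def gang_scheduled(work):
    """Slices needed when every node runs the same gang in each slice."""
    remaining = list(work)
    slices = 0
    while any(remaining):
        for g in range(len(remaining)):   # round-robin between gangs
            if remaining[g] > 0:
                remaining[g] -= 1         # whole gang is running: progress
                slices += 1
    return slices


def uncoordinated(work, nodes, seed=0):
    """Slices needed when each node picks a gang on its own."""
    rng = random.Random(seed)
    remaining = list(work)
    slices = 0
    while any(remaining):
        live = [g for g, left in enumerate(remaining) if left > 0]
        picks = [rng.choice(live) for _ in range(nodes)]
        if len(set(picks)) == 1:          # all nodes happen to agree
            remaining[picks[0]] -= 1
        slices += 1                       # otherwise the slice is wasted
    return slices


if __name__ == "__main__":
    work = [10, 20, 30]                   # work units per gang
    print("serial:        ", sum(work))
    print("gang scheduled:", gang_scheduled(work))
    print("uncoordinated: ", uncoordinated(work, nodes=4))
```

In this model the gang scheduled total always equals the serial total,
and the uncoordinated total can never be smaller, since at most one gang
can progress per slice.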

If you want to add regular load into the mix, that's fine too: when the
scheduler on one node switches from a ganged process to some random vi
job or whatever, it shuts down the rest of the gang (letting whatever
random processes there are on the other nodes get a time slice at the
same time).
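
Sketched very roughly, and simplified so that local work gets one
dedicated slot in the rotation rather than being triggered by one node's
scheduler, the per-slice assignment might look like this. The function
slice_assignment and the job names are hypothetical.

```python
def slice_assignment(t, gangs, local_jobs):
    """Return what each node runs during time slice t.

    gangs      -- list of gang names, one process per node each
    local_jobs -- one list of ordinary (non-gang) jobs per node
    """
    slots = list(gangs) + [None]          # None marks the local-work slice
    current = slots[t % len(slots)]
    if current is not None:
        # A gang slice: every node runs its member of the same gang.
        return [f"{current}.{node}" for node in range(len(local_jobs))]
    # The whole gang is switched out; each node runs its own local job.
    return [jobs[0] if jobs else "idle" for jobs in local_jobs]


if __name__ == "__main__":
    local = [["vi"], [], ["make"]]
    for t in range(4):
        print(t, slice_assignment(t, ["g0", "g1"], local))
```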

At this point, I think it is pretty hard to view it as a single purpose
computer.

So, I go back to the same question:

: > So explain to me what application will be nicely balanced for 30
: > minutes and then need to be moved to be balanced? How would one go
: > about designing such an application?
