Re: Remote fork() and Parallel programming

yodaiken@chelm.cs.nmt.edu
Sun, 14 Jun 1998 14:57:06 -0600


On Mon, Jun 15, 1998 at 01:43:54AM +0330, mshar@vax.ipm.ac.ir wrote:
> What wrong opinion about DSM? Please refer to DIPC to know of my opinions
> about DSM.

It's inherently slow. To paraphrase St. Seymour:
Ok(slow) -> !NeedComputer

> For process migration, a simple load-measurement will do for the first
> implementations. All computers are polled periodically, and the jobs are
> migrated if some thresholds are exceeded in a machine. Because of the hint
> mechanism, the application programmer can inform the system not to migrate
> processes that are short lived, or that use many local resources.

A) Data shows this does not work. See thirty years of literature on
process migration.
B) Why are "hints" good, but user space directives bad?

> >x[0] = 1;
> >if( remote_fork())
> > while(x[0] == 1); /*where x points to distributed shared memory */
> >else x[0] = 0;
> >
> >Works one way for real shared memory, another way for DSM. How do
> >you fix it?
>
> It works perfectly when using DSM with strict consistency, but it could be
> slow. Like many OS text books, I'd tell the programmer to use semaphores

So the user notes that mysterious changes in performance happen under
the process migration/DSM system, while the programmer in the MPI world
gets predictable speedups. MPI wins.

And most OS textbooks are nonsense. If you can use semaphore synchronization,
why do you need shared memory? Just send data in messages.
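For instance, here is a rough sketch of the same parent/child exchange done
with explicit messages (MPI here; the ranks, tag, and payload are only
illustrative). Every transfer over the network shows up as a call in the
code, so the programmer can see exactly where the cost is:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
	int rank, x = 1;
	MPI_Status status;

	MPI_Init(&argc, &argv);
	MPI_Comm_rank(MPI_COMM_WORLD, &rank);

	if (rank == 0) {
		/* "parent": block until the peer sends the cleared value */
		MPI_Recv(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &status);
		printf("x is now %d\n", x);
	} else if (rank == 1) {
		/* "child": clear x and ship it back -- one visible message */
		x = 0;
		MPI_Send(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
	}

	MPI_Finalize();
	return 0;
}

No shared flag, no spinning, and no guessing about where the communication
happens.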

> Yes, synchronizing via DSM can kill a program. A bit of programming
> discipline is all that is needed to mitigate such problems. I believe the
> advantages of DSM by far outweigh the disadvantages.

But DSM hides the difference. And DSM with process migration makes it
impossible to predict performance.

> >Only if the OS can properly make the cost tradeoff calculation. How
> >does it do that?
>
> It depends on many factors, like the network speed, other computers' speeds,
> their current loads, etc. The algorithms that will decide on migration
> should be tuned gradually to have an acceptable output.

In other words, you have no idea.

> If the user sees a drop in performance, after allowing process migration in
> one run of the program (an error in the migration-decision algorithm), then
> they will allow the program to run only locally the next time. This works
> because most programs are executed many times.

If the program can run locally, why use the network at all?

> You really think so? _Any_ elementary OS text book informs its readers that
> using shared memory for synchronization is bad even in a single computer.
> This is called busy waiting. So the answer is: use mechanisms like semaphores
> and messages for this.

Elementary OS textbooks, as a rule, are hand-waving gibberish.
Consider, for example, a circular linked list containing live measurement
data that is collected by process A and displayed by process B, where
stale data is simply overwritten by A. Trivial with shared memory.
No busy waiting at all.
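A minimal sketch of that scheme, assuming POSIX mmap() and fork() on Linux
(the slot count and "measurement" values are made up, and a production
version would want real memory barriers rather than volatile):

#include <stdio.h>
#include <unistd.h>
#include <sys/mman.h>

#define SLOTS 16

struct ring {
	volatile unsigned head;		/* index of the newest sample        */
	volatile double data[SLOTS];	/* A overwrites stale slots in place */
};

int main(void)
{
	struct ring *r;
	unsigned i;

	/* one shared, anonymous mapping visible to parent and child */
	r = mmap(0, sizeof *r, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (r == MAP_FAILED)
		return 1;

	if (fork() == 0) {
		/* process A: collect measurements, overwriting the oldest slot */
		for (i = 1; i <= 1000; i++) {
			r->data[i % SLOTS] = i * 0.5;	/* made-up sample       */
			r->head = i % SLOTS;		/* publish newest index */
			usleep(1000);
		}
		_exit(0);
	}

	/* process B: display whatever is newest, at its own pace */
	for (i = 0; i < 10; i++) {
		printf("latest sample: %g\n", r->data[r->head]);
		usleep(100000);
	}
	return 0;
}

B never waits for A; it simply reads the most recent slot whenever it wants
to redraw the display.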

---------------------------------
Victor Yodaiken
Department of Computer Science
New Mexico Institute of Mining and Technology
Socorro NM 87801
Homepage http://www.cs.nmt.edu/~yodaiken
PowerPC Linux page http://linuxppc.cs.nmt.edu
Real-Time Page http://rtlinux.org

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu