Re: Remote fork() and Parallel programming

Larry McVoy (lm@bitmover.com)
Sun, 14 Jun 1998 22:33:10 -0700


Let's move this discussion to the clusters alias, shall we?

---------

: The second is a pseudo-NUMA platform; here process migration might be a
: win.

: Switched gigabit Myrinet doesn't compare too badly to the 60ns EDO RAM and
: 64-bit memory bus. Sure, it's slower, but by what? A factor of 3, maybe?

SGI Origins are NUMA machines. Local memory is about 400ns from the
processor; remote memory is 400ns plus 100ns per hop. It's a hypercube,
so only very large systems are more than one hop away. Next gen has half
the latency.

PCs have main memory latencies on the order of 170ns.

The fastest TCP latency I know of is about 80,000ns. The fastest Unet
(no protocols, network mapped into the process' address space) latency
is about 30,000ns.

That looks more like a factor of 175 slower to me (30,000ns against
170ns). Not to mention that getting some memory costs 1 CPU instruction
and two bus transactions. How many CPU instructions do you think it takes
to receive a packet? Handle an interrupt? Go into the kernel?
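
For anyone who wants to redo the arithmetic, here is a rough sketch in
Python. The latency figures are the ones quoted in this message; the
constant and function names are made up for illustration.

```python
# Latency figures quoted in this message, all in nanoseconds.
PC_LOCAL_NS = 170        # PC main memory
ORIGIN_LOCAL_NS = 400    # SGI Origin local memory
ORIGIN_HOP_NS = 100      # extra cost per hop to remote memory on Origin
TCP_NS = 80_000          # fastest TCP latency cited
UNET_NS = 30_000         # fastest Unet latency cited


def origin_remote_ns(hops):
    """Remote memory latency on an Origin that is `hops` hops away."""
    return ORIGIN_LOCAL_NS + ORIGIN_HOP_NS * hops


def slowdown(remote_ns, local_ns=PC_LOCAL_NS):
    """How many times slower a remote access is than a local one."""
    return remote_ns / local_ns


if __name__ == "__main__":
    print("Unet vs PC memory:   %.0fx" % slowdown(UNET_NS))
    print("TCP vs PC memory:    %.0fx" % slowdown(TCP_NS))
    print("Origin 1 hop vs local: %.2fx"
          % slowdown(origin_remote_ns(1), ORIGIN_LOCAL_NS))
```

The point of the comparison: one hop on a real NUMA machine costs a
fraction more than local memory, while a network round trip costs a
couple of orders of magnitude more.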

: Imagine an AGP SAN card that runs at full AGP speed. That device could move
: data at the speed our RAM currently does. I'm sure such cards could be
: available within 5 years, if there was the appropriate demand.

You can do the /bandwidth/ right now. SGIs have HIPPI cards that go
at 100MB/sec sustained. Next gen is 800MB/sec sustained.

It's not the bandwidth that's the issue, it's the latency. Bandwidth is
easy. Latency is hard. DSM systems /all/ die because of latency issues.
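
A rough sketch of why, using the usual first-order model (time = fixed
latency + size / bandwidth). This is an illustration, not a measurement:
it assumes a 64-byte transfer (a hypothetical cache-line-sized fetch, as
a DSM system might do) and pairs the 30,000ns Unet latency with the
HIPPI bandwidth numbers, which come from different hardware. The
function name is made up.

```python
def transfer_ns(size_bytes, latency_ns, bandwidth_mb_per_s):
    """First-order cost of one transfer: fixed latency plus wire time.

    1 MB/sec is 1e6 bytes/sec, so wire time in ns is
    size_bytes * 1000 / bandwidth_mb_per_s.
    """
    wire_ns = size_bytes * 1000 / bandwidth_mb_per_s
    return latency_ns + wire_ns


if __name__ == "__main__":
    LATENCY_NS = 30_000   # Unet latency from this message
    SIZE = 64             # assumed transfer size in bytes (hypothetical)

    slow = transfer_ns(SIZE, LATENCY_NS, 100)   # 100MB/sec link
    fast = transfer_ns(SIZE, LATENCY_NS, 800)   # 800MB/sec link
    print("100MB/sec: %.0f ns" % slow)
    print("800MB/sec: %.0f ns" % fast)
    print("speedup from 8x bandwidth: %.3fx" % (slow / fast))
```

Under those assumptions, eight times the bandwidth buys less than 2% on
a small transfer, because the fixed latency term swamps the wire time.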

It is my claim that 100% of the DSM systems could have been shown to be
a bad idea if the designers had sat down and measured the local versus
remote memory latency. Numbers talk. And DSM numbers just show you that
it isn't a very useful idea.

: If you say that process migration is okay on SMP, then it must be okay on
: a cluster if the cluster's interconnect bandwidth is like that of SMP.

Process migration on SMP is a horrible idea. As I have mentioned
repeatedly, read any one of the dozens of papers on cache affinity.
They all show that in almost all cases, the absolute worst thing you
could do for performance is to reschedule a process on another CPU.
