Re: Remote fork() and Parallel Programming

mshar@vax.ipm.ac.ir
Wed, 17 Jun 1998 13:58:50 +0330


Hi,

Distributed programming has traditionally been a very specialized and
"private" branch of computer science. Usually only very big organizations or
universities are/were fortunate enough to have access to custom-designed
hardware suitable for distributed programming. Such hardware is/was
very expensive and mostly used to solve big problems (that is how the costs
are justified). Getting good performance is of paramount importance
here; ease of programming is placed very low on the list of priorities.
Such organizations are willing to hire very talented programmers and spend
good money on training and program development. Programs are tuned for
special hardware, and redesigning them is usually very hard.

Most of the current tools and mentality for distributed programming stem
from these environments. We can call this E1 (environment 1). In E1,
distributed programs are nearly always parallel programs. Here developing
and maintaining applications is very costly and hard. The number of programmers
willing and able to enter this environment is not very high. Ease of
programming can be neglected because these programmers can be trained for
the job. The programmers are _expected_ to work a lot on distributed
application development.

I think the E1 mentality can not be applied to the rapidly changing (read:
improving) world of distributed computing.

Think about all those networked PCs and workstations all over the world.
They are used in offices, universities, and even homes (call them E2). Each
of these computers is usually used by a single person, and most of the time
they have nothing to run, which means they waste resources. They mostly
use state-of-the-art hardware and networking technology (many people
frequently upgrade their PCs, and gigabit Ethernet is coming fast). This is
an _enormous_ source of computing power! It would be very interesting if we
could harness their collective power by forming clusters.

If this is done, then suddenly thousands of developers can start writing
distributed programs. We can then hope for scores of new, useful applications;
something we cannot have if distributed programming stays in the domain of a
minority.

A good analogy can be found in the realm of programming languages. At one
time there was a very small number of programmers in the world, and
they programmed computers in machine language or assembly. But look at the
situation now. Do you think there would be this many programmers and useful
software if we had continued to use assembly language instead of higher level
languages? How many people were able to learn and use assembly? How much
harder would it be to develop some very big applications we are using today?

Assembly language gives programmers a lot of control over their programs'
behaviour, and so they can write very efficient programs. Higher level
languages (like Pascal and Prolog) provide a higher level of abstraction, and
create a view of the system that can be very different from the actual
hardware. This does result in some inefficiency: a compiled high level
program is generally less efficient than the same program written directly
in assembly. In spite of all this, the trend has been to _add_ to the
abstraction. Look at Delphi and Visual Basic as examples.

Programmers still _can_ use assembly if they think they have to. Providing
higher levels of abstraction does not prohibit us from using the blocks that
were employed to build those abstractions. There is nothing mutually
exclusive here.

The case in the move from normal systems to distributed clusters is very
similar. Do we want to have applications that use more than one computer in
a cluster to solve a problem or not? If no, then there is little sense in
building a cluster in the first place and the discussion ends, but if the
answer is yes, then we have to think about a suitable programming model,
because the presence of a network introduces many new complications into
programming. Common sense tells us that the closer a model is to what people
are used to, the easier it is to use.

Most of us are trained to think of shared memory as the main mechanism of
data exchange. We use global variables or the arguments to procedures to
let different parts of our application receive data and later return
results. I know anybody can learn to use other methods of programming,
but considering the number of people we are talking about here, it is better
to keep the conventional techniques of programming as much as possible.

Some of what we achieve by doing so is:

*) Application programmers don't need to learn new programming mentalities
and techniques.

*) We can continue to use many of the conventional algorithms.

*) The source code of normal and distributed applications will be very
similar. This will be a great help in debugging and maintaining
distributed programs (and we are thinking of thousands of such applications).

Object Oriented Programming is one model to use for distributed application
development. It has some rather nice properties, but here we are talking
about Linux and its thousands of applications. Linux is not object oriented,
so we can leave it out of consideration. Now we come to the message
passing vs DSM arguments. Here is what we observe:

*) For distributed programming to make sense, we _have_ to transfer data
over the network. We _have_ to tolerate the difference in speed between a
network and a local computer bus. This is an inherent property that has
_nothing_ to do with the programming model we use.

*) Networks are becoming faster every day, and they do reach our PCs. Even
terabit networks _do_ exist. Latency is becoming the dominant factor in
transfer times. In other words, the time of the actual data transfer is
starting to become negligible.

*) Messages (in the sense of TCP/IP packets) can be considered the primitive
method of data transfer in a network. Other methods, like PVM's messages or
DIPC's shared memories, are implemented on top of this mechanism. PVM and
DIPC both use TCP/IP to provide some abstractions of the underlying
hardware. PVM's messages are not very different from TCP/IP's packets in the
way programmers use them (both are essentially routines that take some
arguments and transfer data), but PVM's messages offer many services that
TCP/IP does not. Examples include allowing the application to use logical
computer addresses instead of IP addresses, or converting the data to a
representation suitable for the receiver. This eases the
work of a distributed application programmer a lot. It is no wonder that so
many people prefer systems like PVM and MPI to raw TCP/IP.
Obviously a PVM message still has to cross the network, which means the
program initiating the transfer has to tolerate the network latency and the
time required to transfer the data.

A message passing programmer has to worry about what the application is
supposed to do in the first place. He also has to worry about making sure
that the data needed for a computation is transferred to the right computer
at the right time. He has to do such transfers explicitly, meaning that most
probably the source code will be very different from the case where
no such explicit data movements were needed.

DSM is another step up the abstraction ladder. It completely hides the
message passing requirements of a cluster, and allows the programmer to
develop his distributed application the same way as a normal application.
The developer continues to use shared memory to transfer data between
different parts of the application, and can mostly forget about the presence
of a network. I say "mostly forget", because the fact that the programmer
does not see a network will not cause it to vanish. The same costs of
transferring data to other computers are still there, so certain techniques,
like busy waiting, will result in poor performance: although the programmer
cannot see it, his application is generating a lot of network traffic. Thus
it is very clear that DSM cannot relieve the programmer of all concerns about
distributed programming, but it is still useful, because it makes the average
programmer's job very easy and also produces a more readable and maintainable
program.

Message passing is to DSM what assembly is to a high level language.
Assembly language has its own advantages, but because higher
level languages are easier to learn and program with, most people prefer
them. In fact, the trend has always been to provide even higher level
languages. Write a small program in Delphi or Visual Basic and you get a
_very_ big compiled executable. For most people it does not matter much that
their executable is well over 100KB, while if written in assembly, it would
be a few hundred bytes.

Still, no one can be forced to use a high level language. If someone thinks
that using assembly language is the answer to his application's needs, then
he can very well do so. The same is true for DSM. You want to simulate an
atomic explosion and need every bit of resource you've got? No problem, pay
a lot of money to build custom hardware and hire good programmers; then wait
till the application is designed and implemented.

But not all those people with access to a cluster of PCs or workstations
need all the performance their computers can give them. Most can live with
some overhead and instead get ease of programming. In fact, in most cases, if
there is no easy way to program a cluster, then there will be no cluster at
all.

If we offer only the harder programming models, then we are depriving
ourselves of a lot of useful applications that could be developed by
thousands of programmers who can't or don't want to learn unfamiliar
programming models.
This is a waste of both programming talent and computer equipment (no
program, no resource usage).

The arguments about the merits of having transparent process migration
follow from the same way of thinking: that applications should not worry
about where they are running. The programmer should not have to get involved
in checkpointing and restarting his application. These issues add to the
visible complexity of a distributed program. This will make developing
distributed applications harder; something I and a lot of other people don't
like at all.

No method I know of is perfect, but some methods are preferable to others
according to one's priorities. For me, it is ease of programming.
Experience in the computer industry has shown that one need not worry too
much about hardware performance and its rate of improvement. It is software
development that needs consideration because a hard-to-use programming model
does not improve every two years or so. As evidence, just consider that
distributed programming is a rather old branch of computer science, but it
has not yet found widespread usage by application programmers. It will
probably remain so as long as people consider distributed programming as
something that _should_ have the active involvement of the programmer in the
process of making the application work in a distributed environment.

I think "we" having thousands of distributed applications available for our
PC clusters that are not 100% efficient in their resource usage is better
than "they" having a few very efficient, high performance distributed
programs running in national laboratories.

Everything is up to the Linux community.

-Kamran Karimi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu