Re: Interesting Technical Question...

Larry McVoy (lm@neteng.engr.sgi.com)
1 Mar 1997 05:01:38 GMT


This is called checkpoint/restart, and it has been done in other Unix
systems for some time. It is mostly used in the supercomputing world.
If you go off to implement it, a few words of advice:

. Look at Unicos (I'll post man pages if you like); I think that
  one is widely used (for some definition of widely).
. Sockets are hard to impossible. However, there is a useful hack
  that sometimes works: checkpoint both ends of the socket at the
  same time. For cluster-based applications, where you control
  both ends, this works fine.
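
To make the idea concrete, here is a minimal application-level sketch in
Python. It is not the kernel-level checkpoint/restart that Unicos provides,
and it does nothing about sockets or other open descriptors; it only shows
the basic save-state/reload-state cycle for a program that knows its own
state. The names save_checkpoint, load_checkpoint, and run, and the
stop_after parameter used to fake a reboot, are all made up for illustration.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write state to path atomically: dump to a temp file, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the data to disk before the rename
    os.replace(tmp, path)  # a crash leaves either the old or the new file


def load_checkpoint(path, default):
    """Return the saved state, or default if no checkpoint exists yet."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return default


def run(path, total_steps, stop_after=None):
    """Sum 0..total_steps-1, checkpointing after every step.

    stop_after (hypothetical) simulates a reboot: the function returns
    None after that many steps, leaving the checkpoint on disk.
    """
    state = load_checkpoint(path, {"step": 0, "acc": 0})
    done = 0
    while state["step"] < total_steps:
        if stop_after is not None and done >= stop_after:
            return None  # "machine went down"; state is safe on disk
        state["acc"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
        done += 1
    return state["acc"]
```

Calling run("job.ckpt", 10, stop_after=4) and then run("job.ckpt", 10)
picks up at step 4 instead of starting over. A real implementation has to
capture memory, registers, and open files from outside the process, which
is where the hard parts (sockets especially) come in.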

Kirk Bauer (kirk@kaybee.gt.ed.net) wrote:
: Today, I was realizing that it would be nice to be able to take
: a given process and basically save it to disk. This way, if you
: have some big process that has been running for weeks and you must
: reboot for some reason, you can save that process to disk and then
: reboot. Once rebooted, you could reload that process from its saved
: image...

--
---
Larry McVoy     lm@sgi.com     http://reality.sgi.com/lm     (415) 933-1804