Re: Linux Checkpoint-Restart - v19

From: Jiro SEKIBA
Date: Mon Mar 15 2010 - 04:55:36 EST


Hi,

I'm trying to evaluate external checkpoint/restart with the cr-v19 kernel.
However, when I restart, I get a "Killed" message on stdout.
Do you have any tips or clues that are not in
Documentation/checkpoint/usage.txt?

I'm using the kernel pulled from
git://git.ncl.cs.columbia.edu/pub/git/linux-cr.git,
with the tag "ckpt-v19" checked out. The base distro is Ubuntu 9.10.

I ran the self checkpoint/restart sample program in Documentation/checkpoint.
It works as described in usage.txt.
However, I cannot make external checkpoint/restart work properly.

I wrote the simple test program below and created a checkpoint externally using
the program in Documentation/checkpoint/; the checkpoint file appears to be
created properly.
However, when I ran self_restart < ckpt.image, I got a "Killed" message.
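
A shell normally prints "Killed" when a foreground job dies from SIGKILL, and
it reports such a death as exit status 128 + signal number. A minimal sketch
for confirming that, assuming a POSIX shell; describe_status is a hypothetical
helper name, not part of the checkpoint tools, and a program that itself exits
with a status above 128 would be misreported:

```shell
# describe_status STATUS: print "signal N" for a death-by-signal exit status,
# otherwise "exited STATUS". Hypothetical helper for illustration only.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # Shells report termination by signal N as 128 + N.
        echo "signal $((status - 128))"
    else
        echo "exited $status"
    fi
}

# Usage on the cr kernel (not run here):
#   ./self_restart < ckpt.image; describe_status $?
#   dmesg | tail    # the kernel log may contain a hint about the kill
```

A result of "signal 9" would match the "Killed" message.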

Is there any extra configuration needed other than the cgroup freezer and
checkpoint/restart?
Or any limitation other than closing stdin, stdout and stderr?
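
For reference, a small sketch of how the two options mentioned above could be
checked against the running kernel's config file. The option name
CONFIG_CHECKPOINT is an assumption taken from the "CHECKPOINT=n and =y" remark
in the quoted announcement; CONFIG_CGROUP_FREEZER is the mainline name. The
function names are hypothetical, and the /boot/config-$(uname -r) path in the
usage line is the usual Ubuntu location:

```shell
# config_enabled FILE OPTION: succeed if FILE contains the line OPTION=y.
config_enabled() {
    grep -q "^$2=y\$" "$1"
}

# check_cr_config FILE: report the state of each option assumed relevant.
check_cr_config() {
    for opt in CONFIG_CHECKPOINT CONFIG_CGROUP_FREEZER; do
        if config_enabled "$1" "$opt"; then
            echo "$opt: enabled"
        else
            echo "$opt: missing"
        fi
    done
}

# Usage (not run here):
#   check_cr_config "/boot/config-$(uname -r)"
```

This only checks for built-in (=y) options; whether other options matter is
exactly the open question.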

What I did is the following:

# mount -t cgroup -o freezer cgroup /cgroup
# mkdir /cgroup/0
..
# ./test &
# PID=$(ps | grep test | cut -f 2 -d' ')
# echo $PID > /cgroup/0/tasks
# sleep 3
# echo FROZEN > /cgroup/0/freezer.state
# ./checkpoint $PID > ckpt.image
# mv /tmp/test.out /tmp/test.out.orig
# cp /tmp/test.out.orig /tmp/test.out
# echo THAWED > /cgroup/0/freezer.state
# ./self_restart < ckpt.image
Killed
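
Two details of the sequence above may be worth ruling out. The ps/grep/cut
pipeline depends on how ps pads the PID column, whereas $! gives the PID of
the background job directly. Also, freezer.state can read FREEZING for a while
after FROZEN is written, so polling before the checkpoint is safer than
assuming the freeze is immediate. A minimal sketch, assuming a POSIX shell;
wait_for_state is a hypothetical helper name:

```shell
# wait_for_state DIR STATE [TRIES]: poll DIR/freezer.state once per second
# until it reads STATE; give up after TRIES attempts (default 10).
wait_for_state() {
    dir=$1
    want=$2
    tries=${3:-10}
    while [ "$tries" -gt 0 ]; do
        if [ "$(cat "$dir/freezer.state")" = "$want" ]; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Usage on the cr kernel (not run here):
#   ./test &
#   PID=$!                                  # no ps parsing needed
#   echo "$PID" > /cgroup/0/tasks
#   echo FROZEN > /cgroup/0/freezer.state
#   wait_for_state /cgroup/0 FROZEN || echo "not frozen yet" >&2
#   ./checkpoint "$PID" > ckpt.image
```

Neither point is confirmed as the cause of the "Killed" message; they only
remove two possible sources of error from the procedure.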

----- test.c -----
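#include <stdio.h>   /* FILE, fopen, fprintf, fflush, fclose */
#include <unistd.h>  /* close, sleep */

int main(void)
{
    FILE *fp;
    int i;

    /* Close the standard descriptors before being checkpointed. */
    close(0);
    // close(1); // I got SEGV when I uncomment this line, when restarting
    close(2);

    fp = fopen("/tmp/test.out", "w+");

    /* Write one number per second so progress is visible in the file. */
    for (i = 0; i < 10; i++) {
        fprintf(fp, "%d\n", i);
        fflush(fp);
        sleep(1);
    }

    fclose(fp);
    return 0;
}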
----- test.c -----
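
Since the program writes 0 through 9, one line per second, a successful
restart would presumably leave /tmp/test.out holding that unbroken sequence.
A small sketch for checking this, assuming a POSIX shell with awk;
is_counting_output is a hypothetical helper name:

```shell
# is_counting_output FILE: succeed if FILE holds 0, 1, 2, ... one per line
# with no gaps; fail on a gap, a non-matching line, or an empty file.
is_counting_output() {
    awk '$0 != NR - 1 { bad = 1; exit }
         END { exit (bad || NR == 0) }' "$1"
}

# Usage after a restart (not run here):
#   is_counting_output /tmp/test.out && echo "sequence intact"
```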

Thank you very much in advance

2010/2/23 Oren Laadan <orenl@xxxxxxxxxxxxxxx>:
> Hi Andrew,
>
> We've put a stake in the ground for our next set of checkpoint/restart
> patches, v19. It has some great new stuff, and we put extra effort to
> address your concerns. We would like to have the code included in -mm
> for wider feedback and testing.
>
> This one is able to checkpoint/restart screen and vnc sessions, and
> live-migrate network servers between hosts. It also adds support for
> x86-64 (in addition to x86-32, s390x and powerpc). It is rebased to
> kernel 2.6.33-rc8.
>
> Since one of your main concerns was about what is not yet implemented
> and how complicated or ugly it will be to support that, we've put up
> a wiki page to address that. In it there is a simple table that lists
> what is not implemented and the anticipated solution impact, and for
> some entries a link to more details.
>
> The page is here:   http://ckpt.wiki.kernel.org/index.php/Checklist
>
> We want to stress that the patchset is already very useful as-is. We
> will keep working to implement more features cleanly. Some features we
> are working on include network namespaces and device configurations,
> mounts and mount namespaces, and file locks. Should a complicated
> feature prove hard to implement, users have alternative systems like
> kvm, until we manage to come up with a clean solution.
>
> We believe that maintenance is best addressed through testing. We now
> have a comprehensive test-suite to automatically find regressions.
> In addition, we ran LTP and the results are the same with CHECKPOINT=n
> and =y.
>
> If desired we'll send the whole patchset to lkml, but the git trees
> can be seen at:
>
>  kernel:       http://www.linux-cr.org/git/?p=linux-cr.git;a=summary
>  user tools:   http://www.linux-cr.org/git/?p=user-cr.git;a=summary
>  test suite:   http://www.linux-cr.org/git/?p=tests-cr.git;a=summary
>
> Thanks,
>
> Application checkpoint/restart team
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>