Re: [Ksummit-2010-discuss] checkpoint-restart: naked patch

From: Christoph Hellwig
Date: Tue Nov 02 2010 - 17:50:17 EST


Thanks Tejun,

your writeup brought up a lot of the same issues that I see with
the in-kernel C/R. Various C/R implementations that are entirely
in userspace or with limited kernel assistance have been in production
in HPC environments for years. I think especially for these workloads
C/R is an extremely useful feature, and a standard implementation would
do Linux well.

But I think the "transparent" in-kernel one is the wrong approach. It
tries to give the illusion that C/R will just work, while a lot of
things are simply not supported. In this case whitelisting the allowed
state by requiring special APIs for all I/O (or even just standard
APIs as long as they are supported by the C/R lib you're linked against)
is the more pragmatic and, I think, faithful approach. In addition to
the amount of state not supported despite looking transparent, the
other big problem with the patchset is that it saves the kernel internal
state which changes all the time from one release to another. The
hand-waving is that a userspace tool will solve it. I'm pretty sure
that's not the case; it might solve a few cases but the general
version n to version m conversion is impossible to maintain. Just look
at the problems qemu has migrating between just a handful of versions
of the relatively well (compared to random kernel state) defined vmstate
format.

