Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

From: Serge E. Hallyn
Date: Thu Feb 12 2009 - 15:48:54 EST

In reply to: Dave Hansen: "Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart"
Next in thread: Ingo Molnar: "Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart"

Quoting Dave Hansen (dave@xxxxxxxxxxxxxxxxxx):
> Patch 12/14 is supposed to address this *concept*. But, it hasn't been
> carried through so that it currently works. My expectation was that we
> would go through and add things over time. I'll go make sure I push it
> to the point that it actually works for at least the simple test
> programs that we have.
>
> What I will probably do is something BKL-style. Basically put a "this
> can't be checkpointed" marker over most everything I can think of and
> selectively remove it as we add features.

So the question is: when can we unset the uncheckpointable flag?

In your patch you suggest clone(CLONE_NEWPID). But that would
require that, at that point, we do a slew of checks for other
things, like open files of a type which is not supported.

I'm wondering whether we should instead stick to calculating
whether a task is checkpointable or not at checkpoint time.
To help an application figure out whether it can be checkpointed,
we can hook /proc/$$/checkpointable to the same function, and
have the file output list all of the reasons the task is not
checkpointable, e.g.:

mmap MAP_SHARED file which is not yet supported
open file from another mounts namespace
open TCP socket which is not yet supported
open epoll fd which is not yet supported
TASK NOT FROZEN

So now we have to do all these checks every time we do a
checkpoint, but that's better than doing them at clone time.
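
To make the checkpoint-time idea concrete, here is a rough userspace
sketch of one function that both decides checkpointability and
produces the list of reasons. This is not kernel code and not from the
patch set: struct task_model, struct resource, the res_kind enum, and
checkpointable_reasons() are all names made up for illustration. The
real thing would walk the task's actual files and mappings.

```c
/* Userspace model only. Every type and function name below is
 * hypothetical; none of them exist in the kernel or the patch set. */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum res_kind {
    RES_REGULAR_FILE,
    RES_MMAP_SHARED,
    RES_TCP_SOCKET,
    RES_EPOLL_FD,
};

struct resource {
    enum res_kind kind;
    bool foreign_mnt_ns;    /* opened in another mounts namespace */
};

struct task_model {
    bool frozen;
    const struct resource *res;
    size_t nres;
};

/* Append one reason line to buf; output is truncated if buf is full. */
static void add_reason(char *buf, size_t len, const char *reason)
{
    size_t used = strlen(buf);

    if (used < len)
        snprintf(buf + used, len - used, "%s\n", reason);
}

/* Run every check at checkpoint time. Returns the number of reasons
 * found (0 means checkpointable) and writes one reason per line to buf. */
int checkpointable_reasons(const struct task_model *t, char *buf, size_t len)
{
    int n = 0;

    assert(t != NULL && buf != NULL && len > 0);
    buf[0] = '\0';

    for (size_t i = 0; i < t->nres; i++) {
        const struct resource *r = &t->res[i];

        switch (r->kind) {
        case RES_MMAP_SHARED:
            add_reason(buf, len, "mmap MAP_SHARED file which is not yet supported");
            n++;
            break;
        case RES_TCP_SOCKET:
            add_reason(buf, len, "open TCP socket which is not yet supported");
            n++;
            break;
        case RES_EPOLL_FD:
            add_reason(buf, len, "open epoll fd which is not yet supported");
            n++;
            break;
        default:
            break;
        }
        if (r->foreign_mnt_ns) {
            add_reason(buf, len, "open file from another mounts namespace");
            n++;
        }
    }
    if (!t->frozen) {
        add_reason(buf, len, "TASK NOT FROZEN");
        n++;
    }
    return n;
}
```

The point of the single function is that the checkpoint path would
only look at the return value, while a /proc handler could print buf,
so the two can never disagree.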

You suggested on IRC having a fops->is_checkpointable()
fn, which is IMO a good idea to help implement the above.
The default can be a fn returning false. I suppose
we want to pass back a char * naming the file type as well.
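
Roughly what I have in mind for the hook, again as a userspace sketch
rather than a patch. struct file_ops_model and struct file_model are
stand-ins for file_operations and struct file, and the exact signature
(a bool return plus a const char ** out-parameter for the type name) is
only my guess at what we would want. A file type that doesn't set the
hook falls through to the default, which says "not checkpointable".

```c
/* Userspace model only. The struct names, the hook signature, and the
 * type strings are assumptions for illustration, not an existing API. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct file_model;

struct file_ops_model {
    /* Returns true if the file can be checkpointed. Always points
     * *type at a static string naming the file type. */
    bool (*is_checkpointable)(const struct file_model *f, const char **type);
};

struct file_model {
    const struct file_ops_model *ops;
};

/* Default used when a file type has not provided the hook. */
static bool default_is_checkpointable(const struct file_model *f,
                                      const char **type)
{
    (void)f;
    *type = "unknown";
    return false;
}

static bool regular_is_checkpointable(const struct file_model *f,
                                      const char **type)
{
    (void)f;
    *type = "regular file";
    return true;
}

static bool epoll_is_checkpointable(const struct file_model *f,
                                    const char **type)
{
    (void)f;
    *type = "epoll fd";
    return false;   /* not yet supported */
}

const struct file_ops_model regular_file_ops = {
    .is_checkpointable = regular_is_checkpointable,
};

const struct file_ops_model epoll_ops = {
    .is_checkpointable = epoll_is_checkpointable,
};

/* A file type that predates the hook and leaves it unset. */
const struct file_ops_model legacy_ops = {
    .is_checkpointable = NULL,
};

/* Dispatch to the per-type hook, falling back to the default. */
bool file_is_checkpointable(const struct file_model *f, const char **type)
{
    assert(f != NULL && type != NULL);

    if (f->ops && f->ops->is_checkpointable)
        return f->ops->is_checkpointable(f, type);
    return default_is_checkpointable(f, type);
}
```

With the default returning false, anything we haven't looked at yet is
treated as uncheckpointable until someone adds a hook for it, which is
the same BKL-style approach Dave describes above.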

-serge
