Re: [RFC v2][PATCH 2/9] General infrastructure for checkpoint restart

From: Oren Laadan
Date: Sun Aug 24 2008 - 02:03:21 EST




Louis Rilling wrote:
On Wed, Aug 20, 2008 at 11:04:13PM -0400, Oren Laadan wrote:
Add those interfaces, as well as helpers needed to easily manage the
file format. The code is roughly broken out as follows:

ckpt/sys.c - user/kernel data transfer, as well as setup of the
checkpoint/restart context (a per-checkpoint data structure for
housekeeping)

ckpt/checkpoint.c - output wrappers and basic checkpoint handling

ckpt/restart.c - input wrappers and basic restart handling

Patches to add the per-architecture support as well as the actual
work to do the memory checkpoint follow in subsequent patches.


[...]

diff --git a/checkpoint/sys.c b/checkpoint/sys.c
new file mode 100644
index 0000000..2891c48
--- /dev/null
+++ b/checkpoint/sys.c

[...]

+/*
+ * helpers to manage CR contexts: allocated for each checkpoint and/or
+ * restart operation, and persists until the operation is completed.
+ */
+
+static atomic_t cr_ctx_count; /* unique checkpoint identifier */

I thought we agreed that this counter should be per-container. Perhaps add a
TODO here?

True.


+
+void cr_ctx_free(struct cr_ctx *ctx)
+{
+
+ if (ctx->file)
+ fput(ctx->file);
+ if (ctx->vfsroot)
+ path_put(ctx->vfsroot);
+
+ free_pages((unsigned long) ctx->tbuf, CR_TBUF_ORDER);
+ free_pages((unsigned long) ctx->hbuf, CR_HBUF_ORDER);
+
+ kfree(ctx);
+}
+
+struct cr_ctx *cr_ctx_alloc(pid_t pid, struct file *file, unsigned long flags)
+{
+ struct cr_ctx *ctx;
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx)
+ return NULL;
+
+ ctx->tbuf = (void *) __get_free_pages(GFP_KERNEL, CR_TBUF_ORDER);
+ ctx->hbuf = (void *) __get_free_pages(GFP_KERNEL, CR_HBUF_ORDER);
+ if (!ctx->tbuf || !ctx->hbuf)
+ goto nomem;
+
+ ctx->pid = pid;
+ ctx->flags = flags;
+
+ ctx->file = file;
+ get_file(file);
+
+ /* assume checkpointer is in container's root vfs */

I'm a bit puzzled by this assumption. I would say: either this is a
self-checkpoint (only current process), or this is a container checkpoint. In
the latter case, I expect that in the general case the checkpointer lives
outside the container (and the interface of sys_checkpoint() below confirms
this), so it's root fs is probably not the container's one.

Does it differ from what you're planning?

You are correct. We lack infrastructure for what I'd call "container-object",
and the patchset does not yet tackle a container and multiple tasks, so this
is an interim solution. Will add a FIXME comment.

Thanks,

Oren.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/