Re: CLONE_PARENT after setns(CLONE_NEWPID)

From: Christian Seiler
Date: Wed Nov 06 2013 - 18:41:47 EST


Hi there,

> Having used bash as an init process I know it can handle unexpected
> children. However using CLONE_PARENT in this way still seems a little
> dodgy. Or am I misunderstanding why you are using CLONE_PARENT?

Since I (re)wrote that part of LXC, I should perhaps clarify how that is
used: In case of LXC, the grandparent is lxc-attach itself. The logic
goes as follows:

- user calls lxc-attach -n $container -- /bin/command/to/execute
- lxc-attach does a fork()
- child process does setns()
- child process does clone(CLONE_PARENT)
- child process exits
- new process is now in all of the correct namespaces
- new process does some IPC (socketpair() from before fork/clone) to
  tell original lxc-attach process to finish initialization
  (mainly: add new process to the proper cgroups)
- new process exec()s to /bin/command/to/execute
- original lxc-attach process waitpid()s for the attached process
  to exit

So the only process that needs to handle a new child is going to be
lxc-attach itself, but that is designed in such a way that it expects
the new child.

(The initial fork is necessary because once setns(userns, mntns) has
occurred, the cgroup tree may not be writable anymore (depending on
further circumstances), so it would be impossible to just do setns() and
then fork() if one then wants to add the new process to the proper cgroups.)
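
A minimal C sketch of the sequence above, for illustration only; it is
not the actual lxc-attach code. The name attach_sketch() and its ns_fd
parameter are hypothetical. Passing ns_fd < 0 skips setns() so the sketch
can run unprivileged. Having the intermediate child report the new
process's pid over the socketpair is an assumption of this sketch; the
cgroup step is only a comment. The raw clone() argument order assumed here
(flags first, then stack) holds on x86-64 and arm64 but not on every
architecture.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv via fork -> setns -> clone(CLONE_PARENT).
 * Returns the command's exit status, or -1 on error.
 * ns_fd < 0 skips setns() (assumption, so the sketch runs unprivileged). */
int attach_sketch(int ns_fd, char *const argv[])
{
    int sv[2], status;
    pid_t mid, attached = -1;
    char go = 'g';

    assert(argv != NULL && argv[0] != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    mid = fork();
    if (mid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (mid == 0) {
        long pid;

        close(sv[0]);
        if (ns_fd >= 0 && setns(ns_fd, 0) < 0)
            _exit(1);

        /* CLONE_PARENT: the new process becomes a sibling of this
         * intermediate child, i.e. a child of the original process. */
        pid = syscall(SYS_clone, CLONE_PARENT | SIGCHLD, NULL, NULL, NULL, NULL);
        if (pid < 0)
            _exit(1);
        if (pid > 0) {
            /* Intermediate child: report the new pid (as seen from the
             * original pid namespace), then exit. */
            pid_t p = (pid_t)pid;
            _exit(write(sv[1], &p, sizeof p) == (ssize_t)sizeof p ? 0 : 1);
        }

        /* New process: wait for the go-ahead, then exec the command. */
        if (read(sv[1], &go, 1) != 1)
            _exit(127);
        close(sv[1]);
        execvp(argv[0], argv);
        _exit(127);
    }

    /* Original process. */
    close(sv[1]);
    if (read(sv[0], &attached, sizeof attached) != (ssize_t)sizeof attached)
        attached = -1;

    /* A real tool would finish initialization here, e.g. add `attached`
     * to the proper cgroups, before letting it continue. */
    if (attached > 0 && write(sv[0], &go, 1) != 1) {
        /* The new process then sees EOF and exits with 127. */
    }
    close(sv[0]);

    waitpid(mid, NULL, 0);              /* reap the intermediate child */
    if (attached < 0 || waitpid(attached, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Note that the original process can waitpid() for the attached process
directly because CLONE_PARENT made it the parent; with a plain fork() in
the intermediate child it would be a grandchild instead.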

> That trick sounds like it might be worth adding to nsenter in util-linux
> just to simplify the code.

I think nsenter currently only does setns() and then fork(), which is
simpler than lxc-attach - mainly because there's no need to attach the
process to cgroups etc. lxc-attach's approach does not eliminate the
need for the original process wait()ing on the attached process;
CLONE_PARENT is really just used internally to simplify the process
hierarchy and the IPC required.

Also, re: the general point in this thread: I don't see how CLONE_PARENT
could be any more harmful when used after setns() than it might
already be without setns(). I could always write a program that
just does clone(CLONE_PARENT) (and nothing else) and then the calling
process would also get an unexpected child - I don't see how the pid
namespace status of that child would change anything here. So I'd
definitely be in favor of allowing CLONE_PARENT after setns().
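
To illustrate the "unexpected child" point without any setns() involved,
here is a small C sketch. The function name clone_parent_demo() is
hypothetical, and the raw clone() argument order is again assumed to be
that of x86-64/arm64. A forked child calls clone(CLONE_PARENT), and the
resulting process reports its own pid and its getppid() through a pipe.

```c
#define _GNU_SOURCE
#include <assert.h>   /* for callers that check the result with assert() */
#include <sched.h>
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the process created with
 * CLONE_PARENT reports the calling process as its parent, 0 if it
 * reports some other parent, and -1 on error. */
int clone_parent_demo(void)
{
    struct { pid_t self; pid_t parent; } msg = { -1, -1 };
    int fds[2];
    pid_t me = getpid();
    pid_t mid;
    ssize_t n;

    if (pipe(fds) < 0)
        return -1;

    mid = fork();
    if (mid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (mid == 0) {
        long pid;

        close(fds[0]);
        pid = syscall(SYS_clone, CLONE_PARENT | SIGCHLD, NULL, NULL, NULL, NULL);
        if (pid < 0)
            _exit(1);
        if (pid == 0) {
            /* New process: a sibling of the forked child. */
            msg.self = (pid_t)syscall(SYS_getpid);
            msg.parent = getppid();
            if (write(fds[1], &msg, sizeof msg) != (ssize_t)sizeof msg)
                _exit(1);
        }
        _exit(0);
    }

    close(fds[1]);
    n = read(fds[0], &msg, sizeof msg);
    close(fds[0]);

    waitpid(mid, NULL, 0);              /* the child we forked ourselves */
    if (n != (ssize_t)sizeof msg)
        return -1;
    waitpid(msg.self, NULL, 0);         /* the child we never forked */
    return msg.parent == me;
}
```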

Christian
