Re: CGroup Namespaces (v4)

From: Eric W. Biederman
Date: Mon Nov 16 2015 - 17:32:46 EST


"Serge E. Hallyn" <serge@xxxxxxxxxx> writes:

> On Mon, Nov 16, 2015 at 09:50:55PM +0100, Richard Weinberger wrote:
>> Am 16.11.2015 um 21:46 schrieb Serge E. Hallyn:
>> > On Mon, Nov 16, 2015 at 09:41:15PM +0100, Richard Weinberger wrote:
>> >> Serge,
>> >>
>> >> On Mon, Nov 16, 2015 at 8:51 PM, <serge@xxxxxxxxxx> wrote:
>> >>> To summarize the semantics:
>> >>>
>> >>> 1. CLONE_NEWCGROUP re-uses 0x02000000, which was previously CLONE_STOPPED
>> >>>
>> >>> 2. unsharing a cgroup namespace makes all your current cgroups your new
>> >>> cgroup root.
>> >>>
>> >>> 3. /proc/pid/cgroup always shows cgroup paths relative to the reader's
>> >>> cgroup namespace root. A task outside of your cgroup looks like
>> >>>
>> >>> 8:memory:/../../..
>> >>>
>> >>> 4. when a task mounts a cgroupfs, the cgroup which shows up as root depends
>> >>> on the mounting task's cgroup namespace.
>> >>>
>> >>> 5. setns to a cgroup namespace switches your cgroup namespace but not
>> >>> your cgroups.
>> >>>
>> >>> With this, using github.com/hallyn/lxc #2015-11-09/cgns (and
>> >>> github.com/hallyn/lxcfs #2015-11-10/cgns) we can start a container in a full
>> >>> proper cgroup namespace, avoiding either cgmanager or lxcfs cgroup bind mounts.
>> >>>
>> >>> This is completely backward compatible and will be completely invisible
>> >>> to any existing cgroup users (except for those running inside a cgroup
>> >>> namespace and looking at /proc/pid/cgroup of tasks outside their
>> >>> namespace.)
>> >>> cgroupns-root.
>> >>
>> >> IIRC one downside of this series was that only the new "sane" cgroup
>> >> layout was supported
>> >> and hence it was useless for everything which expected the default layout.
>> >> Hence, still no systemd for us. :)
>> >>
>> >> Is this now different?
>> >
>> > Yes, all hierarchies are no supported.
>> >
>>
>> Should read "now"? :-)
>> If so, *awesome*!
>
> D'oh! Yes, now :-)

I am glad to see multiple hierarchy support; that is something people
can use today.

A couple of quick questions before I delve into a review.

Does this allow mixing of cgroupfs and cgroupfs2? That is, can I
"mount -t cgroupfs" inside a container and "mount -t cgroupfs2" outside
a container, and still have reasonable things happen? I suspect the
semantics of cgroups prevent this, but I am interested to know what happens.

Similarly, have you considered what is required to be able to safely set
FS_USERNS_MOUNT?

Eric
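
For reference, the relative-path display described in point 3 of the
quoted summary can be illustrated with a small userspace sketch. This is
not the kernel implementation, only a model of the visible behaviour; the
function names and the example cgroup paths are hypothetical.

```python
import posixpath


def parse_cgroup_line(line):
    """Split one /proc/<pid>/cgroup line into (hierarchy id, controllers, path)."""
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    return int(hierarchy_id), controllers, path


def cgroup_path_seen_from(ns_root, task_cgroup):
    """Express task_cgroup relative to the reader's cgroup namespace root.

    Both arguments are absolute paths within a single hierarchy. The result
    keeps a leading '/', so the namespace root itself is shown as '/'.
    """
    rel = posixpath.relpath(task_cgroup, start=ns_root)
    return "/" if rel == "." else "/" + rel


if __name__ == "__main__":
    ns_root = "/lxc/c1/init"  # hypothetical cgroup namespace root

    # A task in the namespace root, one below it, and one outside it.
    for cgroup in ("/lxc/c1/init", "/lxc/c1/init/child", "/"):
        print("8:memory:" + cgroup_path_seen_from(ns_root, cgroup))

    print(parse_cgroup_line("8:memory:/../../.."))
```

With a namespace root three levels deep, a task sitting in the real root
of the hierarchy comes out as "8:memory:/../../..", matching the example
in the summary.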