Re: [PATCH v2] cgroup: allow management of subtrees by new cgroup namespaces

From: Aleksa Sarai
Date: Wed May 04 2016 - 05:50:00 EST


Perhaps what you should be arguing, then, is that the default
permissions of the cgroup directories need to be rwx for
everyone, and then your patch becomes unnecessary?

I don't think that would be the nicest way of dealing with this
(then a process can make very large numbers of cgroups all over
the tree, which might not cause huge issues but would still be a
pain for administrators and systemds alike).

Beware of what you cite as a problem. Any user can enter a user
namespace and then unshare a cgroup namespace. This means that
what you seem to want is equivalent to any user at all being able
to create a cgroup hierarchy.

They should only be allowed to make subtrees of the cgroup *they
currently reside in* IMO.

For the usual case that is the top level cgroup because most processes
don't get initially confined. If there is initial confinement by
something, then whatever it is could alter the permissions as well.

So if the default case is equivalent to making all the initial top
level cgroups rwx, we should understand the implications of that and
the best way to concentrate minds is to ask what happens if it were the
default.

A patchset I worked on (and then trashed) before writing this one would create a cgroup under your current cgroup, make you the owner of the new cgroup, and then move you into it (making it the root of the namespace). This would alleviate this particular issue, but it brings up many others: making sure there are no name clashes, and the fact that processes will start moving around in cgroups, which raises the question of whether userspace will be sufficiently alerted to the changes. In addition, the code was quite bad.

My ideal solution would be something like the above, because it means that there is no disagreement about who "owns" a particular node in the cgroup hierarchy. Then we don't even have to virtualise /sys/fs/cgroup because there can be a global agreement on who owns what.

The only issues I could think of were the name clashes and the fact that processes would now be moving between cgroups without anyone explicitly writing to cgroup.procs.
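
For concreteness, here is a rough userspace sketch in Python of the "create a child cgroup, own it, move into it" idea described above. It is not the trashed patchset, which did this in the kernel. The function name create_child_cgroup, its parameters, and the "-N" suffix scheme for dodging name clashes are all made up for illustration. It assumes root is a cgroup mount point and current is the caller's cgroup path relative to that mount.

```python
import os
import tempfile


def create_child_cgroup(root, current, base_name, pid, uid=None, gid=None):
    """Create a uniquely named child of `current` and move `pid` into it.

    Hypothetical helper: `root` is the cgroup mount point and `current`
    is the caller's cgroup path relative to it (e.g. "/user").
    """
    parent = os.path.join(root, current.lstrip("/"))
    suffix = 0
    while True:
        # Illustrative naming scheme: "name", "name-1", "name-2", ...
        name = base_name if suffix == 0 else f"{base_name}-{suffix}"
        child = os.path.join(parent, name)
        try:
            os.mkdir(child)  # fails atomically if the name is already taken
            break
        except FileExistsError:
            suffix += 1  # name clash: try the next candidate

    if uid is not None and gid is not None:
        os.chown(child, uid, gid)  # hand the new cgroup to the caller

    # Writing a PID to cgroup.procs is what moves the process on a real
    # cgroup filesystem; on a plain directory this just creates a file.
    with open(os.path.join(child, "cgroup.procs"), "w") as f:
        f.write(f"{pid}\n")
    return child
```

Note that the retry loop only papers over the name clash problem, and the write to cgroup.procs is exactly the implicit move that userspace might not expect.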

If we decide to implement both, we have to agree on the restrictions
*immediately* because the cgroup namespace was merged in 4.6-rc1 so
changing the restrictions on it in 4.7 would probably be frowned
upon.

No, that horse has left the stable: the cgroup namespace applies to
both v1 and v2.

I was referring to the "what restrictions should apply to cgroup.procs in a cgroup namespace" question, because if we don't agree on this before 4.7 we would break back-compat.

My thinking was that rename(2) would make this a simple decision, but
I just realised that rename(2) doesn't let you change the hierarchy.
But it should be noted that cgroupv2 has a fix for this: you can't
move a task to another cgroup unless you have attach rights
(cgroup.procs) to the common ancestor of the current cgroup and the
target cgroup.
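
To make the common ancestor rule concrete, here is a small Python model of it. This is only a sketch of the check as described above, not the kernel's implementation. The names common_ancestor and may_migrate are hypothetical, and the can_write_procs callback stands in for "the writer may write to this cgroup's cgroup.procs". The sketch also requires write access on the target itself, which is my understanding of the v2 rules and is an assumption here. Cgroup paths are absolute paths within one hierarchy.

```python
import posixpath


def common_ancestor(src, dst):
    """Return the deepest cgroup containing both `src` and `dst`.

    Both arguments must be absolute cgroup paths such as "/user/alice/a".
    """
    return posixpath.commonpath(
        [posixpath.normpath(src), posixpath.normpath(dst)]
    )


def may_migrate(src, dst, can_write_procs):
    """Model of the common-ancestor restriction on task migration.

    `can_write_procs(path)` is a caller-supplied predicate (an assumption
    of this sketch) reporting whether the writer has attach rights, i.e.
    write access to cgroup.procs, in the given cgroup.
    """
    ancestor = common_ancestor(src, dst)
    # Assumed: the target must be writable too, not just the ancestor.
    return can_write_procs(dst) and can_write_procs(ancestor)
```

For example, if a user has been delegated "/user/alice" and everything below it, a move from "/user/alice/a" to "/user/alice/b" passes (the common ancestor is "/user/alice"), while a move to "/user/bob" fails because the common ancestor is "/user".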

Currently the decision is made in cgroup_procs_write_permission() and
actually is blind to the user namespace, so this needs updating anyway.

Yeah, but we can't apply it (the common ancestor restriction) to cgroupv1 (back-compat). Maybe we could combine both updates as one "correcting the semantics" patch?

--
Aleksa Sarai
Software Engineer (Containers)
SUSE Linux GmbH
https://www.cyphar.com/