Re: [PATCH v1 3/3] cgroup: relax common ancestor restriction for direct descendants

From: Tejun Heo
Date: Thu Jul 21 2016 - 10:52:52 EST


Hello, Aleksa.

On Thu, Jul 21, 2016 at 05:49:36PM +1000, Aleksa Sarai wrote:
> > > The reason I'm doing this is so that we might be able to _practically_ use
> > > cgroups as an unprivileged user (something that will almost certainly be
> > > useful to not just the container crowd, but people also planning on using
> > > cgroups as advanced forms of rlimits).
> >
> > I don't get why we need this fragile dance with permissions at all
> > when the same functionality can be achieved by delegating explicitly.
>
> The key words being "unprivileged user". Currently, if I am a regular user
> on a system and I want to use the freezer cgroup to pause a process I am
> running, I have to *go to the administrator and ask them to give me
> permission to do that*. Why is that necessary? I find it quite troubling
> that the usecase of an ordinary user on a system trying to use something as
> useful as cgroups is considered to be "solved" by asking your administrator
> (or systemd) to do it for you. "Delegating explicitly" is punting on the
> problem, by saying "just get the administrator to do the setup for you".
> What if you don't have the opportunity to do that, and it takes you 4 weeks
> of sending emails for you to get the administrator to do _anything_?
>
> This is something I'm trying to fix with my recent work with rootless
> containers (and quite a few other people are trying to fix it too).
> Currently we just simply can't do certain operations as an unprivileged user
> that would be possible *if we could just use cgroups*. Things like the
> freezer cgroup would be invaluable for containers, and I guarantee that the
> Chromium and Firefox folks would find it useful to be able to limit browser
> processes in a similar way.

I understand what you're trying to achieve but don't think cgroup's
filesystem interface can accommodate that. To support that level of
automatic delegation, the API should be providing enough isolation so
that operations in one domain (user-specific operations) are
transparent to the other (system-wide administration), which simply
isn't true for cgroupfs. As a simple example, imagine a process being
moved to another cgroup racing against the special operations you're
describing. Both sides are multi-step operations, there is no way to
synchronize them against each other from the kernel side, and the
outcomes can easily be nonsensical.
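
To make the race concrete, here is a toy sketch in Python. It is an
in-memory model only, not kernel code and not the real cgroupfs API:
the membership dict, the cgroup paths, and the function names
(cgroup_of, move, user_move) are all hypothetical. It models one
multi-step user-side operation (check where the task is, then move it)
with an administrative migration landing between the two steps.

```python
def cgroup_of(membership, pid):
    """Return the (hypothetical) cgroup path that currently holds pid."""
    for path, pids in membership.items():
        if pid in pids:
            return path
    raise KeyError(pid)


def move(membership, pid, dst):
    """Migrate pid to dst; stands in for a write to dst's cgroup.procs."""
    membership[cgroup_of(membership, pid)].discard(pid)
    membership[dst].add(pid)


def user_move(membership, pid, expected_src, dst, interleave=None):
    """Two-step user-side operation: check the source, then migrate.

    `interleave` simulates another actor running between the two steps.
    """
    # Step 1: check that the task is where the user expects it to be.
    if cgroup_of(membership, pid) != expected_src:
        return False
    # Window: nothing prevents someone else from acting here.
    if interleave is not None:
        interleave()
    # Step 2: act on the now possibly stale result of step 1.
    move(membership, pid, dst)
    return True


if __name__ == "__main__":
    m = {"/user/a": {100}, "/user/b": set(), "/system": set()}
    # The admin moves pid 100 to /system in the middle of the user's operation.
    ok = user_move(m, 100, "/user/a", "/user/b",
                   interleave=lambda: move(m, 100, "/system"))
    print(ok, cgroup_of(m, 100))
```

In this model the user-side operation reports success and the
administrator's migration is silently undone, because the check in
step 1 no longer holds by the time step 2 runs.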

It is unfortunate but we started with and are bound to carry the
current VFS-based interface, which was never designed to support the
use cases you're describing in a seamless way. That's why cgroup
supports explicit delegation, so that userland can take over the
necessary coordination and implement more complex operations on top.
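
As a rough illustration of explicit delegation, a privileged manager
could hand a subtree to a user along these lines. This is a minimal
sketch, assuming the unified hierarchy convention of granting write
access to the cgroup directory and its "cgroup.procs" file. The helper
name delegate_cgroup and the example path are made up for
illustration, and a real manager would need to handle more than this.

```python
import os


def delegate_cgroup(cgroup_dir, uid, gid):
    """Give uid:gid ownership of a cgroup directory and its cgroup.procs.

    Hypothetical helper: assumes cgroup_dir already exists and that the
    caller has the privilege to change ownership of both paths.
    """
    targets = [cgroup_dir, os.path.join(cgroup_dir, "cgroup.procs")]
    for path in targets:
        # After this, the delegatee can create child cgroups under
        # cgroup_dir and migrate tasks within the delegated subtree.
        os.chown(path, uid, gid)
    return targets
```

A manager running as root might call something like
delegate_cgroup("/sys/fs/cgroup/user-1000", 1000, 1000), where the
path is only an example. The coordination then lives in one userland
agent instead of being inferred by the kernel from file permissions.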

Thanks.

--
tejun