Re: [RFC] [PATCH -mm] cgroup: uid-based rules to add processes efficiently in the right cgroup

From: Vivek Goyal
Date: Tue Aug 26 2008 - 12:10:00 EST


On Tue, Aug 26, 2008 at 08:05:12PM +0530, Balbir Singh wrote:
> Vivek Goyal wrote:
> > On Mon, Aug 25, 2008 at 05:54:39PM -0700, Paul Menage wrote:
> >> On Tue, Aug 19, 2008 at 5:57 AM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> >>> Same thing will happen if we implement the daemon in user space. A task
> >>> who does seteuid(), can be swept away to a different cgroup based on
> >>> rules specified in /etc/cgrules.conf.
> >> Yes, I'm not so keen on a daemon magically pulling things into a
> >> cgroup based on uid either, for the same reasons.
> >>
> >> But a user-space based solution can be much more flexible (e.g. easier
> >> to configure it to only move tasks from certain source cgroups).
> >>
> >>> What do you mean by risk? This is the policy set up by system admin and
> >>> behaviour would seem consistent as per the policy. If an admin decides
> >>> that tasks of user "apache" should run into /container/cpu/apache cgroup and
> >>> if a "root" task does seteuid(apache), then it makes sense to move the task
> >>> to /container/cpu/apache.
> >> The kind of unexpected behaviour I was imagining was when some other
> >> daemon (e.g. ftpd?) unexpectedly does a setuid to one of the
> >> magically-controlled users, and results in that daemon being pulled
> >> into the specified cgroup. For something like cpu maybe that's mostly
> >> benign (but what moves it back into its original group after it
> >> switches back to root?)
> >
> > Once ftpd does seteuid() or setreuid() again to switch effective user to
> > "root", it will be moved back to original group (root's group).
> >
> > So the basic question is: if a program temporarily changes its effective
> > user id to user B, should all resource consumption be charged to user B's
> > cgroup, or should it continue to be charged to the original cgroup?
> >
> > I would think that we should move the task temporarily to B's cgroup and
> > bring back again upon identity change.
> >
> > At the same time I can also understand that this behavior can probably
> > be considered over-intrusive and some people might want to avoid that.
> >
> > Two things come to my mind.
> >
> > - Users who find it too intrusive, can just shut down the rules based
> > daemon.
> >
>
> Yes, I would say administrators should do that. Classification via setuid(),
> does make a lot of sense, but at the same time it might be too aggressive if an
> application frequently uses setuid()
>

Just a minor clarification: right now all the classification is done
based on the effective uid and effective gid.
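
To make the effective-id point concrete, here is a rough sketch of how a
user space classifier could look up a destination cgroup. It reads the
effective ids from the "Uid:" and "Gid:" lines of /proc/<pid>/status
(fields are real, effective, saved, filesystem). The rule tables and the
"uid rules win over gid rules" ordering are assumptions made for
illustration only; they are not the actual /etc/cgrules.conf format or the
daemon's real matching logic.

```python
def effective_ids(status_text):
    """Return (euid, egid) parsed from the text of /proc/<pid>/status."""
    euid = egid = None
    for line in status_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # Uid:/Gid: lines list the real, effective, saved and filesystem ids.
        if fields[0] == "Uid:":
            euid = int(fields[2])
        elif fields[0] == "Gid:":
            egid = int(fields[2])
    return euid, egid


def classify(euid, egid, uid_rules, gid_rules, default=None):
    """Pick a destination cgroup for a task.

    uid_rules and gid_rules are hypothetical dicts mapping a numeric id to
    a cgroup path. Checking uid rules first is an assumption of this sketch.
    """
    if euid in uid_rules:
        return uid_rules[euid]
    if egid in gid_rules:
        return gid_rules[egid]
    return default
```

With a rule such as {48: "/container/cpu/apache"} (48 standing in for the
"apache" uid), a root-owned task whose effective uid becomes 48 would be
classified into /container/cpu/apache, and would fall back to the default
once its effective uid is 0 again.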

[..]
> >>> Exactly what kind of scenario do you have in mind when you want the policy
> >>> to be enforced selectively based on task (tid)?
> >> I was thinking of something like possibly a per-cgroup file (that also
> >> affected child cgroups) rather than a global file. That would also
> >> automatically handle multiple hierarchies.
> >>
> >
> > So there can be two kinds of controls.
> >
> > - Create a per cgroup file say "group_pinned", where if 1 is written to
> > "group_pinned" that means daemon will not move tasks from this cgroup upon
> > effective uid/gid changes.
> >
> > - Provide more fine grained control where task movement is not controlled
> > per cgroup, rather per thread id. In that case every cgroup will contain
> > another file "tasks_pinned" which will contain all the tids which cannot
> > be moved from this cgroup by daemon. By default this file will be empty
> > and all the tids are movable.
> >
> > I think initially we can keep things simple and implement "group_pinned"
> > which provides coarse control on the whole group and pins all the tasks
> > in that cgroup.
> >
>
> Hmm... I wonder if we are providing too many knobs. Can't we do something simpler?

I also fear that we are probably providing too many knobs. Until we get
a strong use case, I recommend keeping things simple: for the time being,
let us stick to a simple user space daemon that the user can turn on or
off depending on whether a cgroup change is wanted upon seteuid()
related events. No controls based on group_pinned or tasks_pinned
etc. It is all or none.
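
As a rough sketch of what "all or none" means for the daemon: on an
effective uid change it either moves the task according to the rules, or,
if it is switched off, does nothing at all. Moving a task is done by
writing its id to the "tasks" file of the destination cgroup directory.
The function names, the uid_rules dict and the "enabled" flag below are
hypothetical and only illustrate the idea.

```python
import os


def move_task(tid, cgroup_dir):
    """Attach one task to a cgroup by writing its id to the group's tasks file."""
    with open(os.path.join(cgroup_dir, "tasks"), "w") as f:
        f.write("%d\n" % tid)


def handle_id_change(tid, euid, uid_rules, enabled=True):
    """Return the cgroup the task was moved to, or None if nothing was done.

    uid_rules is a hypothetical dict mapping an effective uid to the
    directory of the destination cgroup. There is no per-cgroup or
    per-task pinning: the only switch is "enabled".
    """
    if not enabled:
        return None
    dest = uid_rules.get(euid)
    if dest is not None:
        move_task(tid, dest)
    return dest
```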

Thanks
Vivek