Re: [PATCH 0/8 v3] cgroups: Task counter subsystem (was: New maxnumber of tasks subsystem)

From: Kay Sievers
Date: Tue Aug 16 2011 - 12:02:14 EST


On Fri, Aug 12, 2011 at 23:11, Tim Hockin <thockin@xxxxxxxxxx> wrote:
> On Mon, Aug 1, 2011 at 4:19 PM, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
>> On Fri, 29 Jul 2011 18:13:22 +0200
>> Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>>
>>> Reminder:
>>>
>>> This patchset is aimed at reducing the impact of a forkbomb to a
>>> cgroup boundaries, thus minimizing the consequences of such an attack
>>> against the rest of the system.
>>>
>>> This can be useful when cgroups are used to stage some processes or run
>>> untrustees.
>>
>> Really?  How useful?  Why is it useful enough to justify adding code
>> such as this to the kernel?
>>
>> Is forkbomb-prevention the only use?  Others have proposed different
>> ways of preventing forkbombs which were independent of cgroups - is
>> this way better and if so, why?
>
> I certainly want this for exactly the proposed use - putting a bounds
> on threads/tasks per container.  It's rlimits but more useful.
>
> IMHO, most every limit that can be set at a system level should be
> settable at a cgroup level.  This is just one more isolation leak.

Such functionality in general sounds useful. System management tools
want to be able to stop a service race-free, where a 'service' means
'a group of processes and all the future processes it creates'.

A common problem here is the user sessions that a login creates. Some
systems require that, after the user logs out, all processes the user
has started are properly cleaned up. A common example of such
enforcement is servers at schools and universities that do not want to
allow users to leave things like file sharing programs running in the
background after they log out.

We currently do that in systemd by tracking these sessions in a cgroup
and killing all pids in that group. This currently requires some
cooperation from the services to be successful. If they fork faster
than we kill them, we will never be able to finish the task.
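
To illustrate the race, here is a minimal sketch of such a kill loop.
This is not systemd's actual code; the function names, the injectable
kill parameter and the max_passes cutoff are made up for the example,
and it assumes the group's members are listed one pid per line in a
cgroup.procs file inside the cgroup directory.

```python
import os
import signal


def read_pids(cgroup_dir):
    """Return the pids currently listed in the cgroup's cgroup.procs file."""
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        return [int(line) for line in f if line.strip()]


def kill_cgroup(cgroup_dir, kill=os.kill, max_passes=100):
    """Kill every pid in the cgroup, rescanning until the group is empty.

    Returns the number of scans it took to see an empty group, or None
    if the group was still populated after max_passes scans (its
    members forked faster than we could kill them).
    """
    for passes in range(1, max_passes + 1):
        pids = read_pids(cgroup_dir)
        if not pids:
            return passes
        for pid in pids:
            try:
                kill(pid, signal.SIGKILL)
            except ProcessLookupError:
                # The process exited between the read and the kill.
                pass
    return None
```

Nothing stops a process from forking between read_pids() and the
kill, so every scan can find new pids, and the loop is only guaranteed
to end if the group cooperates.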

Such user sessions generally run untrusted code and processes, and the
system management that cleans up after the end of the session runs
privileged. It would be nice to allow trusted code to kill all
remaining processes of such an untrusted session race-free. This is not
so much about fork-bombs; the processes might not even have bad
intentions. It would be more like an rlimit for a 'group of pids' that
allows race-free resource management of the services.

For the actual implementation, I think it would be nicer to have
such functionality at the core of cgroups, and not require a
specific controller to be set up. We already track every single
service in its own cgroup in a custom hierarchy. These groups just act
as the container for all the pids belonging to the service, so we can
track the service properly.

Looking at it naively as a user, we would like to be able to apply
these limits to every cgroup right away, without needing to create
another controller/subsystem/hierarchy.

Thanks,
Kay