Re: [RFC PATCH v2 11/17] cgroup: Implement new thread mode semantics

From: Waiman Long
Date: Thu Jun 01 2017 - 16:48:56 EST


On 06/01/2017 04:38 PM, Tejun Heo wrote:
> Hello,
>
> On Thu, Jun 01, 2017 at 03:27:35PM -0400, Waiman Long wrote:
>> As said in an earlier email, I agreed that masking it on the kernel side
>> may not be the best solution. I offer 2 other alternatives:
>> 1) Document how to work around the resource domains issue by proper
>> setup of the cgroup hierarchy.
> We can definitely improve documentation.
>
>> 2) Mark those controllers that require the no internal process
>> competition constraint and disallow internal process only when those
>> controllers are active.
> We *can* do that but wouldn't this be equivalent to enabling thread
> mode implicitly when only thread aware controllers are enabled?
>
>> I prefer the first alternative, but I can go with the second if necessary.
>>
>> The major rationale behind my enhanced thread mode patch was to allow
>> something like
>>
>>     R -- A -- B
>>      \
>>       T1 -- T2
>>
>> where you can have resource domain controllers enabled in the thread
>> root as well as some child cgroups of the thread root. As the no internal
>> process rule is currently not applicable to the thread root, this
>> creates the dilemma that we need to deal with internal process competition.
>>
>> The container invariant that PeterZ talked about will also be a serious
>> issue here as I don't think we are going to set up a container root
>> cgroup that will have no process allowed in it because it has some child
>> cgroups. IMHO, I don't think cgroup v2 will get wide adoption without
>> getting rid of that no internal process constraint.
> The only thing which is necessary from inside a container is putting
> the management processes into their own cgroups so that they can be
> controlled (ie. the same thing you did with your patch but doing that
> explicitly from userland) and userland management sw can do the same
> thing whether it's inside a container or on a bare system. BTW,
> systemd already does so and works completely fine in terms of
> containerization on cgroup2. It is arguable whether we should make
> this more convenient from kernel side but using cgroup2 for resource
> control already requires the userspace tools to be adapted to it, so
> I'm not sure how much benefit we'd gain from adding that compared to
> explicitly documenting it.

I think we are in agreement here. I think we should just document how
userland can work around the internal process competition issue by
setting up the cgroup hierarchy properly. Then we can remove the no
internal process constraint.
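
As a rough illustration of what such documentation could describe, here is
a minimal sketch of the userland side. It assumes the standard cgroup v2
interface files cgroup.procs and cgroup.subtree_control; the leaf name
"mgmt", the "memory" controller default, and both function names are
hypothetical and chosen only for this example. The idea is to move every
process out of the parent into a leaf cgroup first, and only then enable
controllers for the parent's children.

```python
import os


def plan_leaf_migration(parent, pids, leaf="mgmt", controllers=("memory",)):
    """Return (leaf_dir, writes): the ordered (file, value) writes that
    empty `parent` of processes and then enable controllers below it.

    `leaf` and `controllers` are illustrative defaults, not fixed names.
    """
    leaf_dir = os.path.join(parent, leaf)
    writes = []
    # Move each process into the leaf cgroup, one PID per write.
    for pid in pids:
        writes.append((os.path.join(leaf_dir, "cgroup.procs"), str(pid)))
    # With the parent emptied, enable the controllers for its children.
    enable = " ".join("+" + name for name in controllers)
    writes.append((os.path.join(parent, "cgroup.subtree_control"), enable))
    return leaf_dir, writes


def apply_plan(leaf_dir, writes):
    """Create the leaf cgroup and perform the planned writes in order."""
    os.makedirs(leaf_dir, exist_ok=True)
    for path, value in writes:
        with open(path, "w") as f:
            f.write(value)
```

A management tool would call plan_leaf_migration() with the PIDs it reads
from the parent's cgroup.procs and pass the result to apply_plan(). The
ordering is the important part: the controller write comes last, after the
parent no longer holds any processes.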

Cheers,
Longman