Re: [RFC PATCH v2 11/17] cgroup: Implement new thread mode semantics

From: Waiman Long
Date: Thu Jun 01 2017 - 15:55:49 EST


On 06/01/2017 02:44 PM, Waiman Long wrote:
> On 06/01/2017 11:10 AM, Peter Zijlstra wrote:
>> On Thu, Jun 01, 2017 at 10:50:42AM -0400, Tejun Heo wrote:
>>> Hello, Waiman.
>>>
>>> A short update. I tried making root special while keeping the
>>> existing threaded semantics but I didn't really like it because we
>>> have to couple controller enables/disables with threaded
>>> enables/disables. I'm now trying a simpler, albeit a bit more
>>> tedious, approach which should leave things mostly symmetrical. I'm
>>> hoping to be able to post mostly working patches this week.
>> I've not had time to look at any of this. But the question I'm most
>> curious about is how cgroup-v2 preserves the container invariant.
>>
>> That is, each container (namespace) should look like a 'real' machine.
>> So just like userns allows to have a uid-0 (aka root) for each container
>> and pidns allows a pid-1 for each container, cgroupns should provide a
>> root group for each container.
>>
>> And cgroup-v2 has this 'exception' (aka wart) for the root group which
>> needs to be replicated for each namespace.
> One of the changes that I proposed in my patches was to get rid of the
> no internal process constraint. I think that will solve a big part of
> the container invariant problem that we have with cgroup v2.
>
> Cheers,
> Longman

Another idea that I have to further address this container invariant
problem is to do a cgroup setup like

CP -- CR

CP - container parent, which belongs to the host
CR - container root

We can enable pass-through mode in the subtree_control file of CP to
force all of the controllers in CR into pass-through mode. In this case,
those controllers are not enabled in CR, just as they are not in the
root. However, the container can still enable them in its child cgroups,
just as the root cgroup can. By enabling those controllers at the CP
level, the host can control how much of each resource the container is
allowed to use without the container being aware that its resources are
being controlled, as all the control knobs will show up in CP, but not
in CR.
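
To make the setup above a bit more concrete, here is a toy Python model
of where the control knobs would show up. It is only a sketch of the
idea, not kernel code: the Cgroup class, the mode names, the "app" child
and the rule that the host root has every controller available are all
assumptions made up for illustration, and "memory" is just an example
controller.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical mode names, used only in this sketch.
ENABLED = "enabled"            # controller enabled for the children
PASS_THROUGH = "pass-through"  # proposed mode: available below, no knobs


@dataclass
class Cgroup:
    name: str
    parent: Optional["Cgroup"] = None
    # controller name -> mode, as set through this cgroup's subtree_control
    subtree_control: Dict[str, str] = field(default_factory=dict)

    def add_child(self, name: str) -> "Cgroup":
        return Cgroup(name, parent=self)

    def available(self, controller: str) -> bool:
        # Assumption: the host root has every controller available; any
        # other cgroup needs its parent to list the controller in either mode.
        if self.parent is None:
            return True
        return controller in self.parent.subtree_control

    def set_subtree_control(self, controller: str, mode: str) -> None:
        if not self.available(controller):
            raise ValueError(f"{controller} is not available in {self.name}")
        self.subtree_control[controller] = mode

    def knobs(self) -> List[str]:
        # Control files appear only when the parent enabled the controller
        # in the normal way; pass-through hides them, as does being the root.
        if self.parent is None:
            return []
        return sorted(c for c, m in self.parent.subtree_control.items()
                      if m == ENABLED)


# host root -- CP -- CR -- app
host = Cgroup("host-root")
cp = host.add_child("CP")
cr = cp.add_child("CR")
app = cr.add_child("app")

host.set_subtree_control("memory", ENABLED)     # host limits CP
cp.set_subtree_control("memory", PASS_THROUGH)  # CR gets no knobs
cr.set_subtree_control("memory", ENABLED)       # container limits its child

for cg in (cp, cr, app):
    print(cg.name, cg.knobs())
```

In this model CP carries the memory knobs, CR carries none (so it looks
like a root to the container), and the container can still enable the
controller for its own children.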

Cheers,
Longman