Re: [RFC PATCH 0/5] cgroup/cpuset: A new "isolcpus" partition

From: Waiman Long
Date: Wed Apr 12 2023 - 21:56:40 EST


On 4/12/23 21:17, Tejun Heo wrote:
> Hello, Waiman.
>
> On Wed, Apr 12, 2023 at 08:55:55PM -0400, Waiman Long wrote:
>>> Sounds a bit contrived. Does it need to be something defined in the root
>>> cgroup?
>> Yes, because we need to take away the isolated CPUs from the effective cpus
>> of the root cgroup. So it needs to start from the root. That is also why we
>> have the partition rule that the parent of a partition has to be a partition
>> root itself. With the new scheme, we don't need a special cgroup to hold the
> I'm following. The root is already a partition root and the cgroupfs control
> knobs are owned by the parent, so the root cgroup would own the first level
> cgroups' cpuset.cpus.reserve knobs. If the root cgroup wants to assign some
> CPUs exclusively to a first level cgroup, it can then set that cgroup's
> reserve knob accordingly (or maybe the better name is
> cpuset.cpus.exclusive), which will take those CPUs out of the root cgroup's
> partition and give them to the first level cgroup. The first level cgroup
> then is free to do whatever with those CPUs that now belong exclusively to
> the cgroup subtree.

I am OK with the cpuset.cpus.reserve name, but less so with cpuset.cpus.exclusive, as it can be confused with cgroup v1's cpuset.cpu_exclusive. Of course, I prefer the cpuset.cpus.isolated name a bit more. Once an isolated CPU gets used in an isolated partition, it is exclusive and can't be used in another isolated partition.

Since we will allow users to set cpuset.cpus.reserve to whatever value they want, the distribution of isolated CPUs is only valid if the CPUs are present in the parent's cpuset.cpus.reserve, and in that of every ancestor all the way up to the root. Checking that is a bit expensive, but it should be a relatively rare operation.
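
A rough sketch of that ancestor walk, written as a Python model rather than kernel code. The dict layout and the name reserve_is_valid are made up for illustration, and cpuset.cpus.reserve is still only a proposed file, so treat this as a statement of the rule and not of any implementation:

```python
# Hypothetical model: each cgroup path maps to a pair
#   (parent path or None for the root, set of CPUs in its cpuset.cpus.reserve)
def reserve_is_valid(cgroups, path, wanted):
    """Return True if every CPU in `wanted` is present in the reserve set
    of each ancestor of `path`, all the way up to the root."""
    parent = cgroups[path][0]
    while parent is not None:
        if not wanted <= cgroups[parent][1]:
            return False          # some ancestor does not reserve these CPUs
        parent = cgroups[parent][0]
    return True


# Users may write any value, so "/a/b" can list CPU 9 even though
# no ancestor reserves it; the walk is what rejects it.
cgroups = {
    "/":    (None, {2, 3, 4, 5}),
    "/a":   ("/",  {2, 3, 4}),
    "/a/b": ("/a", {2, 3, 9}),
}

print(reserve_is_valid(cgroups, "/a/b", {2, 3}))
print(reserve_is_valid(cgroups, "/a/b", {2, 3, 9}))
```

The cost is one set comparison per level of the hierarchy, which is why it only matters that the operation is rare.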


>> isolated CPUs. The new root cgroup file will be enough to inform the system
>> what CPUs will have to be isolated.
>>
>> My current thinking is that the root's "cpuset.cpus.isolated" will start
>> with whatever has been set in the "isolcpus" or "nohz_full" boot command
>> line and can be extended from there but not shrunk below that as there can
>> be additional isolation attributes with those isolated CPUs.
> I'm not sure we wanna tie with those automatically. I think it'd be
> more confusing than helpful.

Yes, I am fine with taking this off for now.

Cheers,
Longman