Re: [PATCH 2/5] cgroup/cpuset: Add new cpus.partition type with no load balancing

From: Peter Zijlstra
Date: Thu Jun 10 2021 - 15:40:16 EST


On Thu, Jun 10, 2021 at 03:16:29PM -0400, Waiman Long wrote:
> On 6/10/21 2:50 PM, Peter Zijlstra wrote:
> > On Thu, Jun 03, 2021 at 05:24:13PM -0400, Waiman Long wrote:
> > > Cpuset v1 uses the sched_load_balance control file to determine if load
> > > balancing should be enabled. Cpuset v2 gets rid of sched_load_balance
> > > as its use may require disabling load balancing at cgroup root.
> > >
> > > For workloads that require very low latency like DPDK, the latency
> > > jitters caused by periodic load balancing may exceed the desired
> > > latency limit.
> > >
> > > When cpuset v2 is in use, the only way to avoid this latency cost is to
> > > use the "isolcpus=" kernel boot option to isolate a set of CPUs. After
> > > the kernel boot, however, there is no way to add or remove CPUs from
> > > this isolated set. For workloads that are more dynamic in nature, that
> > > means users have to provision enough CPUs for the worst case situation
> > > resulting in excess idle CPUs.
> > >
> > > To address this issue for cpuset v2, a new cpuset.cpus.partition type
> > > "root-nolb" is added which allows the creation of a cpuset partition with
> > > no load balancing. This will allow system administrators to dynamically
> > > adjust the size of the no load balancing partition to the current need
> > > of the workload without rebooting the system.
> > I'm confused, why do you need this? Just create a partition for each cpu.
> >
> From a management point of view, it is more cumbersome to do one cpu per
> partition. I have suggested this idea of 1 cpu per partition to the
> container developers, but they don't seem to like it.

Oh, because it then creates a cgroup tree per CPU and you get to move
tasks between cgroups?

OK I suppose.
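
For readers following the thread, here is a rough sketch of what the "one partition per CPU" alternative discussed above could involve on cgroup v2, and why it ends up as one cgroup per CPU. It only builds the list of steps and does not touch the filesystem. The mount point /sys/fs/cgroup, the "isol-cpu" group name prefix, and the helper name are assumptions made for illustration. Whether the kernel accepts each write depends on the cpuset constraints in effect, for example the CPUs being exclusive to the group.

```python
from pathlib import PurePosixPath


def plan_per_cpu_partitions(cpus, cgroup_root="/sys/fs/cgroup", prefix="isol-cpu"):
    """Return (action, path, value) steps for one cpuset partition per CPU.

    Nothing is written here; the caller decides whether to apply the plan.
    The mount point and the group name prefix are illustrative assumptions.
    """
    root = PurePosixPath(cgroup_root)
    # The cpuset controller has to be enabled for children of the root cgroup.
    steps = [("write", str(root / "cgroup.subtree_control"), "+cpuset")]
    for cpu in sorted(set(cpus)):
        group = root / f"{prefix}{cpu}"
        steps.append(("mkdir", str(group), ""))
        # Give the group exactly one CPU ...
        steps.append(("write", str(group / "cpuset.cpus"), str(cpu)))
        # ... then make it a partition root, so the CPU gets its own
        # single-CPU scheduling domain with nothing to balance against.
        steps.append(("write", str(group / "cpuset.cpus.partition"), "root"))
    return steps


if __name__ == "__main__":
    for action, path, value in plan_per_cpu_partitions([2, 3]):
        print(action, path, value)
```

Each isolated CPU needs its own directory plus two writes, and a task has to be moved into the matching group to run on that CPU. That is the management overhead referred to above.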