Re: [RFC PATCH 6/6] cpuset: Add cpuset.isolation_mask file

From: Frederic Weisbecker
Date: Wed Jul 14 2021 - 19:13:43 EST


On Wed, Jul 14, 2021 at 06:52:43PM +0200, Peter Zijlstra wrote:
> On Wed, Jul 14, 2021 at 03:54:20PM +0200, Frederic Weisbecker wrote:
> > Add a new cpuset.isolation_mask file in order to be able to modify the
> > housekeeping cpumask for each individual isolation feature on runtime.
> > In the future this will include nohz_full, unbound timers,
> > unbound workqueues, unbound kthreads, managed irqs, etc...
> >
> > Start with supporting domain exclusion and CPUs passed through
> > "isolcpus=".
> >
> > The cpuset.isolation_mask defaults to 0. Setting it to 1 will exclude
> > the given cpuset from the domains (they will be attached to NULL domain).
> > As long as a CPU is part of any cpuset with cpuset.isolation_mask set to
> > 1, it will remain isolated even if it overlaps with another cpuset that
> > has cpuset.isolation_mask set to 0. The same applies to parent and
> > subdirectories.
> >
> > If a cpuset is a subset of "isolcpus=", it automatically maps it and
> > cpuset.isolation_mask will be set to 1. This subset is then cleared from
> > the initial "isolcpus=" mask. The user is then free to override
> > cpuset.isolation_mask to 0 in order to revert the effect of "isolcpus=".
> >
> > Here is an example of use where the CPU 7 has been isolated on boot and
> > get re-attached to domains later from cpuset:
> >
> > $ cat /proc/cmdline
> > isolcpus=7
> > $ cd /sys/fs/cgroup/cpuset
> > $ mkdir cpu7
> > $ cd cpu7
> > $ cat cpuset.cpus
> > 0-7
> > $ cat cpuset.isolation_mask
> > 0
> > $ ls /sys/kernel/debug/domains/cpu7 # empty because isolcpus=7
> > $ echo 7 > cpuset.cpus
> > $ cat cpuset.isolation_mask # isolcpus subset automatically mapped
> > 1
> > $ echo 0 > cpuset.isolation_mask
> > $ ls /sys/kernel/debug/domains/cpu7/
> > domain0 domain1
> >
>
> cpusets already has means to create partitions; why are you creating
> something else?

I was about to answer that the semantics of isolcpus, which reference
a NULL domain, are different from SD_LOAD_BALANCE implied by
cpuset.sched_load_balance. But then I realized that SD_LOAD_BALANCE has
been removed.

How is cpuset.sched_load_balance implemented then? Commit
e669ac8ab952df2f07dee1e1efbf40647d6de332 ("sched: Remove checks against
SD_LOAD_BALANCE") advertises that setting cpuset.sched_load_balance to 0
ends up creating a NULL domain, but that's not what I get. For example, if I
mount a single cpuset root (no other cpuset mountpoints):

$ mount -t cgroup none ./cpuset -o cpuset
$ cd cpuset
$ cat cpuset.cpus
0-7
$ cat cpuset.sched_load_balance
1
$ echo 0 > cpuset.sched_load_balance
$ ls /sys/kernel/debug/domains/cpu1/
domain0 domain1

I still get the domains on all CPUs...
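
To check every CPU at once rather than only cpu1, something like the
following could be used. It is only a sketch: count_domains is a
hypothetical helper name, and the default path is simply the debugfs
directory used in the commands above, which may live elsewhere on other
kernels (hence the optional argument).

```shell
#!/bin/sh
# Hypothetical helper: for each cpuN directory under a sched-domain
# debugfs tree, print how many domainN entries it contains.
# The default path is an assumption taken from the commands in this mail.
count_domains() {
    base="${1:-/sys/kernel/debug/domains}"
    for cpu_dir in "$base"/cpu*; do
        # Skip the unexpanded glob when no cpuN directory exists.
        [ -d "$cpu_dir" ] || continue
        n=0
        for d in "$cpu_dir"/domain*; do
            if [ -e "$d" ]; then
                n=$((n + 1))
            fi
        done
        echo "$(basename "$cpu_dir"): $n"
    done
}
```

Running count_domains with no argument after writing 0 to
cpuset.sched_load_balance should print a count of 0 for any CPU attached
to the NULL domain, and a non-zero count for CPUs that still have
domains.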