Re: Writing "+cpu" to cgroup.subtree_control file yields EINVAL

From: Michael Kerrisk (man-pages)
Date: Tue Dec 05 2017 - 02:45:37 EST


[dropping Lennart into CC]

Hello Tejun,

On 12/04/2017 10:47 PM, Tejun Heo wrote:
> Hello, Michael.
>
> On Mon, Dec 04, 2017 at 10:35:13PM +0100, Michael Kerrisk (man-pages) wrote:
>> I was trying to do some simple testing of the CPU controller
>> that is merged into 4.15, and ran immediately into some confusion.
>> In the root cgroup on a freshly booted 4.15-rc1, I try the following:
>>
>> # pwd
>> /sys/fs/cgroup/unified
>> # echo '+cpu' > cgroup.subtree_control
>> sh: echo: write error: Invalid argument
>>
>> What am I missing? I presume I'm missing something obvious, although
>> nothing jumped out at me as I read the cgroups-v2.txt file.
>
> Checking whether I messed up something really basic... hmmm doesn't
> seem that way. What do /sys/fs/cgroup/unified/cgroup.controllers and
> /proc/cgroups say?

Oh -- they're all sensible:

In the root cgroup:

# cat cgroup.controllers
cpu io memory pids

$ cat /proc/cgroups
#subsys_name hierarchy num_cgroups enabled
cpuset 0 142 1
cpu 0 142 1
cpuacct 0 142 1
blkio 0 142 1
memory 0 142 1
devices 0 142 1
freezer 0 142 1
net_cls 0 142 1
perf_event 0 142 1
net_prio 0 142 1
hugetlb 0 142 1
pids 0 142 1

But, through some trial and error and printk(), I worked out:

a) If I first move all tasks to the root cgroup, then I can
write '+cpu' to the cgroup.subtree_control file in the root
cgroup.

b) The reason for my initial problems was this test in
the kernel in cpu_cgroup_can_attach():

#ifdef CONFIG_RT_GROUP_SCHED
	if (!sched_rt_can_attach(css_tg(css), task))
		return -EINVAL;
#else
	/* We don't support RT-tasks being in separate groups */
	if (task->sched_class != &fair_sched_class)
		return -EINVAL;
#endif

I don't have CONFIG_RT_GROUP_SCHED, and the second 'if' was yielding
true because of some SCHED_RR processes that are in some of the nonroot
cgroups created by systemd, namely:

# ps ax -L -o 'pid tid cls rtprio comm'|grep RR
685 723 RR 99 rtkit-daemon
972 979 RR 5 alsa-sink-ALC26
972 982 RR 5 alsa-source-ALC
1594 1597 RR 5 alsa-sink-ALC26
1594 1600 RR 5 alsa-source-ALC

So, one solution is to move those processes to the root cgroup,
and then it's possible to write '+cpu' to cgroup.subtree_control.
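
For what it's worth, that workaround can be sketched roughly as below.
This is only an illustration, not something I'd claim is robust: the
helper names rt_pids and move_to_root are made up, CGROOT is assumed
to be the mount point shown by 'pwd' above, and treating the ps(1)
classes RR, FF and DLN as the "non-fair" ones is my reading of the
sched_class test.

```shell
#!/bin/sh
# Sketch only. CGROOT is an assumption (the v2 mount point used in this thread).
CGROOT=${CGROOT:-/sys/fs/cgroup/unified}

# Read "pid tid cls ..." lines (ps output) on stdin and print the unique
# PIDs that own at least one SCHED_RR, SCHED_FIFO or SCHED_DEADLINE thread.
# The ps header line is skipped implicitly: its third field is "CLS".
rt_pids() {
    awk '$3 == "RR" || $3 == "FF" || $3 == "DLN" { print $1 }' | sort -un
}

# Read PIDs on stdin and move each process to the root cgroup by writing
# its PID to cgroup.procs (this needs root on a real hierarchy).
move_to_root() {
    while read -r pid; do
        echo "$pid" > "$CGROOT/cgroup.procs" || echo "failed: $pid" >&2
    done
}

# Usage (as root):
#   ps ax -L -o 'pid tid cls rtprio comm' | rt_pids | move_to_root
#   echo '+cpu' > "$CGROOT/cgroup.subtree_control"
```

Writing a PID to cgroup.procs moves the whole process, so one write
per PID is enough even when only some of its threads are SCHED_RR.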

Is enabling CONFIG_RT_GROUP_SCHED also a solution? (I have
not had a chance to test that yet.)
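
To see which branch of that #ifdef a given kernel was built with, a
check along the following lines might do. It is a sketch only: the
function name rt_group_sched_state is made up, and the two config
locations in the usage comment are assumptions that depend on the
distribution and on CONFIG_IKCONFIG_PROC.

```shell
#!/bin/sh
# Read a kernel config on stdin and report whether
# CONFIG_RT_GROUP_SCHED is built in.
rt_group_sched_state() {
    if grep -q '^CONFIG_RT_GROUP_SCHED=y'; then
        echo enabled
    else
        echo disabled
    fi
}

# Usage (either location may be absent on a given system):
#   zcat /proc/config.gz | rt_group_sched_state
#   rt_group_sched_state < "/boot/config-$(uname -r)"
```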

Anyway, it seems like this should be documented somewhere in the
kernel Documentation files, since it may be that others will run
into this as well. I'm not quite sure what should be added to the
documentation. Do you have some idea?

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/