Re: [PATCH 1/2] sched/cgroup: move sched_online_group() back into css_online()

From: Konstantin Khlebnikov
Date: Thu Jan 26 2017 - 05:27:35 EST




On 26.01.2017 13:17, Peter Zijlstra wrote:
On Thu, Jan 26, 2017 at 12:41:41PM +0300, Konstantin Khlebnikov wrote:
Commit 2f5177f0fd7e ("sched/cgroup: Fix/cleanup cgroup teardown/init") moved
sched_online_group() from css_online() to css_alloc(). This exposes a half-baked
task group in the global lists before the generic cgroup state is initialized.

The LTP testcase (the third in cgroup_regression_test), written to test a
similar race in kernels 2.6.26-2.6.28, easily triggers this oops:


So nobody's run LTP against the kernel for almost a year?

Yep. Nobody runs LTP =)

CONFIG_RT_GROUP_SCHED must be y (which is almost impossible to use IRL; I have some out-of-tree patches for it).

Also, systemd by default binds the cpu and cpuacct cgroup controllers together, which breaks the testcase.



Here the task group is already linked into the global RCU-protected list
task_groups, but the pointer css->cgroup is still NULL.

This patch reverts that chunk and moves sched_online_group() back into css_online().

Maybe put a comment with it that explains why this is needed?

Something along the lines of this perhaps?


Actually, online is called before initialization is complete. See the second patch. =)
I don't know what to do about this. cgroups are messy, as always.

/*
 * Don't expose the cgroup until initialization of it is complete in the
 * cgroup core. Otherwise things like cgroup_path() will return NULL
 * pointers and the like.
 */

+static int cpu_cgroup_css_online(struct cgroup_subsys_state *css)
+{
+	struct task_group *tg = css_tg(css);
+	struct task_group *parent = css_tg(css->parent);
+
+	if (parent)
+		sched_online_group(tg, parent);
+	return 0;
+}
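
For anyone following along, here is a rough userspace model of the ordering problem described in the commit message. It is only a sketch: the struct layouts, publish(), count_half_baked() and simulate() are made up for illustration and are not kernel code, and the "reader" is simulated in a single thread rather than being a real concurrent RCU walker. It only shows that linking the group into the global list in the alloc phase leaves a window where css.cgroup is still NULL, while linking it in the online phase does not.

```c
#include <stddef.h>

/* Userspace model only: these types just borrow the kernel names. */
struct cgroup {
	const char *name;
};

struct css {
	struct cgroup *cgroup;		/* filled in by the modelled cgroup core */
};

struct task_group {
	struct css css;
	struct task_group *next;	/* link in the modelled global list */
};

/* Stand-in for the global task_groups list. */
static struct task_group *task_groups;

/* Hypothetical helper: make tg visible to list walkers. */
static void publish(struct task_group *tg)
{
	tg->next = task_groups;
	task_groups = tg;
}

/* Count visible groups whose css.cgroup pointer is still NULL. */
static int count_half_baked(void)
{
	int n = 0;

	for (struct task_group *tg = task_groups; tg; tg = tg->next)
		if (!tg->css.cgroup)
			n++;
	return n;
}

/*
 * Run one modelled group creation and return how many half-baked
 * groups a list walker would have observed along the way.
 */
static int simulate(int publish_in_alloc)
{
	static struct cgroup cg = { "child" };
	struct task_group tg = { { NULL }, NULL };
	int seen;

	task_groups = NULL;

	/* Modelled alloc phase. */
	if (publish_in_alloc)
		publish(&tg);

	/* A walker running in this window sees whatever is on the list. */
	seen = count_half_baked();

	/* Modelled cgroup core initialization. */
	tg.css.cgroup = &cg;

	/* Modelled online phase. */
	if (!publish_in_alloc)
		publish(&tg);

	seen += count_half_baked();

	task_groups = NULL;	/* tg is about to go out of scope */
	return seen;
}
```

In this model simulate(1) reports one half-baked group and simulate(0) reports none. It says nothing about whether the real css_online() runs late enough, which is the concern raised about the second patch.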