Re: regression 4.4: deadlock in with cgroup percpu_rwsem

From: Tejun Heo
Date: Wed Jan 20 2016 - 10:30:17 EST


Hello,

On Wed, Jan 20, 2016 at 11:47:58AM +0100, Peter Zijlstra wrote:
> TJ, is css_offline guaranteed to be called in hierarchical order? I

No, it isn't. The ancestors of a css are guaranteed to stay around
until css_free is called on the css and that's the only ordering
guarantee.

> got properly lost in the whole cgroup destroy code. There's endless
> workqueues and rcu callbacks there.

Yeah, it's hairy. I wondered about adding support for bouncing to a
workqueue in both percpu_ref and rcu, which would make things easier to
follow. Not sure how often this pattern happens though.

> So the current place in free_fair_sched_group() is far too late to be
> calling remove_entity_load_avg(). But I'm not sure where I should put
> it, it needs to be in a place where we know the group is going to die
> but its parent is guaranteed to still exist.
>
> Would offline be that place?

Hmmm... css_free would be that place with the following patch, which
keeps the parent pinned until after ss->css_free() has returned.

diff -u b/kernel/cgroup.c work/kernel/cgroup.c
--- b/kernel/cgroup.c
+++ work/kernel/cgroup.c
@@ -4725,14 +4725,14 @@

 	if (ss) {
 		/* css free path */
+		struct cgroup_subsys_state *parent = css->parent;
 		int id = css->id;

-		if (css->parent)
-			css_put(css->parent);
-
 		ss->css_free(css);
 		cgroup_idr_remove(&ss->css_idr, id);
 		cgroup_put(cgrp);
+		if (parent)
+			css_put(parent);
 	} else {
 		/* cgroup free path */
 		atomic_dec(&cgrp->root->nr_cgrps);
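
To illustrate why the order of the parent put matters, here is a minimal
userspace sketch, not kernel code. Everything in it (struct node,
node_put(), subsys_free(), the put_parent_first switch) is a hypothetical
stand-in invented for the illustration. It assumes the child holds the
last reference on its parent, and it only marks nodes dead instead of
freeing memory so that the "too late" case can be observed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a css; NOT the kernel's struct. */
struct node {
	struct node *parent;
	int refcnt;
	int alive;			/* 1 until released */
	int parent_alive_in_free;	/* recorded by subsys_free() */
};

/* 1 = put parent before the free callback (pre-patch order),
 * 0 = put parent after the free callback (patched order). */
static int put_parent_first;

static void node_release(struct node *n);

static void node_put(struct node *n)
{
	assert(n->refcnt > 0);
	if (--n->refcnt == 0)
		node_release(n);
}

/* Models ss->css_free(): a callback that wants to look at the parent. */
static void subsys_free(struct node *n)
{
	n->parent_alive_in_free = n->parent ? n->parent->alive : 1;
}

/* Models the css free path from the patch above. */
static void node_release(struct node *n)
{
	struct node *parent = n->parent;

	if (put_parent_first && parent)
		node_put(parent);	/* parent may die right here */

	subsys_free(n);
	n->alive = 0;

	if (!put_parent_first && parent)
		node_put(parent);	/* parent outlives the callback */
}

/* Returns whether the parent was still alive inside the child's
 * free callback, for the chosen ordering. */
static int parent_alive_during_child_free(int parent_first)
{
	struct node parent = { .parent = NULL, .refcnt = 1, .alive = 1 };
	struct node child = { .parent = &parent, .refcnt = 1, .alive = 1 };

	put_parent_first = parent_first;
	node_put(&child);	/* drops the last ref on the child */
	return child.parent_alive_in_free;
}
```

In this model parent_alive_during_child_free(1) returns 0, because the
parent is released before the child's callback runs, while
parent_alive_during_child_free(0) returns 1, which is the ordering the
patch is after.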


Thanks.

--
tejun