Re: [BUG] sched: leaf_cfs_rq_list use after free

From: Tejun Heo
Date: Fri Mar 11 2016 - 13:20:43 EST


Hello, Peter.

On Thu, Mar 10, 2016 at 01:54:17PM +0100, Peter Zijlstra wrote:
> > I've reproduced this on v4.4, but I've also managed to reproduce the bug
> > after cherry-picking the following patches
> > (all but one were marked for v4.4 stable):
> >
> > 6fe1f34 sched/cgroup: Fix cgroup entity load tracking tear-down
> > d6e022f workqueue: handle NUMA_NO_NODE for unbound pool_workqueue lookup
> > 041bd12 Revert "workqueue: make sure delayed work run in local cpu"
> > 8bb5ef7 cgroup: make sure a parent css isn't freed before its children
> > aa226ff cgroup: make sure a parent css isn't offlined before its children
> > e93ad19 cpuset: make mm migration asynchronous
>
> Hmm, that is most unfortunate indeed.
>
> Can you describe a reliable reproducer?
>
> So we only call list_add_leaf_cfs_rq() through enqueue_task_fair(),
> which means someone is still running inside that cgroup.
>
> TJ, I thought we only call offline when the cgroup is empty, don't we?

Yeap, populated csses shouldn't be offlined.

Thanks.

--
tejun