Re: [BUG] cgroup/workques/fork: deadlock when moving cgroups

From: Tejun Heo
Date: Thu Apr 14 2016 - 11:32:36 EST


Hello,

On Thu, Apr 14, 2016 at 09:06:23AM +0200, Michal Hocko wrote:
> On Wed 13-04-16 21:48:20, Michal Hocko wrote:
> [...]
> > I was thinking about something like flush_per_cpu_work() which would
> > assert on cgroup_threadgroup_rwsem held for write.
>
> I have thought about this some more and I guess this is not limited to
> per-cpu workers. Basically any flush_work with cgroup_threadgroup_rwsem
> held for write is dangerous, right?

Whether the workqueue is per-cpu or not doesn't matter. What matters
is whether it has WQ_MEM_RECLAIM set. That said, I think what we want
to do is avoid performing heavy operations in the migration path. It's
where the core and all controllers have to synchronize, so performing
operations with many external dependencies there is bound to get
messy. I wonder whether memory charge moving can be restructured in a
similar fashion to how cpuset node migration was made async. However,
given that charge moving has always been a best-effort thing, for now,
I think it'd be best to drop lru_add_drain.
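
The WQ_MEM_RECLAIM distinction above can be sketched as a toy userspace
model. This is not kernel code. It assumes the usual reading of the
problem: a workqueue without WQ_MEM_RECLAIM has no pre-created rescuer,
so when no worker is idle the pool has to fork a new one, and fork
cannot proceed while the threadgroup rwsem is held for write. All names
below (model_wq, model_flush_work, MODEL_WQ_MEM_RECLAIM) are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's WQ_MEM_RECLAIM flag. The value
 * is local to this model and not taken from the kernel headers. */
#define MODEL_WQ_MEM_RECLAIM (1u << 0)

enum flush_outcome {
	FLUSH_COMPLETES,	/* an idle worker or the rescuer runs the item */
	FLUSH_MAY_DEADLOCK	/* a worker must be forked while fork is blocked */
};

/* Hypothetical, heavily simplified view of a workqueue and its pool. */
struct model_wq {
	unsigned int flags;	/* MODEL_WQ_MEM_RECLAIM or 0 */
	int idle_workers;	/* workers available without forking */
};

/*
 * Model of flushing a work item on @wq while the caller may hold the
 * threadgroup rwsem for write. Assumption: creating a new worker needs
 * fork, and fork needs the same rwsem for read.
 */
enum flush_outcome model_flush_work(const struct model_wq *wq,
				    bool threadgroup_rwsem_write_held)
{
	assert(wq->idle_workers >= 0);

	/* An idle worker can pick the item up without forking. */
	if (wq->idle_workers > 0)
		return FLUSH_COMPLETES;

	/* A rescuer exists up front, so no fork is needed either. */
	if (wq->flags & MODEL_WQ_MEM_RECLAIM)
		return FLUSH_COMPLETES;

	/* No idle worker, no rescuer: progress depends on fork. */
	return threadgroup_rwsem_write_held ? FLUSH_MAY_DEADLOCK
					    : FLUSH_COMPLETES;
}
```

In this model the per-cpu versus unbound question never appears. Only
the flag and the availability of a worker decide the outcome, and the
idle-worker case is a matter of luck rather than a guarantee.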

Thanks.

--
tejun