Re: [PATCH 3/3] cpuset: fix possible deadlock inasync_rebuild_sched_domains

From: Ingo Molnar
Date: Sun Jan 18 2009 - 04:07:23 EST



* Lai Jiangshan <laijs@xxxxxxxxxxxxxx> wrote:

> Lockdep reported some possible circular locking info when we tested cpuset on
> NUMA/fake NUMA box.
>
> =======================================================
> [ INFO: possible circular locking dependency detected ]
> 2.6.29-rc1-00224-ga652504 #111
> -------------------------------------------------------
> bash/2968 is trying to acquire lock:
> (events){--..}, at: [<ffffffff8024c8cd>] flush_work+0x24/0xd8
>
> but task is already holding lock:
> (cgroup_mutex){--..}, at: [<ffffffff8026ad1e>] cgroup_lock_live_group+0x12/0x29
>
> which lock already depends on the new lock.
> ......
> -------------------------------------------------------
>
> Steps to reproduce:
> # mkdir /dev/cpuset
> # mount -t cpuset xxx /dev/cpuset
> # mkdir /dev/cpuset/0
> # echo 0 > /dev/cpuset/0/cpus
> # echo 0 > /dev/cpuset/0/mems
> # echo 1 > /dev/cpuset/0/memory_migrate
> # cat /dev/zero > /dev/null &
> # echo $! > /dev/cpuset/0/tasks
>
> This is because async_rebuild_sched_domains has the following lock sequence:
> run_workqueue(async_rebuild_sched_domains)
> -> do_rebuild_sched_domains -> cgroup_lock
>
> But, attaching tasks when memory_migrate is set has following:
> cgroup_lock_live_group(cgroup_tasks_write)
> -> do_migrate_pages -> flush_work
>
> This can be fixed by using a separate workqueue thread.
>
> But queuing a work to an other thread is adding some overhead for cpuset.

Can you measure any overhead from that? In any case, this is triggered on
admin activities (when reconfiguring cpusets), so it's a slowpath and thus
using existing infrastructure is preferred in the 99.9% of the cases.

Thanks,

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/