Lockdep and rw_semaphores

From: Vladislav Bolkhovitin
Date: Sat Sep 10 2011 - 21:32:06 EST


Hello,

It looks like lockdep is somehow over-restrictive for rw_semaphores when they are
taken for read (down_read()): it requires them to follow the same inner/outer
ordering rules as plain locks.

For instance, code like:

DECLARE_RWSEM(QQ_sem);
DECLARE_RWSEM(QQ1_sem);

thread1:

down_read(&QQ_sem);
down_read(&QQ1_sem);

msleep(60000);

up_read(&QQ1_sem);
up_read(&QQ_sem);

thread2:

down_read(&QQ1_sem);
down_read(&QQ_sem);

fn();

up_read(&QQ_sem);
up_read(&QQ1_sem);
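
For completeness, the pattern above can be wrapped into a small test module along
these lines. This is only a sketch: the thread names match the report, but the module
scaffolding, the kthread_run() calls and the one-second delay are my own additions,
and CONFIG_PROVE_LOCKING must be enabled for the warning to be reported.

/*
 * Sketch of a reproducer module; only the locking pattern is taken from
 * the report, the rest of the scaffolding is assumed.
 */
#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/delay.h>
#include <linux/rwsem.h>
#include <linux/err.h>

static DECLARE_RWSEM(QQ_sem);
static DECLARE_RWSEM(QQ1_sem);

static int thread1(void *data)
{
	/* Take both semaphores for read in QQ_sem -> QQ1_sem order */
	down_read(&QQ_sem);
	down_read(&QQ1_sem);

	msleep(60000);

	up_read(&QQ1_sem);
	up_read(&QQ_sem);
	return 0;
}

static int thread2(void *data)
{
	/* Take the same semaphores for read in the opposite order */
	down_read(&QQ1_sem);
	down_read(&QQ_sem);

	up_read(&QQ_sem);
	up_read(&QQ1_sem);
	return 0;
}

static int __init rwsem_test_init(void)
{
	struct task_struct *t;

	t = kthread_run(thread1, NULL, "thread1");
	if (IS_ERR(t))
		return PTR_ERR(t);

	/* Give thread1 time to record the QQ_sem -> QQ1_sem dependency */
	msleep(1000);

	t = kthread_run(thread2, NULL, "thread2");
	if (IS_ERR(t))
		return PTR_ERR(t);

	return 0;
}

static void __exit rwsem_test_exit(void)
{
	/* For simplicity, don't unload until both threads have exited */
}

module_init(rwsem_test_init);
module_exit(rwsem_test_exit);
MODULE_LICENSE("GPL");

Loading such a module is enough: thread1's acquisitions make lockdep record the
QQ_sem -> QQ1_sem dependency, and it then complains as soon as thread2 takes the
two semaphores in the opposite order.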

will trigger a "possible circular locking dependency detected" warning, like:

=======================================================
[ INFO: possible circular locking dependency detected ]
3.0.0 #20
-------------------------------------------------------
thread2/5290 is trying to acquire lock:
(QQ_sem){.+.+..}, at: [<ffffffffa04453a5>] thread2+0x135/0x220

but task is already holding lock:
(QQ1_sem){.+.+..}, at: [<ffffffffa0445399>] thread2+0x129/0x220

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (QQ1_sem){.+.+..}:
[<ffffffff81080fb1>] validate_chain+0x6a1/0x7a0
[<ffffffff8108139b>] __lock_acquire+0x2eb/0x4c0
[<ffffffff81081c07>] lock_acquire+0x97/0x140
[<ffffffff8141871c>] down_read+0x4c/0xa0
[<ffffffffa0418b3e>] thread1+0x4e/0xb0
[<ffffffff81068976>] kthread+0xa6/0xb0
[<ffffffff81422254>] kernel_thread_helper+0x4/0x10

-> #0 (QQ_sem){.+.+..}:
[<ffffffff810808e7>] check_prev_add+0x517/0x540
[<ffffffff81080fb1>] validate_chain+0x6a1/0x7a0
[<ffffffff8108139b>] __lock_acquire+0x2eb/0x4c0
[<ffffffff81081c07>] lock_acquire+0x97/0x140
[<ffffffff8141871c>] down_read+0x4c/0xa0
[<ffffffffa04455c7>] thread2+0x137/0x2d0
[<ffffffff81068976>] kthread+0xa6/0xb0
[<ffffffff81422254>] kernel_thread_helper+0x4/0x10

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(QQ1_sem);
                               lock(QQ_sem);
                               lock(QQ1_sem);
  lock(QQ_sem);

*** DEADLOCK ***

1 lock held by thread2/5290:
#0: (QQ1_sem){.+.+..}, at: [<ffffffffa0445399>] thread2+0x129/0x220
stack backtrace:
Pid: 5290, comm: thread2 Not tainted 3.0.0 #20
Call Trace:
[<ffffffff8107e9b7>] print_circular_bug+0x107/0x110
...

Is this by design, or is it just something that was overlooked? I don't see how a
reverse order of down_read() calls can lead to any deadlock. Or am I missing something?

Thanks,
Vlad