Re: 3.1.0-rc3 -- INFO: possible circular locking dependency detected

From: Josh Boyer
Date: Tue Aug 23 2011 - 07:59:27 EST

Next message: Sergey Senozhatsky: "khugepaged: inconsistent lock {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W}usage"
Previous message: Rob Landley: "Re: m68k with mmu doesn't compile after 66d857b08b8c3ed"
In reply to: Dave Chinner: "Re: 3.1.0-rc3 -- INFO: possible circular locking dependency detected"
Next in thread: Dave Chinner: "Re: 3.1.0-rc3 -- INFO: possible circular locking dependency detected"

On Tue, Aug 23, 2011 at 7:49 AM, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
> On Tue, Aug 23, 2011 at 07:35:00AM -0400, Josh Boyer wrote:
>> On Tue, Aug 23, 2011 at 12:39 AM, Miles Lane <miles.lane@xxxxxxxxx> wrote:
>> > [ INFO: possible circular locking dependency detected ]
>> > 3.1.0-rc3 #2
>> > -------------------------------------------------------
>> > dconf-service/1836 is trying to acquire lock:
>> > (&sb->s_type->i_mutex_key#12){+.+.+.}, at: [<ffffffff8116df1a>] ext4_evict_inode+0x88/0x32b
>> >
>> > but task is already holding lock:
>> > (&mm->mmap_sem){++++++}, at: [<ffffffff810d4393>] sys_munmap+0x36/0x5b
>> >
>> > which lock already depends on the new lock.
>> >
>> > the existing dependency chain (in reverse order) is:
>> >
>> > -> #1 (&mm->mmap_sem){++++++}:
>> > [<ffffffff8106933a>] lock_acquire+0x129/0x14e
>> > [<ffffffff810cddbd>] might_fault+0x68/0x8b
>> > [<ffffffff810fcf5e>] filldir+0x6a/0xc2
>> > [<ffffffff811651a1>] call_filldir+0x91/0xb8
>> > [<ffffffff811654bf>] ext4_readdir+0x1af/0x510
>> > [<ffffffff810fd1a4>] vfs_readdir+0x76/0xac
>> > [<ffffffff810fd2b6>] sys_getdents+0x79/0xc9
>> > [<ffffffff814162fb>] system_call_fastpath+0x16/0x1b
>> >
>> > -> #0 (&sb->s_type->i_mutex_key#12){+.+.+.}:
>> > [<ffffffff81068b10>] __lock_acquire+0xa5e/0xd52
>> > [<ffffffff8106933a>] lock_acquire+0x129/0x14e
>> > [<ffffffff8140f1a2>] __mutex_lock_common+0x64/0x413
>> > [<ffffffff8140f5b0>] mutex_lock_nested+0x16/0x18
>> > [<ffffffff8116df1a>] ext4_evict_inode+0x88/0x32b
>> > [<ffffffff81102d8a>] evict+0x94/0x14e
>> > [<ffffffff81102fd0>] iput+0x18c/0x195
>> > [<ffffffff810ffdd4>] dentry_kill+0x11e/0x140
>> > [<ffffffff8110019b>] dput+0xd4/0xe4
>> > [<ffffffff810efac3>] fput+0x1a5/0x1bd
>> > [<ffffffff810d3214>] remove_vma+0x37/0x5f
>> > [<ffffffff810d4239>] do_munmap+0x2ed/0x306
>> > [<ffffffff810d43a1>] sys_munmap+0x44/0x5b
>> > [<ffffffff814162fb>] system_call_fastpath+0x16/0x1b
>> >
>> > other info that might help us debug this:
>> >
>> > Possible unsafe locking scenario:
>> >
>> >        CPU0                    CPU1
>> >        ----                    ----
>> >   lock(&mm->mmap_sem);
>> >                                lock(&sb->s_type->i_mutex_key);
>> >                                lock(&mm->mmap_sem);
>> >   lock(&sb->s_type->i_mutex_key);
>> >
>> > *** DEADLOCK ***
>>
>> This one was reported yesterday: https://lkml.org/lkml/2011/8/21/163
>> and we're hoping Ted (or someone else from the ext4 camp) can comment
>> on why ext4_evict_inode is holding i_mutex.
>
> Actually, the problem has nothing to do with ext4. The problem is
> that remove_vma() is holding the mmap_sem while calling fput(). The
> correct locking order is i_mutex->mmap_sem, as documented in
> mm/filemap.c:
>
> *  ->i_mutex              (generic_file_buffered_write)
> *    ->mmap_sem           (fault_in_pages_readable->do_page_fault)
>
>
> The way remove_vma() calls fput() also triggers lockdep reports in
> XFS and it will do so with any filesystem that takes an inode
> specific lock in its evict() processing. IOWs, remove_vma() needs
> fixing, not ext4....

Er... ok. So the remove_vma code hasn't changed since 2008. Are we
only seeing this issue now because the debugging code has improved,
or is something else going on?

At any rate, is the proposed solution to make remove_vma() drop mmap_sem
before calling fput(), to make it not call fput() at all, or something
else?
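
To make the first option concrete, here is a rough userspace sketch of
what "drop mmap_sem before calling fput" could look like. It is not
kernel code and not a proposed patch: pthread mutexes stand in for
mmap_sem and i_mutex, and fake_vma, fake_fput and both unmap_* helpers
are hypothetical names invented for illustration. The bookkeeping is
single-threaded and only counts how often the i_mutex stand-in is taken
while the mmap_sem stand-in is held.

```c
/* Userspace sketch only: pthread mutexes stand in for the kernel locks.
 * Every name here is hypothetical; this is not the real mm/ or fs/ code. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t mmap_sem = PTHREAD_MUTEX_INITIALIZER; /* stand-in for mm->mmap_sem */
static pthread_mutex_t i_mutex  = PTHREAD_MUTEX_INITIALIZER; /* stand-in for inode->i_mutex */

static int mmap_sem_held;  /* single-threaded bookkeeping: 1 while mmap_sem is held */
static int inversions;     /* times i_mutex was taken under mmap_sem */

struct fake_vma {
    struct fake_vma *next;
    int has_file;          /* nonzero if the mapping pins a file */
};

/* Stand-in for fput() -> ... -> evict(): the final release takes i_mutex. */
static void fake_fput(void)
{
    if (mmap_sem_held)
        inversions++;      /* mmap_sem -> i_mutex, the order flagged in the report */
    pthread_mutex_lock(&i_mutex);
    pthread_mutex_unlock(&i_mutex);
}

/* Order seen in the report: files are released while mmap_sem is held. */
static void unmap_release_under_lock(struct fake_vma *list)
{
    pthread_mutex_lock(&mmap_sem);
    mmap_sem_held = 1;
    for (struct fake_vma *v = list; v; v = v->next)
        if (v->has_file)
            fake_fput();
    mmap_sem_held = 0;
    pthread_mutex_unlock(&mmap_sem);
}

/* One possible alternative: count the releases under mmap_sem,
 * then perform them only after mmap_sem has been dropped. */
static void unmap_release_deferred(struct fake_vma *list)
{
    int pending = 0;

    pthread_mutex_lock(&mmap_sem);
    mmap_sem_held = 1;
    for (struct fake_vma *v = list; v; v = v->next)
        if (v->has_file)
            pending++;
    mmap_sem_held = 0;
    pthread_mutex_unlock(&mmap_sem);

    assert(!mmap_sem_held); /* i_mutex is never taken under mmap_sem here */
    while (pending-- > 0)
        fake_fput();
}
```

Running both helpers over the same list, the first one bumps the
inversions counter once per file-backed entry and the deferred one
leaves it untouched. Whether deferring the fput is actually safe in
the real munmap path is exactly the open question.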

josh
