Re: NFS lockdep lock misordering mmap_sem<->i_mutex_key with 2.6.32-git1

From: Andi Kleen
Date: Tue Dec 15 2009 - 18:38:57 EST


On Tue, Dec 15, 2009 at 10:21:34PM +0000, Al Viro wrote:
> On Mon, Dec 07, 2009 at 02:20:09PM +0100, Andi Kleen wrote:
> > > nfs_readdir
> > > nfs_do_filldir
> > > filldir
> > > copy_to_user
> > > [page_fault] [grab mmap_sem]
> > >
> > > sys_mmap [grab mmap_sem]
> > > do_mmap_pgoff
> > > mmap_region
> > > nfs_file_mmap
> > > nfs_revalidate_mapping
> > > nfs_invalidate_mapping [grab i_mutex]
> > >
> > > I guess recent lockdep improvement find old bug.
> >
> > Thanks for the analysis.
> >
> > I guess should never do copy_*_user while holding i_mutex? There might
> > be lots of cases like that.
>
> No. mmap_sem inside i_mutex is the normal order; NFS mmap is doing the
> wrong thing here. Note that readdir() vs. NFS (file-only, thankfully ;-)
> mmap() is a non-issue; NFS mmap() vs. write() is much more interesting.

I see.

>
> Again, a lot of mm/* code expects i_mutex, then mmap_sem order. It's not
> just readdir().

I suppose an easy workaround would be to not revalidate in mmap,
because open should have already done that?
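
To make the inversion concrete, here is a toy user-space sketch of the kind of order tracking lockdep does. It is not kernel code and not how lockdep is implemented; the enum, the table, and the helper names are all made up for illustration. It only records "inner taken while outer held" and complains when the reverse pair was seen earlier.

```c
#include <stdbool.h>

/* Hypothetical lock classes, named after the two locks in this thread. */
enum lock_class { I_MUTEX, MMAP_SEM, NR_CLASSES };

/* seen[a][b] is true once some path has taken b while holding a. */
static bool seen[NR_CLASSES][NR_CLASSES];

/*
 * Record "inner taken while outer is held".
 * Returns false if the reverse order was already recorded,
 * i.e. the two paths together form an AB-BA inversion.
 */
static bool record_order(enum lock_class outer, enum lock_class inner)
{
	if (seen[inner][outer])
		return false;
	seen[outer][inner] = true;
	return true;
}

/* readdir path: i_mutex held, copy_to_user faults and takes mmap_sem. */
static bool readdir_path(void)
{
	return record_order(I_MUTEX, MMAP_SEM);
}

/* mmap path with revalidation: mmap_sem held, then i_mutex taken. */
static bool mmap_path_revalidating(void)
{
	return record_order(MMAP_SEM, I_MUTEX);
}
```

Calling readdir_path() first and mmap_path_revalidating() second makes the second call report the inversion. With the revalidation dropped from mmap, the second ordering is never recorded.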

Very lightly tested RFC patch attached.

-Andi

---

NFS: don't revalidate in mmap

nfs_revalidate_mapping takes i_mutex, but mmap already holds mmap_sem,
and taking i_mutex inside mmap_sem is not allowed by the VFS.

So don't revalidate at mmap time and trust that it has already been done.

Signed-off-by: Andi Kleen <ak@xxxxxxxxxxxxxxx>

---
 fs/nfs/file.c |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

Index: linux-2.6.32-ak/fs/nfs/file.c
===================================================================
--- linux-2.6.32-ak.orig/fs/nfs/file.c
+++ linux-2.6.32-ak/fs/nfs/file.c
@@ -297,14 +297,9 @@ nfs_file_mmap(struct file * file, struct
 	dprintk("NFS: mmap(%s/%s)\n",
 		dentry->d_parent->d_name.name, dentry->d_name.name);
 
-	/* Note: generic_file_mmap() returns ENOSYS on nommu systems
-	 * so we call that before revalidating the mapping
-	 */
 	status = generic_file_mmap(file, vma);
-	if (!status) {
+	if (!status)
 		vma->vm_ops = &nfs_file_vm_ops;
-		status = nfs_revalidate_mapping(inode, file->f_mapping);
-	}
 	return status;
 }

--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.