Re: [PATCH/RFC 00/19] Support loop-back NFS mounts

From: Dave Chinner
Date: Wed Apr 16 2014 - 21:27:56 EST

Next message: Stephen Rothwell: "linux-next: manual merge of the ipsec tree with Linus' tree"
Previous message: Jason Gunthorpe: "Re: [PATCH v5 2/4] arm64: dts: APM X-Gene PCIe device tree nodes"
In reply to: NeilBrown: "Re: [PATCH/RFC 00/19] Support loop-back NFS mounts"
Next in thread: NeilBrown: "Re: [PATCH/RFC 00/19] Support loop-back NFS mounts"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Thu, Apr 17, 2014 at 10:20:48AM +1000, NeilBrown wrote:
> A good example is the deadlock with the flush-* threads.
> flush-* will lock a page, and then call ->writepage. If ->writepage
> allocates memory it can enter reclaim, call ->releasepage on NFS, and block
> waiting for a COMMIT to complete.
> The COMMIT might already be running, performing fsync on that same file that
> flush-* is flushing. It locks each page in turn. When it gets to the page
> that flush-* has locked, it will deadlock.

It's nfs_release_page() again....

> In general, if nfsd is allowed to block on local filesystem, and local
> filesystem is allowed to block on NFS, then a deadlock can happen.
> We would need a clear hierarchy
>
> __GFP_NETFS > __GFP_FS > __GFP_IO
>
> for it to work. I'm not sure the extra level really helps a lot and it would
> be a lot of churn.

I think you are looking at this the wrong way - it's not the other
filesystems that have to avoid memory reclaim recursion, it's the
NFS client mount that is on loopback that needs to avoid recursion.

IMO, the fix should be that the NFS client cannot block on messages
sent to the NFSD on the same host during memory reclaim. That is,
nfs_release_page() cannot send commit messages to the server if the
server is on localhost. Instead, it just tells memory reclaim that it
can't reclaim that page.
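
To make that concrete, here is a rough userspace model of the decision
I mean. It is only a sketch: every name in it (model_server,
model_page, model_commit_and_wait, model_release_page, the in_reclaim
flag) is made up for illustration and is not the kernel's API or the
real nfs_release_page() code. It assumes the client can tell that the
server is on the same host.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; not the kernel's structures. */
struct model_server {
    bool is_localhost;      /* assumed: client knows the server is local */
};

struct model_page {
    const struct model_server *server;
    bool needs_commit;      /* unstable write still awaiting a COMMIT */
    int commits_sent;       /* how many blocking COMMITs were issued */
};

/* Stand-in for a blocking COMMIT round trip to the server. */
static void model_commit_and_wait(struct model_page *page)
{
    page->commits_sent++;
    page->needs_commit = false;
}

/*
 * Returns true if the page may be reclaimed, false to tell memory
 * reclaim "not this page". The one rule being illustrated: during
 * reclaim, never wait on a server that lives on the same host.
 */
static bool model_release_page(struct model_page *page, bool in_reclaim)
{
    assert(page != NULL && page->server != NULL);

    if (!page->needs_commit)
        return true;        /* nothing outstanding, safe to release */

    if (in_reclaim && page->server->is_localhost)
        return false;       /* refuse instead of blocking on local nfsd */

    model_commit_and_wait(page);
    return true;
}
```

With a local server and in_reclaim set, model_release_page() returns
false and sends nothing, so reclaim moves on to another page. With a
remote server it still commits and releases as before.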

If nfs_release_page() no longer blocks in memory reclaim, then all
these nfsd-gets-blocked-in-GFP_KERNEL-memory-allocation recursion
problems go away. Do the same for all the other memory reclaim
operations in the NFS client, and you've got a solution that should
work without needing to walk all over the rest of the kernel....

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx