Re: nfs3 problem with -rc{2,3} : blame

From: Myklebust, Trond
Date: Tue Jun 19 2012 - 13:46:41 EST


On Tue, 2012-06-19 at 17:55 +0100, Ken Moffat wrote:
> On Tue, Jun 19, 2012 at 04:23:23PM +0000, Myklebust, Trond wrote:
> > On Tue, 2012-06-19 at 12:20 -0400, Trond Myklebust wrote:
> > >
> > > However you are saying that the problem is there when you compile a
> > > kernel with this commit as the head, and it goes away when you compile a
> > > kernel with commit 3e9e0ca3f19e911ce13c2e6c9858fcb41a37496c as the head?
> > >
> Provided I apply 4f97615d as well, so that it compiles, yes.
>
> > > I'm confused as to how a bug in that patch could depend on
> > > CONFIG_NFS_V4, but I'll see what I can find.
>
> Thanks
> >
> > By the way, I thought your test-case was doing firefox downloads. Do
> > those really use O_DIRECT?
> >
> I originally saw the problem doing that, but it was on the second
> download. Or perhaps third or fourth - I tend not to remember
> successful downloads when I've got a lot of packages to check for new
> versions. Using my backup script seemed a more reliable way to
> trigger a problem (but, only if there is something substantial to
> back up, such as a new vmlinuz).
>
> Thinking about this, it is almost certain that between the first
> download and the one that failed (several hours later) my backup
> script did run, from fcron, so I now think the rsync problem is what
> leads to issues when other programs later try to update the same nfs
> directory.

Does the following patch make any difference?

You probably want to ensure that you also have commit
906369e43c29001c39c7dfed8a01b9dff24ace75 (which is in 3.5-rc3) since
that corrects a similar issue.

Cheers
Trond
8<------------------------------------------------------------
From ed3b97f9af6421f326de413e6d6556d1ecc3399d Mon Sep 17 00:00:00 2001
From: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
Date: Tue, 19 Jun 2012 13:39:14 -0400
Subject: [PATCH] NFS: Fix a refcounting issue in O_DIRECT

In nfs_direct_write_reschedule(), the requests from nfs_scan_commit_list
have a refcount of 2, whereas the operations in
nfs_direct_write_completion_ops expect them to have a refcount of 1.

This patch adds a call to release the extra references.

Signed-off-by: Trond Myklebust <Trond.Myklebust@xxxxxxxxxx>
---
 fs/nfs/direct.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 3168f6e..9a4cbfc 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -490,6 +490,7 @@ static void nfs_direct_write_reschedule(struct nfs_direct_req *dreq)
 			dreq->error = -EIO;
 			spin_unlock(cinfo.lock);
 		}
+		nfs_release_request(req);
 	}
 	nfs_pageio_complete(&desc);

--
1.7.10.2
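
For anyone following along, the mismatch described in the changelog can be modelled outside the kernel. The following is only a toy sketch, not the NFS code: struct toy_req and the toy_* helpers are hypothetical names invented for illustration, and the "scan" and "completion" steps are reduced to bare reference counting. It assumes, as the changelog states, that the scan hands back requests holding two references while the completion path drops only one.

```c
#include <assert.h>

/* Hypothetical stand-in for a request; NOT the kernel's struct nfs_page. */
struct toy_req {
	int refcount;
	int freed;
};

static void toy_get(struct toy_req *req)
{
	req->refcount++;
}

/* Drop one reference; mark the request freed when the last one goes. */
static void toy_put(struct toy_req *req)
{
	assert(req->refcount > 0);
	if (--req->refcount == 0)
		req->freed = 1;
}

/* Models a scan that returns the request with an extra reference (2 total). */
static void toy_scan(struct toy_req *req)
{
	req->refcount = 1;
	req->freed = 0;
	toy_get(req);
}

/* Models a completion handler that expects to drop the only reference. */
static void toy_complete(struct toy_req *req)
{
	toy_put(req);
}

/*
 * Models the reschedule path. With release_extra == 0 the extra
 * reference is never dropped; with release_extra != 0 it is dropped
 * before completion, which is the analogue of the one-line fix.
 * Returns 1 if the request ended up freed, 0 if it leaked.
 */
int toy_reschedule(int release_extra)
{
	struct toy_req req;

	toy_scan(&req);
	if (release_extra)
		toy_put(&req);
	toy_complete(&req);
	return req.freed;
}
```

In this model toy_reschedule(0) returns 0 (the request is never freed, since one reference is left over) and toy_reschedule(1) returns 1. Whether a leftover reference of this kind explains the behaviour reported in this thread is exactly what the patch is meant to test.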


--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@xxxxxxxxxx
www.netapp.com
