Re: [PATCH 0/4] 2.6.21-rc7 NFS writes: fix a series of issues

From: Trond Myklebust
Date: Wed Apr 18 2007 - 10:12:25 EST


On Wed, 2007-04-18 at 08:42 -0500, Florin Iucha wrote:
> On Wed, Apr 18, 2007 at 09:15:31AM -0400, Trond Myklebust wrote:
> > There is only one request on the 'pending' queue. That would usually
> > indicate that the connection to the server is down. Can you check using
> > "netstat -t" whether or not there is a connection in the 'ESTABLISHED'
> > state to the server? Please also repeat the command a couple of times in
> > order to see if the socket/port number on the connection changes.
>
> This is with your fifth patch on top of the previous four patches:
>
> http://iucha.net/nfs/21-rc7-nfs3/big-copy
>
> Again, it has memory, stack traces and rpc_debug.
>
> The iostat 5 output:
>
> http://iucha.net/nfs/21-rc7-nfs3/iostat
>
> The netstat outputs are stable (not changed in 5 minutes):
>
> http://iucha.net/nfs/21-rc7-nfs3/netstat-server :
>
> tcp 1 0 hermes.iucha.org:nfs zeus.iucha.org:799 CLOSE_WAIT
> tcp 0 0 hermes.iucha.org:nfs zeus.iucha.org:976 ESTABLISHED
>
> http://iucha.net/nfs/21-rc7-nfs3/netstat-client
>
> Active Internet connections (w/o servers)
> Proto Recv-Q Send-Q Local Address Foreign Address State
> tcp 0 0 zeus.iucha.org:976 hermes.iucha.org:nfs ESTABLISHED
> tcp 0 0 zeus.iucha.org:ssh hermes.iucha.org:56880 ESTABLISHED
> tcp 0 0 zeus.iucha.org:ssh hermes.iucha.org:45176 ESTABLISHED
>
> Could the port in CLOSE_WAIT state be the culprit? (FWIW
> the server has been up for 38 days and subjected to
> this nfs test quite a bit without showing any stress).

The port in CLOSE_WAIT shows that a socket was closed down recently, but
once the connection is re-established, the client should start sending
data.

Do you have a copy of wireshark or ethereal on hand? If so, could you
take a look at whether or not any NFS traffic is going between the
client and server once the hang happens?

Note that the timeout value is 60 seconds, so if you see no immediate
traffic, then let the ethereal/wireshark session keep running for a
couple more minutes.
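
If it helps with the repeated "netstat -t" check discussed above, here is
a rough Python sketch that samples the output a few times and keeps only
the connections to the server's nfs port, so a changing client port or a
state other than ESTABLISHED stands out. The helper names (parse_netstat,
nfs_connections, poll) are made up for illustration, and the column
layout is assumed to match the netstat output quoted in this thread.

```python
import subprocess
import time


def parse_netstat(text):
    """Return (local, remote, state) tuples for the TCP lines of `netstat -t` output."""
    conns = []
    for line in text.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            conns.append((fields[3], fields[4], fields[5]))
    return conns


def nfs_connections(conns, server):
    """Keep connections whose remote end is the server's nfs port (name or 2049)."""
    wanted = (f"{server}:nfs", f"{server}:2049")
    return [c for c in conns if c[1] in wanted]


def poll(server, samples=3, interval=5.0):
    """Hypothetical helper: run `netstat -t` several times and print NFS connections."""
    for _ in range(samples):
        out = subprocess.run(
            ["netstat", "-t"], capture_output=True, text=True
        ).stdout
        print(nfs_connections(parse_netstat(out), server))
        time.sleep(interval)
```

Calling poll("hermes.iucha.org") on the client would print one list per
sample; if the local port differs between samples, the socket is being
torn down and reconnected.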

Cheers,
Trond