Re: [NFS] Re: lockd: couldn't create RPC handle for (host)

From: Dan Stromberg
Date: Mon Dec 19 2005 - 13:52:44 EST




Sometimes what'll happen is that the daemon (or kernel component) that
services requests on an RPC port will die or otherwise stop servicing
requests, but I believe there's nothing in the portmapper or rpcbind
daemon(s) that will periodically poll to see if those requests are still
being serviced.

I'd hop over to another machine that is advertising the same service,
and use this URL:

http://dcs.nac.uci.edu/~strombrg/What-program-is-active-on-that-port.html

...to see what -should- be handling those requests, and then check on
that same facility on the host having trouble.

On Sat, 2005-12-17 at 00:59 -0500, Ryan Richter wrote:
> On Sat, Dec 17, 2005 at 12:32:57AM -0500, Trond Myklebust wrote:
> > On Fri, 2005-12-16 at 18:58 -0500, Ryan Richter wrote:
> > > On Fri, Dec 16, 2005 at 06:49:05PM -0500, Trond Myklebust wrote:
> > > > On Fri, 2005-12-16 at 15:55 -0500, Ryan Richter wrote:
> > > > > Hi, nfs locking stopped working on my file server running 2.6.15-rc5
> > > > > today. All clients that try locking operations hang, and I get the
> > > > > message from the server:
> > > > >
> > > > > lockd: couldn't create RPC handle for w.x.y.z
> > > >
> > > > Points either to a client which is not responding to callbacks, or an
> > > > OOM situation on the server.
> > > >
> > > > Does 'rpcinfo -u w.x.y.z 100021' work from the server?
> > >
> > > No.
> > >
> > > $ rpcinfo -u jacquere 100021
> > > rpcinfo: RPC: Timed out
> > > program 100021 version 0 is not available
> > > zsh: exit 1 rpcinfo -u jacquere 100021
> > >
> > > So I see now lockd is not present on the client. But...
> > >
> > > $ rpcinfo -p jacquere
> > > program vers proto port
> > > 100000 2 tcp 111 portmapper
> > > 100000 2 udp 111 portmapper
> > > 100021 1 udp 32768 nlockmgr
> > > 100021 3 udp 32768 nlockmgr
> > > 100021 4 udp 32768 nlockmgr
> > > 100024 1 udp 867 status
> > > 100024 1 tcp 870 status
> > >
> > > So what does that mean?
> >
> > Looks to me as if port 111 is pingable (since you can talk to the
> > portmapper), but port 32768 is not. Are you using port filtering or
> > firewalling anywhere (on the client, server, or switches)?
>
> There's no filtering between the two. I get this on the machine itself:
> $ rpcinfo -u localhost 100021
> rpcinfo: RPC: Timed out
> program 100021 version 0 is not available
> zsh: exit 1 rpcinfo -u localhost 100021
>
> There's no lockd process running on this client machine anymore.
>
> On the server:
>
> $ rpcinfo -u localhost 100021
> program 100021 version 1 ready and waiting
> rpcinfo: RPC: Program/version mismatch; low version = 1, high version =4
> program 100021 version 2 is not available
> program 100021 version 3 ready and waiting
> program 100021 version 4 ready and waiting
> zsh: exit 1 rpcinfo -u localhost 100021
>
> Also neither machine is anywhere close to OOM.
>
> -ryan
>
>
> -------------------------------------------------------
> This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
> for problems? Stop! Download the new AJAX search engine that makes
> searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
> http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
> _______________________________________________
> NFS maillist - NFS@xxxxxxxxxxxxxxxxxxxxx
> https://lists.sourceforge.net/lists/listinfo/nfs
>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/