intermittent NFS hangs from NetApp

From: Garrick Staples (gstaples@interliant.com)
Date: Sun Feb 27 2000 - 14:13:49 EST


Hi all, first time poster here...

I have several brand new Dell 4350s running Stock-out-of-the-box Redhat
6.1 2.2.12-32smp (my boss won't let me upgrade the kernel unless I show
a real good reason). All of these boxes are mounting at least one NFS
mount from a Dell oem NetApp 760. All of these boxes run just fine with
the exception of one NFS mount that occasionally hangs. This particular
box has 5 different mounts from the NetApp, and only one hangs. I can't
cause the hang, excessive writes and reads won't cause it. The only
defining difference with this mount is that it is mounted with the
"nolock" option because of FrontPage.

When the problem occurs, all processes that touch the mount are
indefinitly hung. Errors in messages show up:
Feb 24 18:08:04 dhp0020 kernel: nfs: server 192.168.0.253 not
responding, still trying
Feb 24 18:08:04 dhp0020 kernel: nfs: server 192.168.0.253 not
responding, still trying
Feb 24 18:08:08 dhp0020 kernel: nfs: task 6136 can't get a request slot
Feb 24 18:08:12 dhp0020 kernel: nfs: task 6137 can't get a request slot
Feb 24 18:10:30 dhp0020 kernel: nfs: task 6138 can't get a request slot
...

The problem is immediately solved with a simple umount -f; which fails
because of the current processes, but it fixes the hang! When I do
umount -f, all of the waiting processes get a failed read, but they
continue normally.

I can see that umount -f calls umount2() instead of umount(), but I
can't see how that function somehow "resets" the mount. I can also see
that the printed errors come from call_timeout() in the rpc client code,
but I can't figure out why. Other mounts from the same NetApp work
normally ???

Sorry for the long post, but I wanted to be really complete... thanks in
advance

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Feb 29 2000 - 21:00:18 EST