Stale NFS file handles and race conditions...

Steve McClure (smcclure@emc.com)
Thu, 26 Aug 1999 15:39:53 -0400


Heya. I've come across yet another *very* strange Stale NFS file
handle situation. As background: the NFS server doesn't matter, the
NFS client is a Linux 2.2.10 client (with various other patches
applied to bring the NFS client up to par), and I'm using NFS V2 over
UDP. Everything is mounted noac (I tried other mount options too; it
doesn't matter, it still happens).

I have two machines. We'll call them nfs1 and nfs2. If I create
/home/smcclure on an NFS server (I have found that the kind of server
doesn't matter) and then execute the following commands, in order
(note that I'm interleaving the prompts from each machine; I have two
windows open, one into each machine), I get the following results:

nfs1> mkdir foo
nfs2> ls -sl foo
total 0
nfs1> rm -rf foo
nfs1> mkdir foo
nfs2> /bin/sh -c "(cd foo ; ps -ef > a)"
nfs2>

Note that this works. Now, I do this (again, from scratch, after
doing an rm -rf of foo):

nfs1> mkdir foo
nfs2> ls -sl foo
total 0
nfs1> rm -rf foo
nfs1> mkdir foo
nfs2> /bin/sh -c "( cd foo ; ps -ef > a ) & ( cd foo ; ps -ef > b )"
/bin/sh: a: Stale NFS file handle
/bin/sh: b: Stale NFS file handle
nfs2>

As you can see, that doesn't work. Note that this time I've created
two processes which both change directory into the same directory,
then try to create files within that directory.
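
For anyone who wants to repeat the nfs2 side, here is a minimal sketch
of that step as a shell function, assuming a POSIX /bin/sh. The name
try_concurrent is made up for illustration, and date stands in for
ps -ef on the assumption that the command itself is irrelevant, since
the error above is reported by /bin/sh for the redirection target:

```shell
#!/bin/sh
# Hypothetical helper, not part of the original test.
# try_concurrent DIR: two subshells cd into DIR at the same time and
# each tries to create a file there, mirroring the failing command.
# Returns 0 only if both creates succeeded.
try_concurrent() {
    dir=$1

    # First subshell runs in the background.
    ( cd "$dir" && date > a ) &
    first_pid=$!

    # Second subshell runs in the foreground, overlapping the first.
    ( cd "$dir" && date > b )
    second_status=$?

    # Collect the exit status of the background subshell.
    wait "$first_pid"
    first_status=$?

    [ "$first_status" -eq 0 ] && [ "$second_status" -eq 0 ]
}
```

Run it on nfs2 as "try_concurrent foo" right after nfs1 has removed
and recreated foo; a non-zero exit status means at least one of the
two creates failed.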

Also, as a point of reference, I tried this with Solaris and FreeBSD
clients, and those work perfectly (i.e. just like the first case).

So, I'm not sure exactly what is going on. It seems to be some sort
of race between flushing caches (though nothing should be cached with
noac on, unless yet something else slipped through) when the two
different processes change directory into the same directory, etc.

Any comments, suggestions, etc? Also, I'm not on the kernel list
right now, so please cc: me on any replies. Thanks!!!

-- Steve McClure
smcclure@emc.com
