Re: reiserfs and knfsd and NFSv4 and volatile file handles

From: Hans Reiser (reiser@idiom.com)
Date: Wed Mar 15 2000 - 12:39:12 EST


"Neil F. Brown" wrote:
>
> On Thursday March 2, Hans Reiser wrote to linux-kernel (and others) :
> > Andi Kleen wrote:
> >
> > >
> > > [NFSv4 has volatile file handles that require the client to store the
> > > path and relookup when needed, but of course the
> > > end result will be even farther away from POSIX semantics as current NFS
> > > is already]
> >
> > If I am correct in guessing what a volatile file handle is, it sounds like the
> > right answer to me, that and changing the posix semantics. What exactly is a
> > volatile file handle? Does it allow a file handle to go bad and then just
> > relookup? Is it an attempt to cure stale file handles?
> >
>
> and later, Hans Reiser wrote:
> > Currently having to find files by something other than a dentry or a filename
> > (or a key) uglifies our design. Even the use of a 64 bit number forces us to
> > use fixed length keys which forces us to use hashing which by destroying sorting
> > makes it impossible to use stem compression in our directories. (This affects
> > one of our web search engine sponsors who would like for 30 million entry
> > directories to fit into RAM in our next major version of reiserfs.) If knfsd
> > goes to using dentries to talk to reiserfs, we will be very happy.
>
> I've Cc:ed this to the nfsv4 working group as well.
>
> Maybe I can shed a bit of light.
>
> knfsd cannot just used dentries (internal structures accessed by path
> name) to talk to reiserfs as NFS requires it to use "file handles"
> which have a maximum length. To support NFSv2 or v3 you will need to
> provide a way to map from a fix-size identifier to a file, that is
> stable across rename, write, truncate, reboot.
>
> The volatile filehandles of NFSv4 aim to get around this restriction,
> and if you will (one day) be happy to only support NFSv4, not v2 or v3
> (once the spec if finalised and the implementations are done and
> clients are common...) then you have more flexability in talking to
> NFS.
>
> The NFS client only *needs* a file handle to be stable while it has
> the file open. Having it stable while the client does not have the
> file open is at best a convenience for caching.
>
> NFSv4 allows the client to "open" and subsequently "close" a file -
> abstractions that absent from NFSv2,3.
>
> NFSv4 allows the server to assert that file handles are stable
> (non-volatile, persistant) while a file is open, but that it will
> provide no guarantees at other times.
>
> This can work quite well for an MSDOS style filesystem which can use
> filehandles based on the file name (E.g. disc address of directory
> entry) and which disallows hard links, and disallows unlink or rename
> while a file is open.
>
> This should be able to be made to work fairly well for reiserfs too.
>
> As I understand reiserfs (which isn't very deeply), it has an emphasis
> on efficient support for lots of small files,

Yes

> and so likes to keep
> meta-data and some or all of the data associated with a file in the
> directory entry that references the file.
>
Not true at the moment, and not the issue. The issue is that we don't have
on-disk inodes and we don't have inode numbers. This means that we don't have
fixed allocations of space for inodes, and we can't do lookups by 32bit numbers
for files.

We do have keys, but these are currently 16 bytes and soon will be of variable
length.

Currently we hash filenames as part of constructing keys of fixed length, but
this is ugly and mostly motivated by NFS.

It hurts our performance, and makes directories difficult to compress since stem
compression doesn't work on randomly ordered entries. Directory entries are
ordered by the hash of the filename currently. This is bad for performance for
directories too big to fit into RAM, or even moderately large ones.

 I can imaging that this would mean changing the size of the file,
> renaming the file, or even adding files to the directory could all
> change the physical location of the meta-data on disc, making it hard
> to have a fixed-length, stable index to a file, as needed by NFSv2,3.
>
> A simple minded approach to providing presistant-only-when-open
> file handles for reiserfs might be to allocate sequential file handles
> and to keep a cache of "filehandle to metadata-disc-address"
> mappings. This cache would have to be kept up-to-date whenever a file's
> metadata moved.
> When a file was opened, the mapping would be marked as 'persistant'
> and be logged to stable storage. On the last close, the mapping could
> have the presistant flag removed.
>
> Then the cache could be regularly cleaned of non-persistant mappings,
> in an LRU fashion, and any filehandle that is presented which is not
> in the cache gets an "NFS4ERR_FHEXPIRED" response.
>
> The logging to stable storage is required so that in the event of a
> reboot, the server will still honour open file handles.
>
> As this approach requires logging all opens to disc, it is very likely
> to be a performance nightmare.
>
> If we can assume that while the disc address of metadata can change,
> it doesn't change very often, then a suitable alternative might be:
>
> - Store a uniquifier in the metadata for each file - an essentally
> random 32-64 bit number.
> - Make file handles out of disc-address+uniquifier.
> - When an open file has the disc address of its metadata changed, cache
> the old filehandle and the new address both in memory and on stable
> storage.
>
> Then, when a file handle is received, first check the cache. If it
> isn't there, check to see if valid metadata exists at the disc address
> and has the given uniquifier. If it doesn't only then give a
> NFS4ERR_FHEXPIRED.
>
> On reboot, you would need to prime the cache from the stable-storage
> log.
>
> This should provide the client with the stability it needs, and not
> incur too much performance overhead (or does the address of metadata
> change a lot?)
>
> I would like the server to be able to tell the client that a
> filehandle needs to be updated, and for the client to be able to ask

This would definitely be a cleaner architecture that would allow repackers to
not disrupt NFS if they change the key. We have a repacker in progress. I can
easily imagine your getting lots of NFS troubles every time the repacker changes
things without having a way to tell NFS that it was changed.

> for an updated copy of a given filehandle - so that entries in the
> old-to-new cache could be discarded even while a file was still open,
> but that proposal wasn't accepted (yet).
>
> NeilBrown
>

Thanks much Neil.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/

-- 
You can get ReiserFS at http://devlinux.org/namesys, and customizations and
industrial grade support at reiser@idiom.com.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:17 EST