Re: (reiserfs) reiserfs and knfsd and NFSv4 and volatile file handles

From: Steve Dodd (steved@loth.demon.co.uk)
Date: Sat Mar 18 2000 - 05:57:27 EST


On Fri, Mar 17, 2000 at 05:57:33PM +0100, Manfred Spraul wrote:

> I think we have 2 completely independent problems:
> a) POSIX inode numbers [ino_t].
> They must be unique, and programs such as tar use them to identify the
> target of hardlinks. It's currently 32-bits on i386, mips, mips64; 64-bits
> on ia64, sparc64.
> We don't support lookups based on ino_t, and we don't need it for NFS.
>
> b) the ability to find a disk entry given a "key".
> required for NfsV2, NfsV3.
> This key should remain stable across rename, chown, reboot, .., and it
> should never be reused for another file.
> The key is quite long [NfsV2: 32 bytes, but several bits are required to
> identify the superblock, optimizations]

I don't think these two are independent at all. Things like tar also need
the key to be stable (incremental backups), and the key they currently use is
ino_t, which is an arithmetic type. The big question is whether restricting
filesystems to 32 / 64bits of key is reasonable or not. If it isn't, then we
have problems in both userspace and NFS (<4). The userspace problem could only be
completely solved AFAICS by adding another version of stat which dealt with
opaque keys for inodes. I assume there are no other UN*X-like systems out
there that have done this already? If not, and it's deemed worth doing, we
should try and get some sort of agreement with other vendors. Of course,
there's still the issue of what the "old" API (stat + stat64) would return:
a hash of the key (not unique but stable over umount), or something
even nastier?
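
To make the "hash of the key" option concrete, here is a rough Python
sketch. Nothing in it comes from an existing kernel or libc: fold_key and
find_collision are made-up names, the 16-byte keys are invented, and
SHA-256 is only a stand-in for whatever hash would really be picked.

```python
import hashlib


def fold_key(key: bytes, bits: int = 32) -> int:
    """Fold an opaque filesystem key into a `bits`-wide integer.

    Hypothetical helper: the result depends only on the key bytes, so it
    is stable over umount and reboot, but distinct keys may share a value.
    """
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % (1 << bits)


def find_collision(bits: int, tries: int):
    """Return two distinct made-up keys that fold to the same value, or None."""
    seen = {}
    for n in range(tries):
        key = n.to_bytes(16, "big")  # pretend the filesystem uses 128-bit keys
        folded = fold_key(key, bits)
        if folded in seen:
            return seen[folded], key
        seen[folded] = key
    return None
```

Calling find_collision(16, 2**16 + 1) has to return a pair, simply because
there are more keys than 16-bit values. The same argument applies at 32 or
64 bits, only with more keys, which is why anything that trusts the old
stat value to identify hardlinks could be fooled.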

> I have no real idea how to solve that problem, I'm stuck with hardlinks:
>
> /fs: mount point for a filesystem.
> /fs/exported/
> /fs/not-exported/: 2 folders
>
> <local user>:
> cp /bin/sh /fs/exported/f1
> <nfsd gets a filehandle to f1>
> <local user>:
> mv /fs/exported/f1 /fs/not-exported/
> (* what happens to the nfs file handle? still valid, or EACCES?*)
> ln /fs/not-exported/f1 /fs/exported/f2
> (* what now? Does it become valid again?*)

Yeuch. The wonders of trying to control access to nameless objects by name :-)
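
For anyone who wants to poke at this locally, a small Python sketch that
replays the mv/ln sequence above inside a scratch directory and records
what stat reports at each step. replay_rename_and_link is a hypothetical
name, a plain text file stands in for /bin/sh, and this says nothing about
what knfsd itself does with the handle; it only shows that the object stays
the same while its names change.

```python
import os
import tempfile


def replay_rename_and_link(root: str):
    """Replay the quoted scenario under `root`.

    Returns one (st_dev, st_ino, st_nlink) tuple per step.
    """
    exported = os.path.join(root, "exported")
    hidden = os.path.join(root, "not-exported")
    os.mkdir(exported)
    os.mkdir(hidden)

    # cp /bin/sh /fs/exported/f1 (a small text file stands in for /bin/sh)
    f1 = os.path.join(exported, "f1")
    with open(f1, "w") as fh:
        fh.write("stand-in for /bin/sh\n")
    created = os.stat(f1)

    # mv /fs/exported/f1 /fs/not-exported/
    moved = os.path.join(hidden, "f1")
    os.rename(f1, moved)
    after_mv = os.stat(moved)

    # ln /fs/not-exported/f1 /fs/exported/f2
    f2 = os.path.join(exported, "f2")
    os.link(moved, f2)
    after_ln = os.stat(f2)

    return [(s.st_dev, s.st_ino, s.st_nlink)
            for s in (created, after_mv, after_ln)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        for step in replay_rename_and_link(scratch):
            print(step)
```

On a filesystem with ordinary hardlink support the device and inode
numbers should be identical in all three steps, and only the link count
changes at the end.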

> > We might even be able to satisfy the open-by-inode people with something
> > like this - or are there other problems with that?
>
> Who needs "open-by-inode"?

I dimly recall USENET server authors muttering about it, though with a sane
directory layout and an effective dcache I would've thought the win would be
minimal. I don't know how many of unfsd's problems it would solve, either. I
also don't know what the implications of this on the VFS would be. The next
step would be complete implementation of NTFS-like filesystem-as-database: inodes
are records, inode numbers are primary keys, and then (in theory) you can
index on whatever you like. Whether this is actually useful I have no idea. It
certainly doesn't seem very "UNIX-y".
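
As a toy illustration of the records-and-keys idea, here is a Python
sketch using an in-memory SQLite table. It is not how NTFS or any real
filesystem lays things out; the table, the column names and the three
helper functions are all invented for the example.

```python
import sqlite3


def build_inode_table(records):
    """Create an in-memory table with one row per (made-up) inode."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE inode ("
        " ino INTEGER PRIMARY KEY,"  # the inode number is the primary key
        " owner TEXT, size INTEGER, mtime INTEGER)"
    )
    # "Index on whatever you like": here, a secondary index on the owner.
    db.execute("CREATE INDEX inode_by_owner ON inode (owner)")
    db.executemany("INSERT INTO inode VALUES (?, ?, ?, ?)", records)
    return db


def open_by_inode(db, ino):
    """Primary-key lookup, the database analogue of open-by-inode."""
    return db.execute(
        "SELECT ino, owner, size, mtime FROM inode WHERE ino = ?", (ino,)
    ).fetchone()


def inodes_owned_by(db, owner):
    """Lookup through the secondary index instead of a directory walk."""
    rows = db.execute(
        "SELECT ino FROM inode WHERE owner = ? ORDER BY ino", (owner,)
    )
    return [row[0] for row in rows]
```

With rows such as (17, "news", 120, 5), open_by_inode(db, 17) fetches the
record directly and inodes_owned_by(db, "news") answers a query that the
directory tree alone cannot.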

-- 
The very concept of PNP is a lovely dream that simply does not translate to
reality. The confusion of manually doing stuff is nothing compared to the
confusion of computers trying to do stuff and getting it wrong, which they
gleefully do with great enthusiasm. -- Jinx Tigr in the SDM
