Re: Designing Another File System

From: John Richard Moser
Date: Tue Nov 30 2004 - 20:51:24 EST


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1



Phil Lougher wrote:
| On Mon, 29 Nov 2004 23:32:05 -0500, John Richard Moser
| <nigelenki@xxxxxxxxxxx> wrote:
|
|
|>- - localization of Inodes and related meta-data to prevent disk thrashing
|
|
| All filesystems place their filesystem metadata inside the inodes. If
| you mean file metadata then please be more precise. This isn't
| terribly new, recent posts have discussed how moving eas/acls inside
| the inode for ext3 has sped up performance.
|

I got the idea from FFS and its derivatives (UFS, UFS2, EXT2). I don't
want to store xattrs inside inodes though; I want them in the same block
with the inode. A few mS for the seek, but eh, the data's right there,
not on the other side of the disk.

|
|>- - a scheme which allows Inodes to be dynamically allocated and
|>deallocated out of order
|>
|
|
| Um, all filesystems do that, I think you're missing words to the
| effect "without any performance loss or block fragmentation" !

All filesystems allow you to create the FS with 1 inode total?


Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/hda1 7823616 272670 7550946 4% /

No, it looks like this one allocates many inodes and uses them as it
goes. Reiser has 0 inodes . . .

|
|
|>- - 64 bit indices indicating the exact physical location on disk of
|>Inodes, giving a O(1) seek to the Inode itself
|
|
|>1) Can Unix utilities in general deal with 64 bit Inodes? (Most
|>programs I assume won't care; ls -i and df -i might have trouble)
|>
|
|
| There seems to be some confusion here. The filesystem can use 64 bit
| inode numbers internally but hide these 64 bits and instead present
| munged 32 bit numbers to Linux.
|
| The 64 bit resolution is only necessary within the filesystem dentry
| lookup function to go from a directory name entry to the physical
| inode location on disk. The inode number can then be reduced to 32
| bits for 'presentation' to the VFS. AFAIK as all file access is
| through the dentry cache this is sufficient. The only problems are
| that VFS iget() needs to be replaced with a filesystem specific iget.
| A number of filesystems do this. Squashfs internally uses 37 bit
| inode numbers and presents them as 32 bit inode numbers in this way.
|

Ugly, but ok. What happens when i actually have >4G inodes though?

|
|>3) What basic information do I absolutely *need* in my super block?
|>4) What basic information do I absolutely *need* in my Inodes? (I'm
|>thinking {type,atime,dtime,ctime,mtime,posix_dac,meta_data_offset,size,\
|>links}
|
|
| Very much depends on your filesystem. Cramfs is a good example of the
| minimum you need to store to satisfy the Linux VFS. If you don't care
| what they are almost anything can be invented (uid, gid, mode, atime,
| dtime etc) and set to a useful default. The *absolute* minimum is
| probably type, file/dir size, and file/dir data location on disk.

I meant basic, not for me. Basic things a real Unix filesystem needs.
What *I* need comes from my head. :)

|
|
|>I guess the second would be better? I can't locate any directories on
|>my drive with >2000 entries *shrug*. The end key is just the entry
|>{name,inode} pair.
|
|
| I've had people trying to store 500,000 + files in a Squashfs
| directory. Needless to say with the original directory implementation
| this didn't work terribly well...
|

Ouch. Someone told me the directory had to be O(1) lookup . . . .

| Phillip Lougher
|

- --
All content of all messages exchanged herein are left in the
Public Domain, unless otherwise explicitly stated.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.6 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBrSGNhDd4aOud5P8RAlo0AJ4pxB/LMhgTvNW4GdMmaNA2/uM0wACfWR8+
kOxwHU3/mTUUNAAhda2rv+g=
=fsJV
-----END PGP SIGNATURE-----
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/