Re: Implementing Meta File information in Linux (and a note at the end on current reiserfs status)

Stephen C. Tweedie (sct@redhat.com)
Wed, 2 Sep 1998 16:19:59 +0100


Hi,

On Tue, 01 Sep 1998 23:47:10 -0700, Hans Reiser <reiser@idiom.com> said:

> Theodore Y. Ts'o wrote:
>> Instead, we're much better off designing a high-level API (implemented
>> using a replaceable shared library) for storing and retrieving metadata
>> information (and a common metadata format which both KDE and GNOME
>> share!!!), and then having a shared library which implements the storage
>> of said metadata information via some non-kernel, non-FS means.

> This is the structured storage approach. It is hideous. It creates
> non-symmetric semantics, and it inevitably results in lower
> performance implementations because it layers one storage allocation
> system on top of another. Just look at what happened when MS did
> this.

We don't need to make the same mistakes. Are you suggesting that
layering one structured storage scheme (eg. SQL database; persistent
object store) over another one (eg. filesystem) is a Bad Thing??? Is
NFS a bad thing because it layers a network over a filesystem over a
virtual device driver layer instead of letting the server talk straight
to the disk?

Just because layering can be done badly does not mean that layering is
bad. libdb/libdbm etc. are useful. VMS takes it to extremes, with the
OS providing RMS (Record Management Services) which perform high-level,
network-aware transactional record and indexing services in a very
general manner. As an ex-VMS maintainer, I can tell you that lots of
applications use RMS, because it is so amazingly fast.

> What do you get from your API that you can't get a higher performance
> solution for in ReiserFS? Even just the solution of a directory called
> filename.forks is far superior to any structured storage style solution,
> and filename.forks would not require any changes to namei().

That's not the point at all. It might be a valid point if we were
inventing a completely new operating system which was to live in
splendid isolation, but we are not. We live in a world where Linux sits
in networks of SGIs, Suns and HPs, and where people variously use AFS,
CODA and NFS to connect them up. A new set of kernel fs semantics
breaks that networkability completely. Given that very efficient
directory tree and small file management will give us great performance
improvements for the sub-directory proposals, I can't see how adding
something completely new to the fs API can possibly overcome the
benefits of keeping the Unix storage model intact.

> Ted, you have written a nice filesystem that just isn't designed for
> this need. You don't want to rewrite it from scratch to handle this
> issue efficiently. ReiserFS was designed for this need and any
> modifications would be trivial to none. Don't ask glibc to take on
> the role of namei(), that is just so wrong.

Hans, you are being completely unfair here: nobody is doubting that
ReiserFS can improve things. On the contrary: I'm saying that using the
normal Unix storage API, we can achieve excellent performance for fork
management by using ReiserFS itself, taking advantage of the extra
efficiency if we are running on ReiserFS but not losing all Unix
compatibility in the case where we are running on something else. What
I'm objecting to is the notion that we need to change fs semantics in
order to get these advantages.

> Structured Storage is such bloated code, such a bloated API, please
> remember that anytime you add an API you add to code complexity.

We don't need a complex API. All we need is appropriate conventions
about where to place such information as icon data or invocation scripts
for files, so that desktop managers can agree about how to store that
kind of thing.

> Perhaps I am being unfair in assuming that it would resemble what MS
> did but.....

> Let's just use directories, and make them efficient enough that it works.

That's what we're suggesting! The only reason a shared library API is
useful is because it makes it easier for different applications to use
the same conventions about how to deal with things like subdirectory
naming. The API doesn't have to be much more complex than something
like mktemp(3). YOU are the one suggesting changing the filesystem
semantics. At the kernel level, I believe that efficient management of
small files and large directories is perfectly sufficient to deal with
the issue, and the compatibility and networking problems are a
sufficient reason not to pollute the kernel API with any new semantics.
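
To make the "not much more complex than mktemp(3)" point concrete, here is
a minimal sketch of what such a shared convention library could look like.
It is an illustration only, not an existing API: the ".forks" suffix is
borrowed from the filename.forks idea quoted above, and the function names
(fork_dir, write_fork, read_fork, list_forks) are hypothetical. Everything
is built from ordinary directory and file operations, so it should behave
the same on any Unix filesystem, local or networked.

```python
import os

# Assumption: forks for "foo" live in a sibling directory "foo.forks".
# This suffix is a hypothetical convention, not an established standard.
FORK_SUFFIX = ".forks"


def fork_dir(path):
    """Return the directory that holds the forks (metadata) of `path`."""
    return path + FORK_SUFFIX


def _fork_path(path, name):
    """Map a fork name such as "icon" to a plain file inside the fork dir."""
    if not name or name in (".", "..") or "/" in name or os.sep in name:
        raise ValueError("fork name must be a single path component")
    return os.path.join(fork_dir(path), name)


def write_fork(path, name, data):
    """Store `data` (bytes) as fork `name` of `path`, creating the dir."""
    target = _fork_path(path, name)
    os.makedirs(fork_dir(path), exist_ok=True)
    with open(target, "wb") as f:
        f.write(data)


def read_fork(path, name, default=None):
    """Return the bytes of fork `name`, or `default` if it does not exist."""
    try:
        with open(_fork_path(path, name), "rb") as f:
            return f.read()
    except FileNotFoundError:
        return default


def list_forks(path):
    """Return the sorted fork names of `path` (empty list if none)."""
    try:
        return sorted(os.listdir(fork_dir(path)))
    except FileNotFoundError:
        return []
```

A desktop manager would call something like write_fork("report.txt",
"icon", icon_bytes) and later read_fork("report.txt", "icon"). The kernel
sees nothing but a directory of small files, which is exactly the case an
efficient small-file filesystem speeds up.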

--Stephen
