Re: journaling filesystem

Adam D. Bradley (
Thu, 14 May 1998 02:33:56 -0400 (EDT)

On Thu, 14 May 1998, Rogier Wolff wrote:

> David Woodhouse wrote:
> >
> > OK. I'd rather not require write ordering either. Could anyone point me towards
> > a design for a filesystem which meets...
> >
> > The goal:
> > A filesystem in which the on-disk state is consistent at all times,
> > preferably including the _contents_ of files as well as the metadata.
> >
> > The constraint:
> > Blocks written to disk may be committed in arbitrary order. Hence not
> > only must the filesystem state be sane after every write, but it must
> > also be so for any possible chronological permutation of the writes.
> I'm not able to prove my claim mathematically, but:
> This is not possible, with a reasonable implementation.

Sure it is, provided you can guarantee that each block is written to
the disk atomically. Every change to the filesystem is recorded in a
single block, part of which is headers and part of which is data
(file content, whatever).

The headers include a "serial number" field and a "depends on" field.
If you can presume that the filesystem is in a consistent on-disk state
before we begin, then it can be guaranteed that the on-disk state is
always consistent (although it may be the original state)...

Suppose block #1 is not written, but the rest are:

Block #1 is not written
Block #2 -> depends on 1
Block #3 -> depends on 2
Block #4 -> depends on 1
Block #5 -> depends on 0

Then the on-disk image is still consistent, since only the initial
state is "valid": 2 is rejected because it depends on 1, and all the
others are rejected either because they depend on 1, explicitly or
implicitly, or because their serial numbers are greater than 1 (so
they may have an unstated relationship with 1, e.g. POSIX open-order
semantics, etc).
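
To make the rejection rule concrete, here is a rough sketch in Python
of what a recovery pass might do. It is only an illustration: the Block
record, its field names, and valid_serials() are made up for the
example, and serial 0 is assumed to stand for the initial consistent
state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    # Hypothetical header fields; the data part of the block is omitted.
    serial: int      # position of this change in the global order (1, 2, ...)
    depends_on: int  # serial this change builds on; 0 = initial state


def valid_serials(on_disk):
    """Return the serials a recovery pass would accept, in order.

    Blocks are accepted in serial order until the first gap. Anything
    past a gap is rejected, since it may depend on the missing change
    in some unstated way.
    """
    by_serial = {b.serial: b for b in on_disk}
    accepted = {0}  # assumption: serial 0 is the known-consistent start
    serial = 1
    while serial in by_serial:
        block = by_serial[serial]
        # With no gaps below `serial`, this only trips on a malformed
        # block that claims to depend on itself or on a later change.
        if block.depends_on not in accepted:
            break
        accepted.add(serial)
        serial += 1
    return sorted(accepted)


# The case from the list above: block 1 never reached the disk.
found = [Block(2, 1), Block(3, 2), Block(4, 1), Block(5, 0)]
print(valid_serials(found))                  # [0]

# Once block 1 is present, the whole chain becomes valid.
print(valid_serials(found + [Block(1, 0)]))  # [0, 1, 2, 3, 4, 5]
```

Note that stopping at the first missing serial is what makes block 5
invalid even though it only names 0 in its "depends on" field.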

Of course, I don't have to tell you that run-time performance would
absolutely suck for such a filesystem. ;-)


