dir hardlinks (was: Re: fsync on large files)

Kai Henningsen (kaih@khms.westfalen.de)
24 Feb 1999 20:53:00 +0200

viro@math.psu.edu (Alexander Viro) wrote on 18.02.99 in <Pine.SOL.3.95.990218181307.11831e-100000@weyl.math.psu.edu>:

> Up to Linus, indeed, but IMO it's *the* way to madness.

In cases like this, *if* there is a solution, it usually involves keeping
some columbus-egg-type invariants - obvious once you know them, far from
obvious before.

I suspect a real solution to the directory hardlink problem is going to
involve some sort of ordering along (new) attributes.

There are essentially two problems I see: aliasing (which is solved by
referring inodes) and keeping link/unlink sane. (Rename is allowed if the
link/unlink or even unlink/link/unlink (in case of name collision)
sequence would be allowed.)

Keeping link/unlink sane necessitates not splitting off parts of the FS.
For the sake of applications like find, it would also be nice to avoid
creating loops.

Now, if you're splitting off a part of the FS via link/unlink, that means
you are creating a loop, so if you can avoid loops, not splitting off
parts comes free.

Here's a first attempt at a rule to avoid loops. It's not optimal, because
it's not particularly cheap, but maybe it gives someone an idea.

The root of the filesystem gets an attribute X with some numeric value[-].

Every directory gets an attribute X that is at least[+] one more than the
highest value of any parent it has.

When you create a new link to a directory D, that means the directory gets
a new parent P (which actually may already be a parent under another
name). If X(P)<X(D), you're fine.

If X(P)>=X(D), you may be creating a loop. Calculate the new value for
X(D). Recursively propagate through all children of D [*]. During this, if
you hit D, you have a loop; don't allow the link.

As for unlink, if the link to remove points to a directory with more than
one directory hardlink (that is, not counting any "." and ".." links you
might want to simulate for Unix compatibility), then you can do the
unlink. Nothing bad can happen.

[*] This is the costly part.
[+] It doesn't need to be exactly one larger. Anything larger works.
[-] Exact value unimportant.

As I said, it's not cheap, so we want something better. However, I do
believe it's correct.

Hmm. We could work in the opposite direction, giving leaf directories one
value and assigning values from there toward the root. Would essentially
work the same way, except going through the parents, not the children; has
the same type of problem, but one assumes one would have less parents than
children on average.

The important idea is keeping X(parent) > X(child) (or X(parent) <
X(child)) at all times.

Simple algorithm for count updating:

todo = directory about to change it's X attribute

While not empty(todo)
pull candidate from list
see if candidate needs X attribute changed
if yes, add all children (or parents) of candidate to list
if one of those was the original directory, fail the operation

When the list is empty, you're done. Since you started out without a loop,
the new loop must involve your starting directory, so either you'll find
the loop or terminate after looking at a finite number of directories.

Hmm. I just noticed this is very similar to the problem of threading news
articles. I just argued pretty strongly in the "drums" mail standards list
that we should not allow the more-than-one-parent case ;-)

MfG Kai

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/