Re: dir hardlinks (was: Re: fsync on large files)

Alexander Viro (viro@math.psu.edu)
Wed, 24 Feb 1999 19:16:03 -0500 (EST)


On 24 Feb 1999, Kai Henningsen wrote:

> viro@math.psu.edu (Alexander Viro) wrote on 18.02.99 in <Pine.SOL.3.95.990218181307.11831e-100000@weyl.math.psu.edu>:
>
> > Up to Linus, indeed, but IMO it's *the* way to madness.
>
> In cases like this, *if* there is a solution, it usually involves keeping
> some columbus-egg-type invariants - obvious once you know them, far from
> obvious before.
>
> I suspect a real solution to the directory hardlink problem is going to
> involve some sort of ordering along (new) attributes.
>
> There are essentially two problems I see: aliasing (which is solved by
> referring inodes) and keeping link/unlink sane. (Rename is allowed if the
> link/unlink or even unlink/link/unlink (in case of name collision)
> sequence would be allowed.)
>
> Keeping link/unlink sane necessitates not splitting off parts of the FS.
> For the sake of applications like find, it would also be nice to avoid
> creating loops.
>
> Now, if you're splitting off a part of the FS via link/unlink, that means
> you are creating a loop, so if you can avoid loops, not splitting off
> parts comes free.
>
> Here's a first attempt at a rule to avoid loops. It's not optimal, because
> it's not particularly cheap, but maybe it gives someone an idea.

[snip partial topsort w. dynamical updating]

> As I said, it's not cheap, so we want something better. However, I do
> believe it's correct.

Yes, it is. But: (a) It's vulnerable to overflows (two deep trees, move
one to the top of other, then back to original, then the same with roles
exchanged). I.e. it's not an invariant, only semi-invariant. Another thing
being locking and here we may end deep in it.
Moreover, it makes innocent, normal rename O(number of children) +
O(number of children) operation instead of O(1) + O(depth) (IO + RAM
accesses, that is).

> Hmm. We could work in the opposite direction, giving leaf directories one
> value and assigning values from there toward the root. Would essentially
> work the same way, except going through the parents, not the children; has
> the same type of problem, but one assumes one would have less parents than
> children on average.
>
> The important idea is keeping X(parent) > X(child) (or X(parent) <
> X(child)) at all times.

[snip]

> Hmm. I just noticed this is very similar to the problem of threading news
> articles. I just argued pretty strongly in the "drums" mail standards list
> that we should not allow the more-than-one-parent case ;-)

It's *almost* the same problem, except that with the newsfeed you don't
have connectedness and stuff arrives in random order.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/