Re: [PATCH] cowlinks v2

From: Eric W. Biederman
Date: Sat Mar 27 2004 - 18:47:51 EST


Jamie Lokier <jamie@xxxxxxxxxxxxx> writes:

> Eric W. Biederman wrote:
> > There are two possible implementations strategies for implementing
> > cow files. You can either start as Jörn did with hardlinks, or you
> > can start with symlinks.
>
> Symlinks have a _big_ problem: move one, or move it's target, and it
> no longer links to the same file. That's nothing like the
> transparency we need cowlinks to have.

With absolute paths you can move the symlink. But the other side
is still problematic.

> There's a third implementation strategy. Since we're talking in all
> cases about adding a new feature to the underlying filesystem, why not
> implement separate inodes pointing to an underlying shared inode which
> holds the data. (I think it was mentioned earlier in this thread).
>
> The implementation is very similar to symbolic links, but instead of
> having symlink inodes, you have cowlink inodes which point directly to
> another inode and count as references to it.
>
> That provides POSIX semantics and has none of the caveats you and I
> have mentioned for hard links and symbolic links.
>
> Implementation: Creating a cowlink to a non-cowlinked file creates a
> new shared inode by cloning the file's inode, converts the original
> inode to a cowlink-pointer inode, and creates a new cowlink-pointer
> inode.

And to state the obvious, you must make certain your fs has a lot of
inodes because this dramatically changes the inode to fs block ratio.

> This provides 100% semantic equivalence to copying: all operations
> including hard and symbolic links on the resulting cowlink files act
> as if the cowlink operation was a copy, but faster and using less
> space.

I like it. It's not too hard and it yields a very nice result.
syscall wise we would need:
int cowlink(const char *oldpath, const char *newpath);
int cstat(const char *filename, struct stat *buf);

Which of course are non POSIX :)

> > As my memory has it the proto implementation I saw using a stackable
> > fs and cow symlinks was about a 1000 lines. And it was that large
> > because it was complete (i.e. it did the copies including copying
> > directories.)
>
> I can see a stackable fs being useful for live CD distributions, where
> tmpfs or disk hold the modifications stacked over the original
> filesystem.

The practical use was getting root on a network filesystem to scale,
and be manageable. But the problem is essentially the same. Although
to guard against network congestion interacting badly with paging we
were actually thinking about implementing copy on read.

> But mounting an fs for each version of a source tree and keeping track
> of all those mounts: that would be incredibly clumsy to use.
>
> You could implement a stackable fs which is mounted once and provides
> cowlink operations within the fs. That still be a bit clumsy, but not
> so much as to make it unusable for source tree management.

And that is what the prototype we were playing with was doing. You
could not see the original files at all unless you mounted the base
filesystem somewhere outside the cowfs mount. It was actually quite
close to your special cow inode idea except that it stored data in
symlinks instead of inodes. To the extent of freeing files if all of
the cow links to them were removed, and the files were store on the
same filesystem as the cow links.

The addictive thing about the prototype implementation was that you
could do ``ln --cow / /some/other/directory'' and you would have an
atomic snapshot of your filesystem. Definitely not a feature for the
first implementation but certainly something to dream about.

Eric

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/