Re: [PATCH] cowlinks v2

From: Jörn Engel
Date: Mon Apr 05 2004 - 03:15:20 EST


On Fri, 2 April 2004 11:28:55 -0800, Ross Biro wrote:
>
> Of course, it gets more interesting if you try to do it at the block
> level instead of at the file level. For ext2, you could just reserve
> a block #, say -1, to mean take the data from the master cow file, and
> anything else is treated normally. You would need a deamon to make
> sure you were still saving space though.

More interesting is correct. I see the advantages and proposed this
myself some time ago, but there are downsides. Basically, for each
block you need additional data, at least a counter telling you the
number of users it currently has. Eats up memory.

If it really has to make sense, you also have to detect duplicated
blocks at runtime. So you need a checksum for each block and a
balanced tree containing those checksums or some other means of quick
access. Eats up 40 bytes (16 checksum, 3*8 tree pointers). With 4k
blocks, that's 1% memory overhead.

Runtime is even worse. Unless the tree fits into memory, you have 1-3
disk reads for each write. Most filesystem developers don't like to
hear such news.

Frankly, the disadvantages still outweigh the advantages today. With
64k blocks, more memory and more diskspace and in comparison less
unique blocks, it might make sense. Later. :)

Jörn

--
Ninety percent of everything is crap.
-- Sturgeon's Law
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/