Re: Transparent compression in the FS

From: Larry McVoy
Date: Thu Oct 16 2003 - 12:31:12 EST


On Wed, Oct 15, 2003 at 11:13:27AM -0400, Jeff Garzik wrote:
> Josh and others should take a look at Plan9's venti file storage method
> -- archival storage is a series of unordered blocks, all of which are
> indexed by the sha1 hash of their contents. This magically coalesces
> all duplicate blocks by its very nature, including the loooooong runs of
> zeroes that you'll find in many filesystems. I bet savings on "all
> bytes in this block are zero" are worth a bunch right there.

The only problem with this is that you can get false positives. Val Hensen
recently wrote a paper about this. It's really unlikely that you get false
positives but it can happen and it has happened in the field.
--
---
Larry McVoy lm at bitmover.com http://www.bitmover.com/lm
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/