Re: Transparent compression in the FS

From: Jeff Garzik
Date: Thu Oct 16 2003 - 12:11:30 EST


Andrea Arcangeli wrote:
Hi Jeff,

On Wed, Oct 15, 2003 at 11:13:27AM -0400, Jeff Garzik wrote:

Josh and others should take a look at Plan9's venti file storage method -- archival storage is a series of unordered blocks, all of which are indexed by the sha1 hash of their contents. This magically coalesces all duplicate blocks by its very nature, including the loooooong runs of zeroes that you'll find in many filesystems. I bet savings on "all bytes in this block are zero" are worth a bunch right there.
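
For illustration only, here is a minimal in-memory sketch of the idea of indexing blocks by the SHA-1 of their contents. It is not venti's actual code or on-disk format; the names BlockStore, store_stream and BLOCK_SIZE, and the 8 KB block size, are assumptions made up for this example.

```python
import hashlib

BLOCK_SIZE = 8192  # hypothetical block size, chosen only for the example


class BlockStore:
    """Toy content-addressed store: each block is keyed by its SHA-1 digest."""

    def __init__(self):
        self.blocks = {}  # hex digest -> block contents

    def put(self, block: bytes) -> str:
        key = hashlib.sha1(block).hexdigest()
        # A duplicate block hashes to the same key, so it is stored only once.
        self.blocks.setdefault(key, block)
        return key

    def get(self, key: str) -> bytes:
        return self.blocks[key]


def store_stream(store: BlockStore, data: bytes, block_size: int = BLOCK_SIZE):
    """Split data into fixed-size blocks and return the list of block keys."""
    return [
        store.put(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


if __name__ == "__main__":
    store = BlockStore()
    # Ten all-zero blocks followed by one block of other data.
    data = bytes(10 * BLOCK_SIZE) + b"x" * BLOCK_SIZE
    keys = store_stream(store, data)
    print(len(keys), "blocks written,", len(store.blocks), "blocks stored")
```

The list of keys is all that is needed to reassemble the file, and every all-zero block shares a single stored copy.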


I had a few ideas on the above.

If the zero blocks are the problem, there's a tool called zum that nukes
them and replaces them with holes. I use it sometimes; for example:

andrea@velociraptor:~> dd if=/dev/zero of=zero bs=1M count=100
100+0 records in
100+0 records out
andrea@velociraptor:~> ls -ls zero
102504 -rw-r--r-- 1 andrea andrea 104857600 2003-10-16 18:24 zero
andrea@velociraptor:~> ~/bin/i686/zum zero
zero [820032K] [1 link]
andrea@velociraptor:~> ls -ls zero
0 -rw-r--r-- 1 andrea andrea 104857600 2003-10-16 18:24 zero
andrea@velociraptor:~>

Neat.
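
For readers who have not seen the trick, the following is a rough sketch of how zero blocks can be turned into holes from userspace. It is not zum's implementation (zum works on the file in place, which is not attempted here); it copies to a new file and seeks over all-zero blocks instead of writing them. The function name copy_sparse and the 4096-byte block size are assumptions for the example, and whether the skipped ranges really become holes depends on the filesystem.

```python
import os


def copy_sparse(src_path: str, dst_path: str, block_size: int = 4096) -> int:
    """Copy src to dst, seeking over all-zero blocks instead of writing them.

    Returns the number of blocks that were skipped.
    """
    zero = bytes(block_size)
    skipped = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block == zero[:len(block)]:
                # Advance the file offset without writing any data.
                dst.seek(len(block), os.SEEK_CUR)
                skipped += 1
            else:
                dst.write(block)
        # Set the file length to the current offset, so that a run of
        # zeroes at the end of the file is kept in the file size.
        dst.truncate()
    return skipped
```

Comparing st_size and st_blocks from os.stat() on the two files shows the same information as the ls -ls output above: same length, fewer allocated blocks when the filesystem supports holes.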


Indexing the data by its hash is interesting, but 1) you lose the zerocopy
behaviour for the I/O: it's like checksumming all the data going to
disk, which you normally would never do (except for the tiny files in reiserfs
with tail packing enabled, but that's not bulk I/O), and 2) I wonder how much data
is really duplicated besides the "zero" holes, which are trivially fixable in userspace
(modulo bzImage or similar, where I'm unsure whether the fs code in the bootloader
can handle holes ;).

FWIW archival storage doesn't really care... Since all data written to disk is hashed with SHA1 (sha1 hash == block's unique id), you gain (a) duplicate block coalescing and (b) _real_ data integrity guaranteed, but OTOH, you lose performance and possibly lose zero-copy.

I _really_ like the checksum aspect of Plan9's archival storage (venti).

As Andre H and Larry McVoy love to point out, data isn't _really_ secure until it's been checksummed, and that checksum data is verified on reads. LM has an anecdote (doesn't he always? <g>) about how BitKeeper -- which checksums its data inside the app -- has found data-corrupting kernel bugs, in days long past.
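
A small sketch of what "verified on reads" means when the block id is the SHA-1 of the block. The dict standing in for the disk and the names write_block, read_block and CorruptBlockError are all hypothetical; this is not venti's or BitKeeper's code.

```python
import hashlib


class CorruptBlockError(Exception):
    """Raised when a block no longer matches the hash it was stored under."""


def write_block(disk: dict, block: bytes) -> str:
    # The block id is the SHA-1 of the contents, so it doubles as a checksum.
    key = hashlib.sha1(block).hexdigest()
    disk[key] = block
    return key


def read_block(disk: dict, key: str) -> bytes:
    block = disk[key]
    # Recompute the hash on every read; a mismatch means the data changed
    # somewhere between the write and this read.
    if hashlib.sha1(block).hexdigest() != key:
        raise CorruptBlockError(key)
    return block
```

The extra hash on every read is the performance cost mentioned above; what it buys is that corruption is reported to the caller instead of being returned as good data.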

Jeff

