Re: Transparent compression in the FS

From: David Lang
Date: Thu Oct 16 2003 - 19:06:12 EST


On Thu, 16 Oct 2003, jw schultz wrote:

> Date: Thu, 16 Oct 2003 16:58:11 -0700
> From: jw schultz <jw@xxxxxxxxxx>
> To: linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: Transparent compression in the FS
>
> On Thu, Oct 16, 2003 at 07:30:58PM -0400, Jeff Garzik wrote:
> > jw schultz wrote:
> > >Because each hash algorithm has different pathologies
> > >multiple hashes are generally better than one but their
> > >effective aggregate bit count is less than the sum of the
> > >separate bit counts.
> >
> > I was coming to this conclusion too... still, it's safer simply to
> > handle collisions.
>
> And because, in a filesystem, you have to handle collisions
> you balance the cost of the hash quality against the cost of
> collision. A cheap hash with low colission rate is probably
> better than an expensive hash with a near-zero colission
> rate.

true, but one advantage of useing multiple hashes is that once you already
have the data in the CPU you can probably do multiple cheap hashes for the
same cost as a single one (the same reason that calculating a checksum is
essentially free if you are already having the CPU touch the data) when
the CPU is 12x+ faster then the memory bus you have a fair bit of CPU time
available for each byte of data that you are dealing with.

David Lang
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/