Re: Mercurial 0.4b vs git patchbomb benchmark

From: Andrea Arcangeli
Date: Fri Apr 29 2005 - 15:30:24 EST


On Thu, Apr 28, 2005 at 11:01:57PM -0700, Matt Mackall wrote:
> change nodes so you've got to potentially traverse all the commits to
> reconstruct a file's history. That's gonna be O(top-level changes)
> seeks. This introduces a number of problems:
>
> - no way to easily find previous revisions of a file
> (being able to see when a particular change was introduced is a
> pretty critical feature)
> - no way to do bandwidth-efficient delta transfer
> - no way to do efficient delta storage
> - no way to do merges based on the file's history[1]

And IMHO also no-way to implement a git-on-the-fly efficient network
protocol if tons of clients connects at the same time, it would be
dosable etc... At the very least such a system would require an huge
amount of ram. So I see the only efficient way to design a network
protocol for git not to use git, but to import the data into mercurial
and to implement the network protocol on top of mercurial.

The one downside is that git is sort of rock solid in the way it stores
data on disk, it makes rsync usage trivial too, the git fsck is reliable
and you can just sign the hash of the root of the tree and you sign
everything including file contents. And of course the checkin is
absolutely trivial and fast too.

With a more efficient diff-based storage like mercurial we'd be losing
those fsck properties etc.. but those reliability properties don't worth
the network and disk space they take IMHO, and the checkin time
shouldn't be substantially different (still running in O(1) when
appending at the head). And we could always store the hash of the
changeset, to give it some basic self-checking.

I give extreme value in a SCM in how efficiently it can represent the
whole tree for both network downloads and backups too. Being able to
store the whole history of 2.5 in < 100M is a very valuable feature
IMHO, much more valuable than to be able to sign the root.

Also don't get me wrong, I'm _very_ happy about git too, but I just
happen to prefer mercurial storage (I would never use git for anything
but the kernel, just like I wasn't using arch for similar reasons).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/