Re: more git updates..

From: tony . luck
Date: Sun Apr 10 2005 - 07:01:46 EST


>In other words, each "commit" file is very small and cheap, but since
>almost every commit will also imply a totally new tree-file, "git" is
>going to have an overhead of half a megabyte per commit. Oops.
>
>Damn, that's painful. I suspect I will have to change the format somehow.

Having dodged that bullet with the change to make tree files point at
other tree files ... here's another (potential) issue.

A changeset that touches just one file a few levels down from the top
of the tree (say arch/i386/kernel/setup.c) will make six new files in
the git repository (one for the changeset, four tree files, and a new
blob for the new version of the file). More complex changes make more
files ... but say the average is ten new files per changeset since most
changes touch few files. With 60,000 changesets in the current tree, we
will start out our git repository with about 600,000 files. Assuming
the first byte of the SHA1 hash is random, that means an average of 2343
files in each of the objects/xx directories. Give it a few more years at
the current pace, and we'll have over 10,000 files per directory. This
sounds like a lot to me ... but perhaps filesystems now handle large
directories enough better than they used to for this to not be a problem?

Or maybe the files should be named objects/xx/yy/zzzzzzzzzzzzzzzz?

-Tony
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/