Re: [PATCH] checkpatch.pl: Relax commit ID check to allow more than 12 chars

From: Linus Torvalds
Date: Sun Feb 05 2023 - 15:38:48 EST


On Sat, Feb 4, 2023 at 8:58 AM Joe Perches <joe@xxxxxxxxxxx> wrote:
>
> btw: it looks like 12 will still be sufficient for a while yet

To be honest, that's actually closer to the 12-digit limit than I was expecting.

The git heuristics are pretty good, and it sounds like 13 hex digits
is already starting to happen, so maybe we should relax things.

That said, "up to 16" does sound questionable.

We're talking exponential growth by number of digits, so saying "let's
go from 12 to 16" is a *huge* jump. And I'd like to keep people doing
fewer digits just because these things get used in free-flowing prose,
and we have the whole line wrapping issue and things just get uglier
at some point.
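
To put a number on that jump, here is a minimal sketch of the arithmetic.
The helper name abbrev_space is made up for illustration; the only fact
used is that each hex digit has 16 possible values.

```python
def abbrev_space(digits: int) -> int:
    """Number of distinct hex abbreviations of the given length."""
    return 16 ** digits


# Compare each candidate length against the current 12-digit convention.
for digits in (12, 13, 14, 16):
    factor = abbrev_space(digits) // abbrev_space(12)
    print(f"{digits} digits: {abbrev_space(digits):.3e} abbreviations, "
          f"{factor}x the 12-digit space")
```

Going from 12 to 14 digits multiplies the space by 256; going to 16
multiplies it by 65536.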

So we're closing in on two decades of git use, and we are not that far
from having 10 million objects in our git database (for the base
tree). Sure, that's a lot of objects, but to a close approximation
the object count grows _largely_ linearly with time.

Considering that git is actually pretty good at handling the ambiguous
case anyway, I'd say go up at *most* to 14 digits.
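
A rough way to see why 13 or 14 digits leaves plenty of headroom is the
usual birthday-style estimate. This is only a sketch: it assumes object
IDs behave like uniformly random hex strings, takes the 10 million
figure from above as the object count, and the function name
expected_colliding_pairs is hypothetical.

```python
def expected_colliding_pairs(num_objects: int, digits: int) -> float:
    """Approximate expected number of object pairs that share the same
    `digits`-long hex prefix, assuming uniformly random object IDs."""
    pairs = num_objects * (num_objects - 1) / 2
    return pairs / (16 ** digits)


NUM_OBJECTS = 10_000_000  # "not that far from" this, per the text above

for digits in (12, 13, 14):
    estimate = expected_colliding_pairs(NUM_OBJECTS, digits)
    print(f"{digits} digits: ~{estimate:.4f} expected colliding pairs")
```

Under those assumptions the estimate at 12 digits is well below one
colliding pair across the whole database, and each extra digit divides
it by 16.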

I just checked my current tip-of-tree, and I needed to go down to
*five* digits to have git start complaining about ambiguous object
names:

[torvalds@ryzen linux]$ git show c608f
error: short object ID c608f is ambiguous
hint: The candidates are:
hint: c608f6b58f30 commit 2023-02-05 - Merge tag 'usb-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
hint: c608f14fb0ee tree
hint: c608fccf692f tree
hint: c608f76e5753 blob
hint: c608fa168fe6 blob
hint: c608fd96771c blob

and maybe that was pure luck, but looking at your stats it does look
like "6 digits is still unique for most objects". I really think that
we're better off with shorter and visually easier numbers than going
overboard.

Note above how even with just 5 digits, it's still unique among actual
commits, so from a *practical* standpoint even five digits are fine
(because normal human communication doesn't talk about the blob or
tree objects).
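
The point about commits can be checked mechanically against the
candidate list git printed above. The following is a small sketch, not
anything git itself provides: it parses the "hint:" lines as plain text
and groups the candidates by object type. The function name and the
set of accepted type names are assumptions made for this example.

```python
from collections import defaultdict

# Candidate lines copied from the git output above.
HINTS = """\
hint: The candidates are:
hint: c608f6b58f30 commit 2023-02-05 - Merge tag 'usb-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
hint: c608f14fb0ee tree
hint: c608fccf692f tree
hint: c608f76e5753 blob
hint: c608fa168fe6 blob
hint: c608fd96771c blob
"""

OBJECT_TYPES = {"commit", "tree", "blob", "tag"}


def candidates_by_type(hint_text: str) -> dict:
    """Group the abbreviated IDs in git's ambiguity hints by object type."""
    grouped = defaultdict(list)
    for line in hint_text.splitlines():
        parts = line.split()
        # Expect: "hint:", <abbreviated id>, <object type>, [description...]
        if len(parts) >= 3 and parts[0] == "hint:" and parts[2] in OBJECT_TYPES:
            grouped[parts[2]].append(parts[1])
    return dict(grouped)


by_type = candidates_by_type(HINTS)
for obj_type, ids in by_type.items():
    print(f"{obj_type}: {len(ids)} candidate(s): {', '.join(ids)}")
```

Of the six candidates for c608f, only one is a commit; the rest are
trees and blobs.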

If this was some case of "when you hit the limit, things break
horribly badly", that would be one thing. But that not even being true
means that things like line wrapping and just visuals matter.

So I think 12 digits likely still work just fine for another decade or
two, but yes, we're at the point where we might want to start thinking
about 13 or 14.

Linus