Re: [PATCH] checkpatch: Also accept commit ids with 13-40 chars of sha1

From: Niklas Söderlund
Date: Sun Dec 10 2023 - 16:56:49 EST


Hi Geert,

Thanks for your patch.

On 2023-12-05 20:34:16 +0100, Geert Uytterhoeven wrote:
> Documentation/dev-tools/checkpatch.rst says:
>
> **GIT_COMMIT_ID**
> The proper way to reference a commit id is:
> commit <12+ chars of sha1> ("<title line>")
>
> However, scripts/checkpatch.pl has two different checks: one warning
> check accepting 12 characters exactly:
>
> # Check Fixes: styles is correct
> Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")'
>
> and a second error check accepting 12-40 characters:
>
> # Check for git id commit length and improperly formed commit descriptions
> # A correctly formed commit description is:
> # commit <SHA-1 hash length 12+ chars> ("Complete commit subject")
> Please use git commit description style 'commit <12+ chars of sha1>
>
> Hence patches containing commit ids with more than 12 characters are
> flagged by checkpatch, and sometimes rejected by maintainers or
> reviewers.

I agree, it's not nice that the two commit id checks do not agree on
length and that this should likely be aligned.

To clarify, the two commit id checks in checkpatch.pl do not conflict
with each other as one check Fixes tags while the other checks the
format when referring to commits in general, right?

The intention when adding the check for Fixes tags was to conform with
what is documented in [1]. And to enforce a minimum number of characters
as there where issues where too few where used.

>
> Fix this by aligning the first check with the second check, and with the
> documentation.

I think this change is a good idea, but the documentation should also be
updated.

1. https://www.kernel.org/doc/html/latest/process/submitting-patches.html#describe-your-changes

>
> Fixes: bd17e036b495bebb ("checkpatch: warn for non-standard fixes tag style")
> Signed-off-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
> ---
> Perhaps the time is ripe to increase the minimum from 12 to 16 chars
> (in a follow-up patch)?
>
> Running git-unique-abbrev[1] on a tree containing v6.7-rc3 and all
> stable releases gives:
>
> 12000853 objects
> 4: 12000853 / 65536
> 5: 12000717 / 1048423
> 6: 6130888 / 2703295
> 7: 525025 / 260563
> 8: 33736 / 16861
> 9: 2106 / 1053
> 10: 160 / 80
> 11: 10 / 5
> 12: 0 / 0
> 21cf4d54d3c702ac20c6747fa6d4f64dee07dd11
> 21cf4d54d3ced8a3e752030e483d72997721076d
> 8a048bbf89528d45c604aed68f7e0f0ef957067d
> 8a048bbf895b1359e4a33b779ea6d7386cfe4de2
> d3ac4e475103c4364ecb47a6a55c114d7c42a014
> d3ac4e47510ec0753ebe1e418a334ad202784aa8
> d597639e2036f04f0226761e2d818b31f2db7820
> d597639e203a100156501df8a0756fd09573e2de
> ef91b6e893a00d903400f8e1303efc4d52b710af
> ef91b6e893afc4c4ca488453ea9f19ced5fa5861
>
> 12000853 is still smaller than sqrt(16^12) = 16777216, but the safety
> margin is getting smaller. E.g. my main work tree already contains
> almost 18M objects. Hence the Birthday Paradox states that collisions
> of 12 char sha1 values are imminent.
>
> Note that we standardized on 12 chars in commit d311cd44545f2f69
> ("checkpatch: add test for commit id formatting style in commit log") in
> v3.17. For comparison, running git-unique-abbrev on a tree with all
> (upstream + stable) releases from that era gives:
>
> 4052307 objects
> 4: 4052307 / 65536
> 5: 3966948 / 940963
> 6: 869691 / 417363
> 7: 61208 / 30523
> 8: 3979 / 1989
> 9: 258 / 129
> 10: 24 / 12
> 11: 6 / 3
> 12: 0 / 0
> 21cf4d54d3c702ac20c6747fa6d4f64dee07dd11
> 21cf4d54d3ced8a3e752030e483d72997721076d
> d597639e2036f04f0226761e2d818b31f2db7820
> d597639e203a100156501df8a0756fd09573e2de
> ef91b6e893a00d903400f8e1303efc4d52b710af
> ef91b6e893afc4c4ca488453ea9f19ced5fa5861
>
> So the number of objects increased threefold during the last 9 years.
>
> Thanks for your comments!
>
> [1] https://blog.cuviper.com/2013/11/10/how-short-can-git-abbreviate/
> ---
> scripts/checkpatch.pl | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index 25fdb7fda1128aa9..a4e178a68f6d1d5f 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -3209,7 +3209,7 @@ sub process {
> $tag_case = 0 if $tag eq "Fixes:";
> $tag_space = 0 if ($line =~ /^fixes:? [0-9a-f]{5,} ($balanced_parens)/i);
>
> - $id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12}$/i);
> + $id_length = 0 if ($orig_commit =~ /^[0-9a-f]{12,40}$/i);
> $id_case = 0 if ($orig_commit !~ /[A-F]/);
>
> # Always strip leading/trailing parens then double quotes if existing
> @@ -3226,7 +3226,7 @@ sub process {
> if ($ctitle ne $title || $tag_case || $tag_space ||
> $id_length || $id_case || !$title_has_quotes) {
> if (WARN("BAD_FIXES_TAG",
> - "Please use correct Fixes: style 'Fixes: <12 chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
> + "Please use correct Fixes: style 'Fixes: <12+ chars of sha1> (\"<title line>\")' - ie: 'Fixes: $cid (\"$ctitle\")'\n" . $herecurr) &&
> $fix) {
> $fixed[$fixlinenr] = "Fixes: $cid (\"$ctitle\")";
> }
> --
> 2.34.1
>

--
Kind Regards,
Niklas Söderlund