Re: [PATCH v4] checkpatch: fix false positives in REPEATED_WORD warning

From: Aditya
Date: Sat Oct 24 2020 - 05:28:28 EST


On 24/10/20 7:07 am, Joe Perches wrote:
> On Sat, 2020-10-24 at 05:38 +0530, Aditya Srivastava wrote:
>> A quick evaluation on v5.6..v5.8 showed that this fix reduces
>> REPEATED_WORD warnings from 2797 to 907.
>
> How many of these 907 remaining are still false positive?
>
>> A quick manual check found all cases are related to hex output or
>> list command outputs in commit messages.
>
> You mean 1890 of the 2797 are now no longer reported and all 1890
> were false positives yes?
>

Yes. In v5.6..v5.8 there were 2797 REPEATED_WORD warnings; with these
changes that drops to 907, so 1890 are no longer reported.
However, many of the remaining 907 have probably been fixed already by
Dwaipayan's patch, so I'll quote the reduction of 1890 in the commit
message instead.

>>   pos($rawline) = 1 if (!$in_commit_log);
>>   while ($rawline =~ /\b($word_pattern) (?=($word_pattern))/g) {
>>  
>>
>> @@ -3074,6 +3076,17 @@ sub process {
>>   next if ($start_char =~ /^\S$/);
>>   next if (index(" \t.,;?!", $end_char) == -1);
>>  
>>
>> + # avoid repeating hex occurrences like 'ff ff fe 09 ...'
>> + my %allow_repeated_words = (
>> + add => '',
>> + added => '',
>> + bad => '',
>> + be => '',
>> + );
>
> If perl caches this local hash declaration, fine,
> but I think it better to use 'our %allow_repeated_words'
> and move it so it's only declared using the file scope.
>

I ran checkpatch over a few commits and it was working fine. But I'll
move it to file scope, using 'our'. That should work as well.
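
For reference, a rough sketch of the file-scope version (standalone, not
the actual checkpatch.pl change). A 'my %hash = (...)' statement inside
the loop body is executed again on every iteration, whereas a file-scope
'our' hash is built once when the script is loaded. The entries are only
the four visible in the hunk above; is_listed_word() is a hypothetical
helper, and the lc() lookup is my assumption about how the table would
be consulted.

```perl
use strict;
use warnings;

# File-scope table: populated once at load time, not once per match.
# Only the keys matter; the empty-string values are placeholders.
our %allow_repeated_words = (
    add   => '',
    added => '',
    bad   => '',
    be    => '',
);

# Hypothetical helper (not in checkpatch.pl): true if the word is listed.
# Lower-casing before the lookup is an assumption, so 'Add' hits 'add'.
sub is_listed_word {
    my ($word) = @_;
    return exists $allow_repeated_words{lc($word)} ? 1 : 0;
}

for my $word (qw(Add bad the)) {
    printf "%-4s %s\n", $word, is_listed_word($word) ? "listed" : "not listed";
}
```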

>> + if ($first =~ /\b[0-9a-f]{2,}\b/) {
>
> This regex matches only lower case so it wouldn't match "Add".
>
> I think this regex would be clearer using
> /^[0-9a-f]+$/i or /^[A-Fa-f0-9]+$/
>
>

Missed it. Will do.
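
A quick standalone comparison of the two patterns, just to confirm the
difference (the sample words and the matches() helper are mine, for
illustration only; 'fe-09' is an artificial input to show the effect of
anchoring):

```perl
use strict;
use warnings;

# Pattern from the hunk above: lower case only, unanchored, 2+ characters.
my $patch_re = qr/\b[0-9a-f]{2,}\b/;
# Suggested pattern: anchored to the whole word, case-insensitive.
my $suggested_re = qr/^[0-9a-f]+$/i;

# Hypothetical helper: 1 if $word matches $re, else 0.
sub matches {
    my ($re, $word) = @_;
    return $word =~ $re ? 1 : 0;
}

for my $word (qw(ff Add fe-09)) {
    printf "%-6s patch=%d suggested=%d\n",
        $word, matches($patch_re, $word), matches($suggested_re, $word);
}
```

'Add' is matched only by the suggested pattern, because of the /i flag.
'fe-09' is matched only by the original one, since it is unanchored and
finds 'fe' inside the string.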

Thanks
Aditya