[PATCH v2] checkpatch: Only encode UTF-8 quoted printable mail headers

From: Geert Uytterhoeven
Date: Wed Jul 18 2018 - 10:53:03 EST


As Perl uses its own internal character encoding, always calling
encode("utf8", ...) on the author name may corrupt it, leading to
an author signoff mismatch.

This happens in the following cases:
  - If a patch is in ISO-8859, and contains a non-ASCII author name in
    the From: line, it is converted to UTF-8, while the Signed-off-by
    line will still be in ISO-8859.
  - If a patch is in UTF-8, and contains a non-ASCII author name in the
    body (not header) From: line, it is assumed to be encoded in Perl's
    internal character encoding, and converted to UTF-8 incorrectly,
    while the Signed-off-by line will be in real UTF-8.
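
Both cases can be reproduced outside checkpatch. The sketch below is only
an illustration: the author name is a made-up example, ISO-8859-1 is
assumed for the first case, and the decode("MIME-Header", ...) step is
left out because it passes raw non-ASCII bytes through unchanged.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical author name containing u-umlaut (U+00FC).

# Case 1: ISO-8859-1 patch, so the name arrives as raw Latin-1 bytes.
my $latin1_bytes = "J\xfcrgen";
# Perl treats each byte as a code point, so encode() emits UTF-8 bytes ...
my $from_author = encode("utf8", $latin1_bytes);    # "J\xc3\xbcrgen"
# ... while the Signed-off-by line still holds the raw Latin-1 bytes.
my $case1_mismatch = ($from_author ne $latin1_bytes);

# Case 2: UTF-8 patch, so the name arrives as raw UTF-8 bytes.
my $utf8_bytes = "J\xc3\xbcrgen";
# encode() again treats each byte as a code point and encodes it a
# second time (double encoding).
my $double = encode("utf8", $utf8_bytes);           # "J\xc3\x83\xc2\xbcrgen"
my $case2_mismatch = ($double ne $utf8_bytes);

print "case 1: From: and Signed-off-by differ\n" if $case1_mismatch;
print "case 2: From: and Signed-off-by differ\n" if $case2_mismatch;
```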

Fix this by only doing the encode step if the From: line used UTF-8
quoted printable encoding, i.e. if the raw line contains a "=?utf-8?"
MIME encoded-word marker.
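
The patched logic can be exercised in isolation with a small helper.
This is a sketch, not code from checkpatch.pl: author_from_line() is a
hypothetical name and the sample From: lines are invented for
illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper mirroring the patched From: handling.
sub author_from_line {
    my ($line) = @_;
    return unless decode("MIME-Header", $line) =~ /^From:\s*(.*)/;
    my $author = $1;
    # Only a MIME encoded-word was decoded into Perl characters, so only
    # then does the name need to be turned back into UTF-8 bytes.
    $author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i);
    $author =~ s/"//g;
    return $author;
}

# Encoded-word header: decoded to characters, then encoded to UTF-8 bytes.
my $from_header = author_from_line("From: =?UTF-8?q?J=C3=BCrgen?=");
# Raw UTF-8 bytes in a body From: line: left untouched.
my $from_body = author_from_line("From: J\xc3\xbcrgen");
print "authors match\n" if $from_header eq $from_body;
```

With this check, raw ISO-8859 bytes are also left as they are, so they
still compare equal to the Signed-off-by line from the same file.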

Reported-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
---
Fixes: bc76e3a125b44379 ("checkpatch: warn if missing author Signed-off-by")
in -next

To be folded into "checkpatch: Warn if missing author Signed-off-by" in
Andrew's tree.

v2:
- Add parentheses.
---
scripts/checkpatch.pl | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 3d01fee203c4775d..017253b1df513bcb 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2523,7 +2523,8 @@ sub process {

 # Check the patch for a From:
 		if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
-			$author = encode("utf8", $1);
+			$author = $1;
+			$author = encode("utf8", $author) if ($line =~ /=\?utf-8\?/i);
 			$author =~ s/"//g;
 		}

--
2.17.1