[PATCH] checkpatch: Only encode UTF-8 quoted printable mail headers

From: Geert Uytterhoeven
Date: Wed Jul 18 2018 - 07:35:37 EST


As Perl uses its own internal character encoding, always calling
encode("utf8", ...) on the author name may cause corruption, leading to
an author sign-off mismatch.

This happens in the following cases:
  - If a patch is in ISO-8859, and contains a non-ASCII author name in
    the From: line, it is converted to UTF-8, while the Signed-off-by
    line will still be in ISO-8859.
  - If a patch is in UTF-8, and contains a non-ASCII author name in the
    body (not header) From: line, it is assumed to be encoded in Perl's
    internal character encoding, and converted to UTF-8 incorrectly,
    while the Signed-off-by line will be in real UTF-8.

Fix this by only doing the encode step if the From: line used UTF-8
quoted printable encoding.
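
The two cases above, plus the quoted-printable case, can be reproduced
outside checkpatch with the minimal sketch below. It is an illustration
only: author_old() and author_new() are hypothetical helpers that mirror
the old and new hunk, the name and address are made up, and it assumes
lines are read as raw bytes with no I/O decoding layer. The sign-off
check is simplified to a plain string comparison.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: the logic before this patch (always re-encode).
sub author_old {
    my ($line) = @_;
    return undef unless decode("MIME-Header", $line) =~ /^From:\s*(.*)/;
    my $author = encode("utf8", $1);
    $author =~ s/"//g;
    return $author;
}

# Hypothetical helper: the logic after this patch.
sub author_new {
    my ($line) = @_;
    return undef unless decode("MIME-Header", $line) =~ /^From:\s*(.*)/;
    # Copy $1 first: the next successful match resets the capture variables.
    my $author = $1;
    $author = encode("utf8", $author) if $line =~ /=\?utf-8\?/i;
    $author =~ s/"//g;
    return $author;
}

# Made-up author, in three on-the-wire forms.
my $addr   = "<joerg\@example.com>";
my $utf8   = "J\xC3\xB6rg Example $addr";            # raw UTF-8 bytes
my $latin1 = "J\xF6rg Example $addr";                # raw ISO-8859-1 bytes
my $qp     = "=?UTF-8?Q?J=C3=B6rg_Example?= $addr";  # UTF-8 quoted printable

# Each case: label, From: payload, bytes expected on the Signed-off-by line.
for my $case (["UTF-8 body", $utf8,   $utf8],
              ["ISO-8859-1", $latin1, $latin1],
              ["UTF-8 QP",   $qp,     $utf8]) {
    my ($name, $from, $signoff) = @$case;
    printf "%-11s old: %-8s new: %s\n", $name,
        (author_old("From: $from") eq $signoff ? "match" : "MISMATCH"),
        (author_new("From: $from") eq $signoff ? "match" : "MISMATCH");
}
```

With the old logic, the raw UTF-8 name should come out double-encoded
(0xC3 0xB6 becomes 0xC3 0x83 0xC2 0xB6) and the ISO-8859-1 name should
come out as UTF-8, so neither matches its sign-off. With the new logic,
all three forms should match.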

Reported-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Geert Uytterhoeven <geert+renesas@xxxxxxxxx>
---
Fixes: bc76e3a125b44379 ("checkpatch: warn if missing author Signed-off-by")
in -next

To be folded into "checkpatch: Warn if missing author Signed-off-by" in
Andrew's tree.
---
scripts/checkpatch.pl | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 3d01fee203c4775d..e847377779e7804f 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2523,7 +2523,8 @@ sub process {

 # Check the patch for a From:
 		if (decode("MIME-Header", $line) =~ /^From:\s*(.*)/) {
-			$author = encode("utf8", $1);
+			$author = $1;
+			$author = encode("utf8", $author) if $line =~ /=\?utf-8\?/i;
 			$author =~ s/"//g;
 		}

--
2.17.1