[PATCH v3 0/2] get_maintainer: correctly parse UTF-8 encoded names in files

From: Alvin Šipraga
Date: Mon Dec 18 2023 - 20:25:58 EST


Signed-off-by: Alvin Šipraga <alsi@xxxxxxxxxxxxxxx>
---
Changes in v3:
- add more rationale for opening everything with UTF-8 encoding
- fix a separate issue identified when introducing UTF-8 names, namely
  that they would not get escaped with quotes as expected, due to Perl's
  default behaviour being to match UTF-8 characters with \w
- add a second patch to fix an unrelated issue mentioned by Joe whereby
  a mailing list might get the display name '-'
- Link to v2: https://lore.kernel.org/r/20231214-get-maintainers-utf8-v2-1-b188dc7042a4@xxxxxxxxxxxxxxx
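
To illustrate the quoting issue described in the v3 notes, here is a minimal
sketch in Python rather than Perl. It relies on the fact that Python 3's re
module, like Perl on decoded strings, lets \w match non-ASCII letters by
default. needs_quotes() is a hypothetical helper written for this example;
it is not code from get_maintainer.pl, and the character class is an
assumption about what a "must quote" check might look like.

```python
import re


def needs_quotes(name: str, ascii_only: bool) -> bool:
    """Hypothetical check: does a display name contain characters
    other than word characters, spaces, or '-'?"""
    # re.ASCII restricts \w to [A-Za-z0-9_]; without it, \w also
    # matches non-ASCII letters such as 'Š'.
    flags = re.ASCII if ascii_only else 0
    return re.search(r"[^\w \-]", name, flags) is not None


name = "Alvin Šipraga"
print(needs_quotes(name, ascii_only=False))  # False: 'Š' counts as \w
print(needs_quotes(name, ascii_only=True))   # True: 'Š' is outside ASCII \w
```

With a Unicode-aware \w the name looks free of special characters and would be
left unquoted; restricting \w to ASCII flags it for quoting.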

Changes in v2:
- use '\p{L}' rather than '\p{Latin}', so that matching is even more
  inclusive (i.e. match also Greek letters, CJK, etc.)
- fix commit message to refer to tools mailing list, not b4 mailing list
- Link to v1: https://lore.kernel.org/r/20231014-get-maintainers-utf8-v1-1-3af8c7aeb239@xxxxxxxxxxxxxxx

---
Alvin Šipraga (2):
      get_maintainer: correctly parse UTF-8 encoded names in files
      get_maintainer: remove stray punctuation when cleaning file emails

 scripts/get_maintainer.pl | 48 +++++++++++++++++++++++++++--------------------
 1 file changed, 28 insertions(+), 20 deletions(-)
---
base-commit: 2cf4f94d8e8646803f8fb0facf134b0cd7fb691a
change-id: 20231014-get-maintainers-utf8-32c65c4d6f8a