Re: [PATCH v2 00/40] Use ASCII subset instead of UTF-8 alternate symbols

From: David Woodhouse
Date: Wed May 12 2021 - 15:35:24 EST


Your title 'Use ASCII subset' is now at least a bit *closer* to
describing what the patches are actually doing, but it's still a bit
misleading because you're only doing it for *some* characters.

And the wording is still indicative of a fundamentally *misguided*
motivation for doing any of this. Your commit comments should be about
fixing a specific thing, nothing to do with "use ASCII subset", which
is pointless in itself.

On Wed, 2021-05-12 at 14:50 +0200, Mauro Carvalho Chehab wrote:
> Such conversion tools - plus some text editor like LibreOffice or similar - have
> a set of rules that turns some typed ASCII characters into UTF-8 alternatives,
> for instance converting commas into curly commas and adding non-breakable
> spaces. All of those are meant to produce better results when the text is
> displayed in HTML or PDF formats.

And don't we render our documentation into HTML or PDF formats? Are
some of those non-breaking spaces not actually *useful* for their
intended purpose?

> While it is perfectly fine to use UTF-8 characters in Linux, and especially in
> the documentation, it is better to stick to the ASCII subset in this
> particular case, for a couple of reasons:
>
> 1. it makes life easier for tools like grep;

Barely, as noted, because of things like line feeds.
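
To make that concrete, here is a minimal Python sketch. The sample strings
and the helper name phrase_pattern are made up for illustration and are not
taken from the patches. It shows that a literal search for a two-word phrase
already misses a match that is wrapped across a line feed, exactly as it
misses one joined by a non-breaking space (U+00A0), and that a
whitespace-tolerant pattern finds both.

```python
import re

def phrase_pattern(phrase):
    """Compile a regex matching `phrase` with any whitespace run between words."""
    words = phrase.split()
    # For str patterns, \s matches Unicode whitespace, including U+00A0.
    return re.compile(r"\s+".join(re.escape(w) for w in words))

# Hypothetical sample texts, for illustration only.
wrapped = "use the ASCII\nsubset here"    # phrase broken by a line feed
nbsp = "use the ASCII\u00a0subset here"   # phrase joined by a non-breaking space

pat = phrase_pattern("ASCII subset")
for text in (wrapped, nbsp):
    # First value: literal substring search; second: whitespace-tolerant search.
    print("ASCII subset" in text, bool(pat.search(text)))
```

In both cases the literal search fails and the tolerant one succeeds, so
anyone searching for phrases in wrapped text needs the tolerant form
regardless of whether non-breaking spaces are present.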

> 2. they are easier to edit with some commonly used text/source
> code editors.

That is nonsense. Any but the most broken and/or anachronistic
environments and editors will be just fine.
