Re: [Patch] Support UTF-8 scripts

From: "Martin v. Löwis"
Date: Sun Aug 14 2005 - 17:11:17 EST


Lee Revell wrote:
> For strings, of course. But there's no need for UTF-8 operators.

Indeed - this is the main rationale for the patch, of course. People
want to write non-ASCII in script primarily in string literals,
and (perhaps even more often) in comments. Now, for comments, it
wouldn't really matter that the interpreter knows what the encoding
is - but the editor would have to know, and the UTF-8 signature
primarily helps the editor (*).

Then we are back to the rationale for this patch: if you need the
UTF-8 signature to reliably identify the script as being UTF-8
encoded, you then currently cannot easily run it as a script through
binfmt_script, as that code requires a script to start with #!.

Regards,
Martin

(*) As I said before: atleast for Python, the UTF-8 signature also
has syntactic meaning. It is allowed at the beginning of a file
as an addition to the language syntax, and it tells the interpreter
that Unicode literals (usually represented internally as UCS-2)
are represented as UTF-8 in the source code.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/