Re: [PATCH] console UTF-8 fixes

From: Egmont Koblinger
Date: Sat Apr 07 2007 - 13:26:24 EST


On Sat, Apr 07, 2007 at 01:00:48PM +0200, Jan Engelhardt wrote:

Hi,

> Please, no dot, and no inverse color.
> Imagine someone had the following bitmap for <unknown glyph/illegal sequence>:

No dot, I'm already convinced. To clarify the inverse thingy:

This is what the current kernel does:
1) tries to display the desired symbol
2) if it fails, tries to display U+FFFD (which usually looks similar to a
question mark in inverse video)
3) if this fails again then displays a normal '?'
(or a different symbol due to a bug discussed below)

Here's my proposal. This only alters the 3rd step, not the first two:
1) tries to display the desired symbol
2) if it fails, tries to display U+FFFD, still with _normal_ attributes
3) if this fails then display an ascii '?' with inverted attributes

So you won't get "double" inversion. If you do have U+FFFD in your font then
this will introduce no change. If you don't have U+FFFD, you'll see inverse
question marks instead of normal ones.
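
To make the proposed order concrete, here is a minimal user-space sketch in
C. It is not the kernel code and not the patch itself: struct font,
struct cell, font_has() and render() are names made up for this
illustration, and the font is simply modelled as a list of the code points
it has glyphs for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu

/* Hypothetical font model: the code points that have a glyph. */
struct font {
    const uint32_t *codepoints;
    size_t count;
};

/* What ends up on screen: whose glyph is drawn, and with which attributes. */
struct cell {
    uint32_t shown;
    bool inverse;
};

static bool font_has(const struct font *f, uint32_t cp)
{
    for (size_t i = 0; i < f->count; i++)
        if (f->codepoints[i] == cp)
            return true;
    return false;
}

/* Proposed three-step fallback (illustration only). */
static struct cell render(const struct font *f, uint32_t cp)
{
    assert(f != NULL);
    struct cell c = { cp, false };

    /* 1) the desired symbol, normal attributes */
    if (font_has(f, cp))
        return c;

    /* 2) U+FFFD, still with normal attributes */
    if (font_has(f, REPLACEMENT_CHAR)) {
        c.shown = REPLACEMENT_CHAR;
        return c;
    }

    /* 3) ASCII '?' with inverted attributes */
    c.shown = '?';
    c.inverse = true;
    return c;
}
```

In this model the inverse attribute is only ever set in step 3, so a font
that already has U+FFFD never gets it inverted a second time.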


> I blame your latin2 unicode map. (See above about 'Ã'.)

There's nothing wrong with my latin2 unicode map. I've located the part
_in the kernel_ that displays a wrong glyph and changed it to follow the
algorithm I've outlined. The old code just uses "the glyph at that code
position within the glyph table" as a fallback, which might be okay in 8-bit
mode (and I haven't modified the behavior in that case), but I got rid of
this behavior in UTF-8 mode since it's definitely a fault in the world of
Unicode.
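
A rough sketch in C of the distinction I mean. Again this is an
illustration, not the actual kernel code: enum console_mode,
positional_fallback() and NO_GLYPH are hypothetical names, and the font
layout mentioned in the comment is an assumption made only for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_GLYPH (-1)

enum console_mode { MODE_8BIT, MODE_UTF8 };

/*
 * Called when the unicode map has no entry for cp.
 *
 * 8-bit mode: fall back to the glyph sitting at slot cp in the glyph
 * table, if there is such a slot (behavior left unchanged).
 *
 * UTF-8 mode: never use the slot number as a substitute. Assuming, for
 * example, a font laid out in ISO-8859-2 order, slot 0xFB holds U+0171
 * (u with double acute), so using it for U+00FB (u with circumflex)
 * would put a different accent on the letter.
 */
static int positional_fallback(enum console_mode mode, uint32_t cp,
                               size_t table_size)
{
    if (mode == MODE_8BIT && cp < table_size)
        return (int)cp;
    return NO_GLYPH;
}
```

When this returns NO_GLYPH, the caller would move on to the U+FFFD and then
the '?' step described above.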

> It should perhaps display a regular 'u' if it cannot display 'Ã',

I rather think it should display U+FFFD but YMMV.

> but definitely not 'Ã' (which is not called a double accent, btw).

This is not the character I've been talking about, I actually _did_ talk
about u with double acute accent (ű - you might not have seen this character
so far, AFAIK it's only used in Hungarian, no other languages). But we agree
that the kernel definitely shouldn't display a character with a different
accent on it. This is one of the bugs my patch addresses.


bye,

Egmont