Re: UTF-8, OSTA-UDF [why?], Unicode, and miscellaneous gibberi=

H. Peter Anvin (hpa@transmeta.com)
27 Aug 1997 09:57:54 GMT


Followup to: <5u0rqn$b2p$1@lap.noris.de>
By author: smurf@lap.noris.de (Matthias Urlichs)
In newsgroup: linux.dev.kernel
>
> hpa@transmeta.com (H. Peter Anvin) writes:
> >
> > Furthermore, in Swedish, "ü" sorts like "y", between "x" and "z", but
> > in German it sorts after "z", "ä" and "ö". In Swedish "w" sorts like
> > "v", between "u" and "x", but in English separately, between "v" and
> > "x".
> >
> You're mostly right, but in German, "ä" sorts like "ae".
>

Oh ... and here is a gem:

In Swedish and Finnish, the end of the alphabet is: ... u v x y z å ä ö
In Danish and Norwegian, it is: ... u v x y z æ ø å

For all of these languages, "ä" and "æ", "ö" and "ø" are treated as
the same letter and sorted as such (they are pronounced identically).
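
A rough sketch of what that means for a sort routine, in Python. This
is only an illustration under simplifying assumptions: the names FOLD,
make_key, swedish_key and danish_key are made up for this example, only
lower-case precomposed letters are handled, and the "w"/"v" and "ü"/"y"
rules are ignored.

```python
import string

# Fold the Danish/Norwegian letters onto their Swedish/Finnish twins,
# so that each pair compares as one and the same letter.
FOLD = str.maketrans({"æ": "ä", "ø": "ö"})

def make_key(tail):
    """Build a sort key for a-z followed by the given alphabet tail."""
    alphabet = string.ascii_lowercase + tail
    rank = {ch: i for i, ch in enumerate(alphabet)}

    def key(word):
        folded = word.lower().translate(FOLD)
        # Characters outside the alphabet sort after everything else.
        return [rank.get(ch, len(alphabet)) for ch in folded]

    return key

swedish_key = make_key("åäö")  # ... x y z å ä ö
danish_key = make_key("äöå")   # ... x y z æ ø å, after folding

words = ["öl", "zebra", "ål", "ære", "ärm"]
print(sorted(words, key=swedish_key))
print(sorted(words, key=danish_key))
```

The same word list comes out in a different order depending on which
tail is used, while "ære" and "ärm" stay next to each other either way.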

In the Swedish variant of ISO 646 these characters are not even in
alphabetical order: they are laid out as ä ö å, in the same code
positions as the Danish and Norwegian æ ø å, so that Danish or
Norwegian text becomes automatically "transliterated". In fact, I do
not believe there has ever been a widely used computer character set
which has all the Swedish characters in alphabetical order; Latin-1
has the order ä å ö. Swedes have had to deal with these sorting
issues for a long time...
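
To make the code point issue concrete, here is a small Python sketch.
The Latin-1 part relies only on the standard codec. The two ISO 646
tables are written down from memory for the three lower-case positions
0x7B-0x7D and should be checked against the standards; the names
ISO646_SE, ISO646_DK and reinterpret are invented for this example.

```python
letters = ["å", "ä", "ö"]  # Swedish alphabetical order

# Sorting by Latin-1 byte value (ä=0xE4, å=0xE5, ö=0xF6) breaks that order.
by_latin1 = sorted(letters, key=lambda ch: ch.encode("latin-1"))
print(by_latin1)

# Assumed lower-case national positions in the ISO 646 variants.
ISO646_SE = {0x7B: "ä", 0x7C: "ö", 0x7D: "å"}
ISO646_DK = {0x7B: "æ", 0x7C: "ø", 0x7D: "å"}

def reinterpret(text, src, dst):
    """Show text encoded with table src as a terminal using dst would."""
    to_code = {ch: code for code, ch in src.items()}
    # Characters outside the national positions pass through unchanged.
    return "".join(dst.get(to_code.get(ch, -1), ch) for ch in text)

print(reinterpret("rødgrød med fløde", ISO646_DK, ISO646_SE))
```

The second print shows the "transliteration": the bytes are untouched,
only the table used to display them differs.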

-hpa

-- 
    PGP: 2047/2A960705 BA 03 D3 2C 14 A8 A8 BD  1E DF FE 69 EE 35 BD 74
    See http://www.zytor.com/~hpa/ for web page and full PGP public key
Always looking for a few good BOsFH.  **  Linux - the OS of global cooperation
        I am Baha'i -- ask me about it or see http://www.bahai.org/