Re: UTF-8, OSTA-UDF [why?], Unicode, and miscellaneous gibberis

Richard B. Johnson (root@analogic.com)
Tue, 26 Aug 1997 14:15:44 -0400 (EDT)


On 26 Aug 1997, Svein Erik Brostigen wrote:

>
> Kai Henningsen wrote:
> >
> > SveinErik.Brostigen@ksr.okpost.telemax.no (Svein Erik Brostigen) wrote on
> > 22.08.97 in <"15307 97/08/22
> > 10:34*/c=no/admd=telemax/prmd=okpost/o=ksr/s=Brostigen/g=SveinErik/"@MHS>:
> >
> > > If I understand correctly, all these uses their own type of encoding
> > > today and those are of course not easily converted/adopted from one
> > > language to another. Our main problem is then to find a way to unify
> > > all these encodings and Unicode is an attempt to do so and according
> > > to some here, have failed utterly in doing so.
> >
> > For every solution, some people will claim this. That doesn't mean it's
> > true. In this case, it isn't.
> >
> > MfG Kai
> >
> Well, I will not take any side in this matter. I am just concerned with
> finding the *best* <G> solution to a problem. The temperature is rising with
> a tremendous speed in this thread now...
>
> I, for one, would love to be able to have both Japanese, Korean, Thai and
> Norwegian characters on the screen at the same time and without any tricky
> stuff to make this possible. It should just be as natural as it is to read
> and write this :)
>
> Happy fighting!

Some primitive operating systems provide 'Code Pages', which are
language-specific chunks of code that interface to stdin, stdout, and
stderr. This is 'user mode' stuff.

It wouldn't need any kernel support if the initial 'shell' was a chunk
of code that did the translation on the way to/from the real shell.

One can readily create such a program that creates new fds 0, 1, and 2,
which point to pipes created by that program, before forking the shell.

In this manner, all the terminal I/O for that shell goes through that
parent program. It can do whatever translation it wants. Such a scheme
can be as simple or as complex as one wants. All the extra CPU Cycles are
charged to the Code-Page user. The kernel doesn't suffer. It just deals
with strings of bytes, not caring what they mean.
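To give an idea of how small the translation step itself can be, here is a
rough sketch of one possible filter. It assumes the terminal side speaks
ISO-8859-1 and the shell side wants UTF-8; that pairing and the function name
latin1_to_utf8 are only illustrative, not part of any existing program.

```c
#include <stddef.h>

/*
 * Hypothetical translation step: ISO-8859-1 (Latin-1) bytes in, UTF-8 out.
 * `out` must have room for 2 * n bytes. Returns the number of bytes written.
 */
size_t latin1_to_utf8(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t i, o = 0;

    for (i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            out[o++] = in[i];                  /* ASCII passes through */
        } else {
            out[o++] = 0xC0 | (in[i] >> 6);    /* lead byte: 110000xx */
            out[o++] = 0x80 | (in[i] & 0x3F);  /* continuation: 10xxxxxx */
        }
    }
    return o;
}
```

The parent program would call something like this on every buffer it reads
from the terminal before writing it into the pipe that feeds the shell, with
the reverse mapping on the way back.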

project:*:666:666:Project:/home/proj:/bin/xlate -tISO1234 /bin/bash
                                     |__ program |        |
                                                 |__ type |
                                                          |_shell
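The plumbing for a program like the hypothetical /bin/xlate above could look
roughly like this. It is only a sketch: the name spawn_behind_pipes is made
up, error handling is minimal, and it assumes fds 0, 1 and 2 are already open
in the parent. An interactive shell would probably want a pseudo-terminal
rather than plain pipes, but pipes are enough to show the idea.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Start argv[0] with fds 0, 1 and 2 connected to pipes owned by the caller.
 * On success returns the child's pid; *to_child is the write end feeding the
 * child's stdin, *from_child the read end carrying its stdout and stderr.
 * Assumes fds 0-2 are already open in the calling process.
 */
pid_t spawn_behind_pipes(char *const argv[], int *to_child, int *from_child)
{
    int in_pipe[2], out_pipe[2];
    pid_t pid;

    if (pipe(in_pipe) < 0)
        return -1;
    if (pipe(out_pipe) < 0) {
        close(in_pipe[0]);
        close(in_pipe[1]);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(in_pipe[0]);
        close(in_pipe[1]);
        close(out_pipe[0]);
        close(out_pipe[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: becomes the real shell */
        dup2(in_pipe[0], 0);        /* new stdin  */
        dup2(out_pipe[1], 1);       /* new stdout */
        dup2(out_pipe[1], 2);       /* new stderr */
        close(in_pipe[0]);
        close(in_pipe[1]);
        close(out_pipe[0]);
        close(out_pipe[1]);
        execv(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    close(in_pipe[0]);              /* parent keeps only the far ends */
    close(out_pipe[1]);
    *to_child = in_pipe[1];
    *from_child = out_pipe[0];
    return pid;
}
```

The parent then sits in a loop: read from its own stdin, translate, write to
*to_child; read from *from_child, translate back, write to its own stdout.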

This is already being done to a certain extent by the `bash` shell. If you
execute the following program, it will create a file-name that erases the
screen. I deliberately made it start with a 'z' so you could find the
name. Bash shows all the <ESC> characters as <?>. However, if you type
echo *, (no translation), it will erase your screen.

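#include <stdio.h>

/* 'z', then ESC [ H (cursor home) and ESC [ J (erase display) */
static const char fname[] = {'z', 27, '[', 'H', 27, '[', 'J', 0};

int main(void)
{
    FILE *file;

    /* Create an empty file; the kernel accepts the name as-is. */
    if ((file = fopen(fname, "w")) != NULL)
        fclose(file);
    return 0;
}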

The kernel doesn't care what the file-name characters are. This is the
way it should_be(tm)!
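The display-side masking is equally a user-mode job. A minimal sketch of that
kind of filter follows; the name mask_name is made up and this is not taken
from bash's source. It assumes that anything outside printable ASCII should
be shown as '?', which is exactly the policy a code-page-aware front end
would want to replace with something smarter.

```c
#include <stddef.h>

/*
 * Copy `name` into `out` (capacity `cap`, including the terminating NUL),
 * replacing every byte outside printable ASCII with '?'.
 */
void mask_name(const char *name, char *out, size_t cap)
{
    size_t i = 0;

    if (cap == 0)
        return;
    for (; name[i] != '\0' && i + 1 < cap; i++) {
        unsigned char c = (unsigned char)name[i];
        out[i] = (c >= 0x20 && c < 0x7F) ? (char)c : '?';
    }
    out[i] = '\0';
}
```

Fed the file-name from the program above, this yields "z?[H?[J", which is
safe to print, while the name stored on disk is untouched.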

Cheers,
DJ
Richard B. Johnson
Analogic Corporation
Penguin : Linux version 2.1.51 on an i586 machine (66.15 BogoMips).
Warning : It's hard to stay on the trailing edge of technology.
Linux : Engineering tool
Windows : Typewriter