Re: A Great Idea (tm) about reimplementing NLS.

From: Lukasz Stelmach
Date: Wed Jun 22 2005 - 04:00:45 EST


Måns Rullgård wrote:

>>>That's exactly how ext3, reiserfs, xfs, jfs, etc. all work. A few
>>>filesystems are tagged as using some specific encoding. If your
>>>filesystem is marked for iso-8859-1, what should a kernel with a
>>>conversion mechanism do if a user tries to name a file ê?
>>
>>Return -ENOENT? I am guessing.
>
>
> Doesn't seem very friendly.

Well, if a user marks her fs as iso-8859-1, that means she doesn't
want it to contain filenames unrepresentable in that particular
codepage. Aleksey started this whole thread because in Russia there are
several different, equally popular encodings for the same alphabet. In
that context his proposal is quite good: develop a general,
fs-independent NLS layer.

>>But please tell me what userland software should do if it runs with
>>the locale set to something.iso-8859-2 and finds ê in the directory?
>
>
> I suppose it will display ÄÅ (0x80 doesn't seem to be a printable
> iso-8859-2 character). You told it to use iso-8859-2 in the first
> place, so what do you expect?

ls(1) displays either an octal escape (\nnn) or ?. Some other mangling
could be used instead, but the octal representation seems to be OK.
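
To make the mangling concrete, here is a minimal sketch in Python of
octal-escaping every byte that is not printable in the user's charset.
The function name mangle() is made up for this mail, it only handles
single-byte charsets, and it is not how ls(1) is actually implemented.

```python
def mangle(name: bytes, charset: str) -> str:
    """Render a raw filename for display in a single-byte charset.

    Bytes that decode to a printable character are shown as-is;
    everything else becomes a three-digit octal escape (\\nnn).
    Hypothetical helper for illustration, single-byte charsets only.
    """
    out = []
    for b in name:
        try:
            ch = bytes([b]).decode(charset)
        except UnicodeDecodeError:
            ch = ""  # byte is undefined in this charset
        if ch and ch.isprintable():
            out.append(ch)
        else:
            out.append("\\%03o" % b)
    return "".join(out)


if __name__ == "__main__":
    # 0x82 is a control code in iso-8859-2, so it gets escaped.
    print(mangle(b"a\x82b", "iso-8859-2"))
```

The name stays unambiguous this way: the user can still see which bytes
are really in the directory entry.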


>>That is the same problem. And for now ISO 8-bit encodings are far
>>more popular and useful with contemporary tools than UTF-8.
>
>
> ISO 8-bit encodings are more common with characters they can
> represent. These are a small minority of all characters commonly
> used.

OK. Let me be more general: fixed-width character encodings. AFAIK
Japanese encodings use 16 bits per character, yet they are still fixed
width.

>>That is why I think the suggestion of a layer in the kernel that
>>would translate filenames from utf-8 stored on the media to e.g.
>>latin2 used by tools is quite reasonable. Especially when there is
>>more than one encoding for a particular language (think Russian,
>>Polish). Even more, with such a facility the transition would be much
>>more graceful, since you could have a utf-8 filesystem first and
>>worry about the other tools later. The filesystem would already be
>>populated with UTF-8 names.
>
>
> How is the kernel to know what to translate to/from?

Mount options. See the letter from Kyle Moffett
<C960854D-7EA5-4DD7-8F2B-7021092CE3EB@xxxxxxx>
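
Roughly what I have in mind, sketched in userspace Python rather than
kernel C. Everything here is an assumption for illustration: the names
to_user() and to_disk(), the user_charset parameter standing in for a
mount option, and the choice of ENOENT for names that cannot be
represented (that one is only the guess made earlier in this thread).

```python
import errno

DISK_CHARSET = "utf-8"  # assumption: names on the media are UTF-8


def _convert(name: bytes, src: str, dst: str) -> bytes:
    """Re-encode a filename, failing if it cannot be represented."""
    try:
        return name.decode(src).encode(dst)
    except UnicodeError as exc:
        # Assumed policy: behave as if the name does not exist.
        raise OSError(errno.ENOENT, "name not representable in " + dst) from exc


def to_user(stored: bytes, user_charset: str) -> bytes:
    """Name as stored on disk -> name as seen by tools (e.g. readdir)."""
    return _convert(stored, DISK_CHARSET, user_charset)


def to_disk(name: bytes, user_charset: str) -> bytes:
    """Name passed in by tools -> name as stored on disk (e.g. creat)."""
    return _convert(name, user_charset, DISK_CHARSET)


if __name__ == "__main__":
    on_disk = "\u017c\u00f3\u0142w".encode("utf-8")  # Polish word, UTF-8
    seen = to_user(on_disk, "iso-8859-2")            # latin2 bytes for tools
    assert to_disk(seen, "iso-8859-2") == on_disk    # round trip is lossless
```

Two users with different charsets would then see the same on-disk name,
each in her own encoding, as long as it is representable in both.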


[ good filesystem for portable media ]
>>That's why IMHO FAT is quite enough for this purpose.
>
> FAT has a bad habit of constantly hammering the same sectors over and
> over. This can wear out cheap flash media in no time at all.

Maybe. I don't think that digital cameras or audio players will support
UDF though. But that is something completely different.

--
Było mi bardzo miło. Trzecia pospolita klęska, [...]
>Łukasz< Już nie katolicka lecz złodziejska. (c)PP
