[PATCH] UTF-8ifying the kernel source

From: David Eger
Date: Thu Mar 04 2004 - 05:07:43 EST




http://www.yak.net/random/linux-2.6.3-utf8-cleanup-auto.diff.bz2

Here you will find the first of several patches to convert the kernel
source from ISO Latin-1 to UTF-8. I'm still working on the files that
didn't auto-convert easily; comments welcome ;-)

First, some statistics!

In Linux 2.6.3, there are:
15860 clean 7-bit ASCII files
274 text files that are not 7-bit clean

38 of these 274 files are not auto-convertible -- either they are not ISO
Latin-1 or the high octets appear within the actual code (not comments).
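
For anyone wanting to reproduce this kind of triage, here is a rough
sketch in Python. It is not the script used for the numbers above; the
function name and the category labels are made up for illustration.
Every byte sequence decodes as ISO Latin-1, so "not Latin-1" can only be
guessed at: this sketch assumes that bytes in the C1 control range
(0x80-0x9F) are a warning sign. It does not try to tell comments from
code.

```python
def classify(data: bytes) -> str:
    """Roughly sort a file's bytes into ASCII / plausible Latin-1 / suspect."""
    high = [b for b in data if b >= 0x80]
    if not high:
        return "ascii"  # clean 7-bit ASCII
    # 0x80-0x9F are C1 control codes in ISO Latin-1. They almost never
    # occur in real Latin-1 text, so flag the file for a manual look.
    if any(b <= 0x9F for b in high):
        return "suspect"
    return "latin1-plausible"
```

A file that comes back "latin1-plausible" still needs a human to check
whether the high octets sit in comments or in actual code.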

This first patch applies to help files, documentation, and comments which
are trivially correct ISO Latin-1 => UTF-8 conversions. The work I have
left to do is summarized below.
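
The conversion itself is mechanical. A minimal sketch, assuming the
input really is ISO Latin-1 (the helper name is hypothetical, and this
is not necessarily how the patch was generated):

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO Latin-1 bytes as UTF-8.

    Bytes below 0x80 pass through unchanged; each byte from 0x80 to
    0xFF becomes a two-byte UTF-8 sequence. Running this twice on the
    same data double-encodes it, so only apply it to unconverted files.
    """
    return data.decode("latin-1").encode("utf-8")
```

Since pure ASCII is left untouched, only the 274 non-7-bit-clean files
can change at all.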

--dte


Un-needed/wrong non-ASCII characters (these fixes will form patch 2)
====================================================================
drivers/video/amifb.c - +- sign?
Documentation/i2c/i2c-protocol - NBSP, but why?
arch/i386/kernel/cpu/cyrix.c - NBSP, but why?
arch/v850/kernel/as85ep1.ld - WTF? comments in some random charset...
drivers/char/ftape/lowlevel/fdc-isr.c - WTF? shit in the comments
include/asm-m68k/atarihw.h - 0x94 - "cancel character"?
include/asm-m68k/atariints.h - 0x94 - "cancel character"?
include/linux/802_11.h - why the non-standard dash?
scripts/docproc.c - why the bizarre spelling for specific?
fs/ext2/xattr.c - bad ASCII art
fs/ext3/xattr.c - bad ASCII art
fs/afs/vlclient.h - a degrees sign, but why?
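
To see which odd characters a file contains, something like the
following sketch can help. It reads each high byte as ISO Latin-1 and
looks up its Unicode name; the function name and the "<C1 control>"
placeholder are my own choices for illustration.

```python
import unicodedata

def describe_high_bytes(data: bytes) -> dict:
    """Map each distinct byte >= 0x80 to its name when read as ISO Latin-1."""
    names = {}
    for b in sorted(set(data)):
        if b >= 0x80:
            # Latin-1 byte values equal Unicode code points U+0080..U+00FF.
            # C1 controls (0x80-0x9F) have no Unicode name, hence the default.
            names[b] = unicodedata.name(chr(b), "<C1 control>")
    return names
```

Under this reading, 0xA0 is reported as NO-BREAK SPACE, 0xB0 as DEGREE
SIGN, and 0xB1 as PLUS-MINUS SIGN, while 0x94 falls into the unnamed
control range.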

Box-drawing ASCII art (these fixes will form patch 3)
=====================================================
Documentation/networking/tms380tr.txt - DOS-style ASCII art
arch/arm/nwfpe/fpopcode.h - line-drawing characters

C strings - (what to do?)
=========================
arch/ppc/platforms/proc_rtas.c - a C string containing "degrees"
arch/ppc64/kernel/rtas-proc.c - a C string containing "degrees"
drivers/macintosh/therm_adt7467.c - degrees, MODULE_PARM_DESC(),
and a C string
drivers/mtd/chips/cfi_probe.c - C strings
drivers/net/wireless/netwave_cs.c - C strings
drivers/scsi/dc395x.c - C strings
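
The reason C strings are a separate problem: converting the source file
changes the bytes the string emits at run time. The sketch below shows
this for the degree sign, and shows one possible way out, which is to
spell the bytes as octal escapes so the source stays ASCII whichever
encoding is chosen. This is only an illustration of the trade-off, not
a decision; c_escape is a hypothetical helper.

```python
degree = "\u00b0"                      # DEGREE SIGN
as_latin1 = degree.encode("latin-1")   # one byte: 0xB0
as_utf8 = degree.encode("utf-8")       # two bytes: 0xC2 0xB0

def c_escape(data: bytes) -> str:
    """Render bytes as the body of a C string literal, ASCII only.

    Octal escapes are used because C limits them to three digits, so a
    following digit or letter cannot be swallowed into the escape.
    """
    out = []
    for b in data:
        ch = chr(b)
        if 0x20 <= b < 0x7F and ch not in '"\\':
            out.append(ch)                # printable ASCII, keep as-is
        else:
            out.append("\\%03o" % b)      # everything else as \ooo
    return "".join(out)
```

For example, c_escape(b"25\xb0C") yields 25\260C, which keeps the
Latin-1 output, while c_escape(as_utf8) yields \302\260 for UTF-8
output.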

Other - (I'd convert it, but...)
================================
drivers/pci/pci.ids - I don't know what program processes this...
drivers/ieee1394/oui.db - I don't know what program processes this...

Machine / charset specific shite - (does anything need to be done?)
===================================================================
arch/m68k/hp300/hp300map.map - maps to "char"s.. grr
drivers/char/defkeymap.map - a map file... maps to "char"s.. grr
drivers/char/qtronixmap.c_shipped - maps to "char"s.. grr
drivers/char/qtronixmap.map - maps to "char"s.. grr
drivers/tc/lk201-map.c_shipped - maps to "char"s.. grr
drivers/tc/lk201-map.map - maps to "char"s.. grr
drivers/acorn/char/defkeymap-l7200.c - maps to "char"s.. grr
arch/s390/kernel/ebcdic.c - comments on a keymap table
drivers/video/console/font_8x16.c - comments on a keymap table
drivers/video/console/font_8x8.c - comments on a keymap table
drivers/video/console/font_pearl_8x8.c - comments on a keymap table
drivers/s390/ebcdic.c - comments on a keymap table

Noise from userland (this I won't be touching)
==============================================
Documentation/networking/ethertap.txt - random crap cat'd from /dev/tap0
Documentation/s390/Debugging390.txt - weird gdb output
