fs/nls/nls_base.c: minor utf8_mbstowcs bug (w/ patch :)

Urban Widmark (urban@svenskatest.se)
Thu, 6 May 1999 22:54:39 +0200 (CEST)


Hello,

I've been trying to use the utf8/unicode functions to make some charset
conversions. (And since your name was conveniently located at the top of
the source file ... :)

utf8_mbstowcs fails on the following byte sequence:
c3 a5 c3 a4 c3 b6
(utf8 form of "едц", or 0x00e5 0x00e4 0x00f6)

op is a __u16 *, not a __u8 * as ip, but still ip and op is advanced the
same value. I believe the following is correct since utf8_mbtowc only
writes one 2-byte character.

--- linux-2.2.7/fs/nls/nls_base.c Fri Dec 18 17:09:51 1998
+++ linux/fs/nls/nls_base.c Thu May 6 22:28:22 1999
@@ -89,7 +89,7 @@
ip++;
n--;
} else {
- op += size;
+ op++;
ip += size;
n -= size;
}

Do you agree?

I suppose something like (size >> 1) is more in line with the rest of the
else ...

/Urban

---
Urban Widmark                           urban@svenskatest.se
Svenska Test AB                         +46 90 71 71 23

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/