RE: [PATCH v2] kbuild: treat char as always unsigned

From: David Laight
Date: Thu Dec 22 2022 - 05:42:11 EST


From: Linus Torvalds
> Sent: 21 December 2022 17:07
>
> On Wed, Dec 21, 2022 at 7:56 AM Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >
> > The above assumes an unsigned char as input to strcmp(). I consider that
> > a hypothetical problem because "comparing" strings with upper bits
> > set doesn't really make sense in practice (How does one compare Günter
> > against Gunter ? And how about Gǖnter ?). On the other side, the problem
> > observed here is real and immediate.
>
> POSIX does actually specify "Günter" vs "Gunter".
>
> The way strcmp is supposed to work is to return the sign of the
> difference between the byte values ("unsigned char").
>
> But that sign has to be computed in 'int', not in 'signed char'.
>
> So yes, the m68k implementation is broken regardless, but with a
> signed char it just happened to work for the US-ASCII case that the
> crypto case tested.
>
> I think the real fix is to just remove that broken implementation
> entirely, and rely on the generic one.
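
As a rough sketch of the semantics described above (a hypothetical
helper, not the kernel's lib/string.c code and not the m68k asm), the
bytes are read as unsigned char and the difference is formed in int.
The second function shows what goes wrong when the difference is
squeezed back into 8 bits; the names strcmp_sketch and
byte_wide_diff are made up for illustration.

```c
/*
 * Hypothetical helper: compare two NUL-terminated strings byte by
 * byte as unsigned char, and do the subtraction in int so the sign
 * of the result cannot wrap.
 */
static int strcmp_sketch(const char *s1, const char *s2)
{
	const unsigned char *a = (const unsigned char *)s1;
	const unsigned char *b = (const unsigned char *)s2;

	while (*a && *a == *b) {
		a++;
		b++;
	}
	return (int)*a - (int)*b;
}

/*
 * The failure mode: the same difference truncated to a signed 8-bit
 * value. The conversion is implementation-defined in C, but on the
 * usual two's complement targets 0xff - 0x01 = 254 becomes -2, so
 * the sign is wrong.
 */
static int byte_wide_diff(unsigned char x, unsigned char y)
{
	return (signed char)(x - y);
}
```

With UTF-8 input, strcmp_sketch("G\xc3\xbcnter", "Gunter") compares
0xc3 against 'u' (0x75) at the second byte and returns a positive
value, whereas byte_wide_diff(0xff, 0x01) comes out negative even
though 0xff is the larger byte.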

I wonder how much slower the generic version is. m68k is likely to
be microcoded and I don't think instruction timings are actually
available.
The fastest version probably uses subx (subtract with the X/carry
flag) to generate 0/-1 and leaves +delta for the other result, but
getting the compares and branches in the right order is hard.

I believe some of the other m68k asm functions are also missing
the "memory" clobber, so the compiler could mis-optimise around them.
While I can write (or rather have written) m68k asm I don't have
an m68k compiler to test with.
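
To illustrate what the "memory" clobber buys (a minimal sketch using
GCC/clang extended asm, with hypothetical function names, not taken
from arch/m68k): an asm statement that is only handed a pointer in a
register says nothing about the memory behind that pointer, so
without the clobber the compiler is free to move or drop stores
around it.

```c
/*
 * Hypothetical stand-in for an asm string routine: the template is
 * empty, but the constraints are what matter. "r"(p) only passes the
 * pointer value; the "memory" clobber tells the compiler the asm may
 * read or write memory it cannot see, so earlier stores must be
 * completed before it and cached values reloaded after it.
 */
static inline void asm_touches_memory(const void *p)
{
	__asm__ volatile("" : : "r"(p) : "memory");
}

static int fill_then_use(char *buf)
{
	buf[0] = 'x';
	buf[1] = '\0';
	/*
	 * If the clobber were missing, the two stores above could be
	 * reordered after (or removed relative to) an asm that reads buf.
	 */
	asm_touches_memory(buf);
	return buf[0];
}
```

An asm strcmp or strlen that dereferences its pointer arguments is in
the same position as asm_touches_memory() here: it needs either the
"memory" clobber or explicit "m" operands describing what it reads.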

I also suspect that any x86 code that uses 'rep scas' is going
to be slow on anything modern.

David
