Re: [PATCH] kbuild: treat char as always signed

From: Linus Torvalds
Date: Fri Oct 21 2022 - 18:46:31 EST


On Thu, Oct 20, 2022 at 3:41 AM Gabriel Paubert <paubert@xxxxxxx> wrote:
>
> I must miss something, the strcmp man page says:
>
> "The comparison is done using unsigned characters."

You're not missing anything, I just hadn't looked at strcmp() in forever.

Yeah, strcmp clearly doesn't care about the signedness of 'char', and
arguably an unsigned char argument makes more sense considering the
semantics of the funmction.

> But it's not for this that I wrote this message. Has anybody considered
> using transparent unions?

I don't love the transparent union-as-argument syntax, but you're
right, that would fix the warning.

Except it then doesn't actually *work* very well.

Try this:

#include <sys/types.h>

#if USE_UNION
typedef union {
const char *a;
const signed char *b;
const unsigned char *c;
} conststring_arg __attribute__ ((__transparent_union__));
size_t strlen(conststring_arg);
#else
size_t strlen(const char *);
#endif

int test(char *a, unsigned char *b)
{
return strlen(a)+strlen(b);
}

int test2(void)
{
return strlen("hello");
}

and now compile it both ways with

gcc -DUSE_UNION -Wall -O2 -S t.c
gcc -Wall -O2 -S t.c

and notice how yes, the "-DUSE_UNION" one silences the warning about
using 'unsigned char *' for strlen. So it seems to work fine.

But then look at the code it generates for 'test2()" in the two cases.

The transparent union version actually generates a function call to an
external 'strlen()' function.

The regular version uses the compiler builtin, and just compiles
test2() to return the constant value 5.

So playing games with anonymous union arguments ends up also disabling
all the compiler optimizations we do want, becaue apparently gcc then
decides "ok, I'm not going to warn about you declaring this
differently, but I'm also not going to use the regular one because you
declared it differently".

This, btw, is also the reason why we don't use --freestanding in the
kernel. We do want the basic <string.h> things to just DTRT.

For the sockaddr_in games, the above isn't an issue. For strlen() and
friends, it very much is.

Linus