Re: [PATCH] kbuild: treat char as always signed

From: Jason A. Donenfeld
Date: Wed Oct 19 2022 - 23:12:17 EST


On Wed, Oct 19, 2022 at 6:11 PM Linus Torvalds
<torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Wed, Oct 19, 2022 at 1:35 PM Jason A. Donenfeld <Jason@xxxxxxxxx> wrote:
> >
> > I wish folks would use `u8 *` when they mean "byte array".
>
> Together with '-funsigned-char', we could typedef 'u8' to just 'char'
> (just for __KERNEL__ code, though!), and then we really could just use
> 'strlen()' and friends on said kind of arrays without any warnings.
>
> But we do have a *lot* of 'unsigned char' users, so it would be a huge
> amount of churn to do this kind of thing.

I think, though, there's an argument to be made that every use of
`unsigned char` is much better off as a `u8`. We don't have any fancy
C23 unicode strings. As far as I can tell, the only usage of
`unsigned char` ought to be "treat this as a byte array", and that's
what u8 is for. Yeah, that'd be churn. But technically, it wouldn't
really be difficult churn: if a naive sed mangles things, I'm sure
Coccinelle would be up to the task. If you think that's a wise
direction, I can play with it and see how miserable it is to do.
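
To make the "byte array" point concrete, here is a minimal userspace
sketch of why the element type matters. It assumes `u8` is a typedef
of `uint8_t` (a stand-in for the kernel's definition) and a two's
complement machine; both helper names are hypothetical and exist only
for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's u8 (an assumption of this sketch). */
typedef uint8_t u8;

/*
 * Hypothetical helper: sum a buffer through an explicitly signed view.
 * Each byte is sign-extended when promoted to int, so 0xff adds -1.
 */
static int sum_signed_view(const void *buf, size_t len)
{
	const signed char *p = buf;
	int sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += p[i];
	return sum;
}

/*
 * Hypothetical helper: the same loop through a u8 view.
 * Each byte is zero-extended, so 0xff adds 255.
 */
static int sum_u8_view(const void *buf, size_t len)
{
	const u8 *p = buf;
	int sum = 0;

	for (size_t i = 0; i < len; i++)
		sum += p[i];
	return sum;
}
```

For the two bytes 0x01, 0xff the signed view sums to 0 and the u8
view sums to 256. A plain `char` buffer would give one answer or the
other depending on the architecture and on -funsigned-char, which is
the ambiguity that spelling the type as `u8` avoids.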

(As a sidebar, Sultan and I were discussing today... I find the
radical extension of this idea to its logical end somewhat attractive:
exclusively using u64, s64, u32, s32, u16, s16, u8, s8, uword (native
size), sword (native size), char (string/character). It'd hardly look
like C any more, though, and the very mention of the idea is probably
triggering for some. So I'm not actually suggesting we do that in
earnest. But there is some appeal.)
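
For illustration only, the vocabulary above could be sketched in
userspace C roughly as follows. The fixed-width names mirror the
kernel's, mapped onto <stdint.h> here. `uword` and `sword` are
hypothetical names from this sidebar, and defining them as
pointer-sized integers is just one assumed reading of "native size".

```c
#include <stdint.h>

/* Fixed-width names, mapped onto <stdint.h> for this userspace sketch. */
typedef uint8_t  u8;
typedef int8_t   s8;
typedef uint16_t u16;
typedef int16_t  s16;
typedef uint32_t u32;
typedef int32_t  s32;
typedef uint64_t u64;
typedef int64_t  s64;

/*
 * Hypothetical: "uword"/"sword" are names from the sidebar only.
 * Pointer-sized integers are an assumption about "native size".
 */
typedef uintptr_t uword;
typedef intptr_t  sword;

/* Compile-time checks that the names mean what they say. */
_Static_assert(sizeof(u8) == 1, "u8 must be one byte");
_Static_assert(sizeof(u16) == 2, "u16 must be two bytes");
_Static_assert(sizeof(u32) == 4, "u32 must be four bytes");
_Static_assert(sizeof(u64) == 8, "u64 must be eight bytes");
_Static_assert(sizeof(uword) == sizeof(sword), "word types must match");
```

Plain `char` would then be left for strings and characters only, with
every other integer spelled by width and signedness.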

Jason