RE: [PATCH v3 5/5] minmax: Relax check to allow comparison between int and small unsigned constants.

From: David Laight
Date: Thu Aug 10 2023 - 04:29:50 EST


From: Linus Torvalds
> Sent: 07 August 2023 16:49
>
> On Mon, 7 Aug 2023 at 03:50, David Laight <David.Laight@xxxxxxxxxx> wrote:
> >
> > To my mind the value of min(variable, TWENTY) shouldn't depend
> > on how TWENTY is defined regardless of the type of the variable.
> > So 20, 20u, 20l, 20ul, (char)20, sizeof (foo), offsetof(x, y)
> > should all be equally valid and all generate the same result.
>
> That sounds nice, but I don't believe it is true.
>
> If somebody writes
>
> a = min(b, 20u);
>
> then that is *not* the same thing as
>
> a = min(b, 20);
>
> without knowing the types.
>
> But you make them be the same thing - now they become ambiguous and
> depend on the type of 'b'.
>
> Does that expression mean "give me a number 0..20" or "MININT..20"?

Why does the lower bound of any type matter?
There are two integer values between -infinity and +infinity.
The return value is the smaller of the two values.

The only problem arises when there isn't a C type that is
guaranteed to be able to contain the result.
Typically this happens when comparing signed int and unsigned int
where the value might be -1 and (1u << 31) - clearly allowing
this is problematic.

> And the whole point of the type checking is very much to not have
> ambiguous comparisons where you can have those kinds of questions.
>
> > I've found all sorts of dubious min_t() while writing these patches.
> > One in a filesystem was min_t(ulong, ulong_var, u64_var) and I
> > couldn't convince myself it was right on 32bit.
>
> Honestly, that's a great example, but I think that's more of an
> argument against 'min_t()' than it is an argument for changing 'min()'
> and 'max()'.
>
> I think it was a mistake to do "min_t()",

At least we agree on that.

> and we should have done
> sign-based ones ("do a unsigned/signed min/max").

Patch 1 adds min_unsigned() that uses the integer promotion
rules to convert a non-negative signed value to unsigned without
ever masking high bits.
The compiler then optimises most of it away.

I'm not sure you ever want to force a signed compare with a
non-constant unsigned value.
Patch 5 (which you seem to really dislike) makes comparisons
against unsigned constants 'just work' - the returned value is
never misleading.

The other common problem with min() is different sized (or named)
unsigned types.
It would be nice for these to be allowed (since the result always
fits in the smaller type) without having to use (even) min_unsigned()
because you'd then know that nothing was signed.

Changing the basic test from 'same types' to 'same signedness'
in patch 3 makes this 'just work'.

> I agree that a "min_t()" that narrows the type is very scary, in that
> it might lose bits, and it's obviously also easily dependent on word
> size too, as in your example.
>
> We could perhaps aim to make 'min_t()' warn about 't' being narrower
> than the types you compare.

That is going to give false positives with sizeof().

> (Although then you hit the traditional "C doesn't really have 'char'
> and 'short' types in expressions", so you'd probably have to do the
> type size check with widening of 't' in place, so 'min_t(char, int,
> int)' would be ok).

Usually you see:
char_a = min_t(char, char_b, limit);
the 'char' pretty much needs to be 'int'.
or:
char_a = min_t(char, char_a + 1, char_b);
patch 4 converts char/short to signed int for the signedness test.

As a C test, which of these enum values are unsigned?
And what difference does gcc 13 make??

enum ea { ea1 = 1, ea1u = 1u};
enum eb { eb1 = 1, eb1u = 1u, ebm1 = -1};
enum ec { ec1 = 1, ec1u = 1u, ecmx = 0x80000000};
enum ed { ed1 = 1, edm1 = -1, edmx = 0x80000000u};

(for the answer see https://godbolt.org/z/GrM8j9a1q )

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)