Re: [PATCH] string.c: test *cmp for all possible 1-character strings

From: Jason A. Donenfeld
Date: Thu Dec 22 2022 - 10:17:13 EST


On Thu, Dec 22, 2022 at 03:05:06PM +0100, Rasmus Villemoes wrote:
> The switch to -funsigned-char made a pre-existing bug on m68k more
> apparent. That is now fixed (by removing m68k's private strcmp(), see
> commit 7c0846125358), but we still have quite a few architectures that
> provide one or more of strcmp(), strncmp() and memcmp().
>
> They probably all work fine for the cases where the input is all
> ASCII, and/or where the caller only wants to know about equality or
> not (i.e. only checks whether the return value is 0 or not).
>
> Let's check that all these implementations also behave correctly for
> bytes with the high bit set, and provide the correct ordering -
> independent of us now building with -funsigned-char, the C standard
> says that these *cmp functions should consider the buffers as
> consisting of unsigned chars.
>
> This is only intended to help find other latent bugs and can/should be
> ripped out again before v6.2, or perhaps moved to test_string.c in
> some form, but for now I think it's worth doing unconditionally.
>
> Signed-off-by: Rasmus Villemoes <linux@xxxxxxxxxxxxxxxxxx>
> ---
> lib/string.c | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/lib/string.c b/lib/string.c
> index 4fb566ea610f..1718f96e8082 100644
> --- a/lib/string.c
> +++ b/lib/string.c
> @@ -880,3 +880,30 @@ void *memchr_inv(const void *start, int c, size_t bytes)
>  	return check_bytes8(start, value, bytes % 8);
>  }
>  EXPORT_SYMBOL(memchr_inv);
> +
> +static int sign(int x)
> +{
> +	return (x > 0) - (x < 0);
> +}
> +
> +static int test_xxxcmp(void)
> +{
> +	char a[2], b[2];
> +	int i, j;
> +
> +	a[1] = b[1] = 0;
> +	for (i = 0; i < 256; ++i) {
> +		a[0] = i;
> +		for (j = 0; j < 256; ++j) {
> +			b[0] = j;
> +			WARN_ONCE(sign(strcmp(a, b)) != sign(i - j),
> +				  "strcmp() broken for (%2ph, %2ph)\n", a, b);
> +			WARN_ONCE(sign(memcmp(a, b, 2)) != sign(i - j),
> +				  "memcmp() broken for (%2ph, %2ph)\n", a, b);
> +			WARN_ONCE(sign(strncmp(a, b, 2)) != sign(i - j),
> +				  "strncmp() broken for (%2ph, %2ph)\n", a, b);
> +		}
> +	}
> +	return 0;
> +}
> +late_initcall(test_xxxcmp);

This probably belongs in some config-gated selftest file that can be
compiled out, rather than running unconditionally on every boot, right?
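
As a rough illustration of that idea, here is a userspace sketch, not
kernel code. It runs the same 256x256 check against the C library's
strcmp(), strncmp() and memcmp(), and only builds the runner when a
compile-time switch is set. The XXXCMP_SELFTEST macro and the
count_xxxcmp_failures() name are made up for this example; in the kernel
the switch would presumably be a Kconfig option instead.

```c
#include <stdio.h>
#include <string.h>

/* Sign of x: -1, 0 or 1. Same helper as in the patch. */
static int sign(int x)
{
	return (x > 0) - (x < 0);
}

/*
 * Compare every pair of 1-character strings and count how often a
 * *cmp function disagrees with unsigned byte ordering.
 * Returns the number of mismatches (0 on a conforming libc).
 */
static int count_xxxcmp_failures(void)
{
	unsigned char a[2] = { 0, 0 }, b[2] = { 0, 0 };
	int i, j, failures = 0;

	for (i = 0; i < 256; ++i) {
		a[0] = (unsigned char)i;
		for (j = 0; j < 256; ++j) {
			int want = sign(i - j);

			b[0] = (unsigned char)j;
			failures += sign(strcmp((const char *)a,
						(const char *)b)) != want;
			failures += sign(memcmp(a, b, 2)) != want;
			failures += sign(strncmp((const char *)a,
						 (const char *)b, 2)) != want;
		}
	}
	return failures;
}

/* Hypothetical switch: the runner is compiled out unless it is defined. */
#ifdef XXXCMP_SELFTEST
int main(void)
{
	int failures = count_xxxcmp_failures();

	printf("xxxcmp selftest: %d mismatches\n", failures);
	return failures != 0;
}
#endif
```

Building with -DXXXCMP_SELFTEST enables the runner; without it, only the
helpers are compiled and nothing runs on its own.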

Jason