Re: [PATCH v4 2/4] kstrtox: introduce a safer version of memparse()

From: Randy Dunlap
Date: Sun Jan 14 2024 - 23:27:43 EST


Hi,

On 1/4/24 18:35, Qu Wenruo wrote:
> [BUGS]
> Function memparse() lacks error handling:
>
> - If no valid number string at all
> In that case @retptr would just be updated and return value would be
> zero.
>
> - No overflown detection

overflow

> This applies to both the number string part, and the suffixes part.
> And since we have no way to indicate errors, we can get weird results
> like:
>
> "25E" -> 10376293541461622784 (9E)
>
> This is due to the fact that for "E" suffix, there is only 4 bits
> left, and 25 with 60 bits left shift would lead to overflow.
>
> [CAUSE]
> The root cause is already mentioned in the comments of the function, the
> usage of simple_strtoull() is the source of evil.
> Furthermore the function prototype is no good either, just returning an
> unsigned long long gives us no way to indicate an error.
>
> [FIX]
> Due to the prototype limits, we can not have a drop-in replacement for
> memparse().
>
> This patch can only help by introduce a new helper, memparse_safe(), and
> mark the old memparse() deprecated.
>
> The new memparse_safe() has the following improvement:
>
> - Invalid string detection
> If no number string can be detected at all, -EINVAL would be returned.

is returned.

>
> - Better overflow detection
> Both the string part and the extra left shift would have overflow

have overflow

> detection.
> Any overflow would result -ERANGE.

Any overflow results in -ERANGE.

>
> - Safer default suffix selection
> The helper allows the caller to choose the suffixes that they want to
> use.
> But only "KMGTP" are recommended by default since the "E" leaves only
> 4 bits before overflow.
> For those callers really know what they are doing, they can still
> manually to include all suffixes.
>
> Due to the prototype change, callers should migrate to the new one and
> change their code and add extra error handling.
>
> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
> Reviewed-by: David Disseldorp <ddiss@xxxxxxx>
> ---
> include/linux/kernel.h | 8 +++-
> include/linux/kstrtox.h | 15 +++++++
> lib/cmdline.c | 4 +-
> lib/kstrtox.c | 99 +++++++++++++++++++++++++++++++++++++++++
> 4 files changed, 124 insertions(+), 2 deletions(-)
>

> diff --git a/lib/cmdline.c b/lib/cmdline.c
> index 90ed997d9570..35dbb03b5592 100644
> --- a/lib/cmdline.c
> +++ b/lib/cmdline.c
> @@ -139,10 +139,12 @@ char *get_options(const char *str, int nints, int *ints)
> EXPORT_SYMBOL(get_options);
>
> /**
> - * memparse - parse a string with mem suffixes into a number
> + * memparse - DEPRECATED, parse a string with mem suffixes into a number
> * @ptr: Where parse begins
> * @retptr: (output) Optional pointer to next char after parse completes
> *
> + * There is no way to handle errors, and no overflown detection and string
> + * sanity checks.
> * Parses a string into a number. The number stored at @ptr is
> * potentially suffixed with K, M, G, T, P, E.
> */
> diff --git a/lib/kstrtox.c b/lib/kstrtox.c
> index 41c9a499bbf3..375c7f0842e3 100644
> --- a/lib/kstrtox.c
> +++ b/lib/kstrtox.c
> @@ -113,6 +113,105 @@ static int _kstrtoull(const char *s, unsigned int base, unsigned long long *res)
> return 0;
> }
>
> +/**
> + * memparse_safe - convert a string to an unsigned long long, safer version of
> + * memparse()
> + *
> + * @s: The start of the string. Must be null-terminated.

Unless I misunderstand, this is the biggest problem that I see with
memparse_safe(): "Must be null-terminated".
memparse() does not have that requirement.

And how is @retptr updated if the string is null-terminated?

If the "Must be null-terminated." is correct, it requires that every user/caller
first determine the end of the number (how? space and/or any special character
or any alphabetic character that is not in KMGTPE? Then save that ending char,
change it to NUL, call memparse_safe(), then restore the saved char?

I'm hoping that the documentation is not correct...

> + * The base is determined automatically, if it starts with "0x"
> + * the base is 16, if it starts with "0" the base is 8, otherwise
> + * the base is 10.
> + * After a valid number string, there can be at most one
> + * case-insensitive suffix character, specified by the @suffixes
> + * parameter.
> + *
> + * @suffixes: The suffixes which should be handled. Use logical ORed
> + * memparse_suffix enum to indicate the supported suffixes.
> + * The suffixes are case-insensitive, all 2 ^ 10 based.
> + * Supported ones are "KMGPTE".
> + * If one suffix (one of "KMGPTE") is hit but that suffix is
> + * not specified in the @suffxies parameter, it ends the parse
> + * normally, with @retptr pointed to the (unsupported) suffix.
> + * E.g. "68k" with suffxies "M" returns 68 decimal, @retptr
> + * updated to 'k'.
> + *
> + * @res: Where to write the result.
> + *
> + * @retptr: (output) Optional pointer to the next char after parse completes.
> + *
> + * Returns:
> + * * %0 if any valid numeric string can be parsed, and @retptr is updated.
> + * * %-EINVAL if no valid number string can be found.
> + * * %-ERANGE if the number overflows.
> + * * For negative return values, @retptr is not updated.
> + */
> +noinline int memparse_safe(const char *s, enum memparse_suffix suffixes,
> + unsigned long long *res, char **retptr)
> +{

Thanks.
--
#Randy