Re: [PATCH 1/1] lib/vsprintf: Implement ssprintf() to catch truncated strings

From: Lee Jones
Date: Mon Jan 29 2024 - 04:25:03 EST


NB: I was _just_ about to send out v2 with Rasmus's suggestions before I
saw your reply. I'm going to submit it anyway and Cc both you and
Rasmus. If you still disagree with my suggested approach, we can either
continue discussion here or on the new version.

More below:

> From: Lee Jones
> > Sent: 25 January 2024 10:36
> > On Thu, 25 Jan 2024, Rasmus Villemoes wrote:
> >
> > > On 25/01/2024 09.39, Lee Jones wrote:
> > > > There is an ongoing effort to replace the use of {v}snprintf() variants
> > > > with safer alternatives - for a more in-depth view, see Jon's write-up
> > > > on LWN [0] and/or Alex's on the Kernel Self Protection Project [1].
> > > >
> > > > Whilst executing the task, it quickly became apparent that the initial
> > > > thought of simply s/snprintf/scnprintf/ wasn't going to be adequate for
> > > > a number of cases. Specifically ones where the caller needs to know
> > > > whether the given string ends up being truncated. This is where
> > > > ssprintf() [based on similar semantics of strscpy()] comes in, since it
> > > > takes the best parts of both of the aforementioned variants. It has the
> > > > testability of truncation of snprintf() and returns the number of Bytes
> > > > *actually* written, similar to scnprintf(), making it a very programmer
> > > > friendly alternative.
> > > >
> > > > Here are some examples to show the differences:
> > > >
> > > > Success: No truncation - all 9 Bytes successfully written to the buffer
> > > >
> > > > ret = snprintf (buf, 10, "%s", "123456789"); // ret = 9
> > > > ret = scnprintf(buf, 10, "%s", "123456789"); // ret = 9
> > > > ret = ssprintf (buf, 10, "%s", "123456789"); // ret = 9
> > > >
> > > > Failure: Truncation - only 9 of 10 Bytes written; '-' is truncated
> > > >
> > > > ret = snprintf (buf, 10, "%s", "123456789-"); // ret = 10
> > > >
> > > > Reports: "10 Bytes would have been written if buf was large enough"
> > > > Issue: Programmers need to know/remember to check ret against "10"
> > >
> > > Yeah, so I'm not at all sure we need yet-another-wrapper with
> > > yet-another-hard-to-read-prefix when people can just RTFM and learn how
> > > to check for truncation or whatnot. But if you do this:
> >
> > As wonderful as it would be for people to "just RTFM", we're seeing a
> > large number of cases where this isn't happening. Providing a more
> > programmer friendly way is thought, by people way smarter than me, to be
> > a solid means to solve this issue. Please also see Kees Cook's related
> > work to remove strlcpy() use.
>
> My worry is that people will believe the length and forget that
> it might be an error code.

My plan is to go around and convert these myself. All of the examples
in the kernel will check the return value for error. We can go one
further and author a Coccinelle rule to enforce the semantics.

> So you replace one set of errors (truncated data), with another
> worse set (eg write before start of buffer).

Under-running the buffer is no worse than over-running. However, as I say,
we're going to make a concerted effort to prevent that via various
proactive and passive measures.
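
To make the "check before you advance" pattern concrete, here is a
minimal userspace sketch. It is not the code from the patch:
ssprintf_sketch() and format_pair() are hypothetical names, and the
-E2BIG return on truncation is an assumption borrowed from strscpy().

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in for the proposed ssprintf(); not the kernel code.
 * Assumption: truncation is reported as -E2BIG, mirroring strscpy().
 */
static int ssprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (len < 0)
		return len;		/* output error from vsnprintf() */
	if ((size_t)len >= size)
		return -E2BIG;		/* output did not fit: truncated */
	return len;			/* bytes written, excluding the NUL */
}

/*
 * Append two fields, checking each call before advancing the offset.
 * Returns the total length on success or a negative errno.
 */
static int format_pair(char *buf, size_t size, const char *a, const char *b)
{
	size_t off = 0;
	int ret;

	ret = ssprintf_sketch(buf + off, size - off, "%s", a);
	if (ret < 0)
		return ret;	/* never add a negative value to the offset */
	off += ret;

	ret = ssprintf_sketch(buf + off, size - off, "-%s", b);
	if (ret < 0)
		return ret;
	off += ret;

	return (int)off;
}
```

The point of the sketch is that the offset only ever moves forward by a
value that has already been checked, so a negative return can never be
used to step the pointer backwards.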

> I'm sure that the safest return for 'truncated' is the buffer length.
> Then a series of statements like:
> buf += xxx(buf, buf_end - buf, .....);
> can all be called with a single overflow check at the end.
>
> Forget the check, and the length just contains a trailing '\0'
> which might cause confusion but isn't going to immediately
> break the world.

snprintf() does this and has been proven to cause buffer-overflows.
There have been multiple articles authored describing why using
snprintf() is not generally a good idea for the masses including the 2
linked in the commit message:

LWN: snprintf() confusion
https://lwn.net/Articles/69419/

KSPP: Replace uses of snprintf() and vsnprintf()
https://github.com/KSPP/linux/issues/105
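
As a rough illustration of the failure mode (a userspace sketch, with
the hypothetical helper unchecked_offset_demo() and made-up strings),
this is what happens to an offset that accumulates unchecked snprintf()
return values:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Chain two snprintf() calls into a 10-byte buffer without checking
 * either return value, and report the offset the caller ends up with.
 */
static size_t unchecked_offset_demo(void)
{
	char buf[10];
	size_t off = 0;

	/* Stores "123456789" + NUL, but returns 10 (the untruncated length). */
	off += snprintf(buf + off, sizeof(buf) - off, "%s", "123456789-");

	/*
	 * off == sizeof(buf), so the size argument is 0: nothing is written,
	 * yet the return value is still added to the offset.
	 */
	off += snprintf(buf + off, sizeof(buf) - off, "%s", "abc");

	/*
	 * off now exceeds sizeof(buf). A third call would compute
	 * sizeof(buf) - off, which wraps to a huge size_t, and would write
	 * through buf + off. It is deliberately left out here.
	 */
	return off;
}
```

With a 10-byte buffer the helper ends up with an offset of 13, i.e.
already past the end of the buffer before any explicit check has had a
chance to run.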

Yes, you should check ssprintf() for error. This is no different to the
many hundreds of APIs where this is also a stipulation. Not checking
(m)any of the memory allocation APIs for error will also lead to similar
results, which is why we enforce the check.

--
Lee Jones [李琼斯]