Re: [PATCH 1/1] lib/vsprintf: Implement ssprintf() to catch truncated strings

From: Rasmus Villemoes
Date: Tue Jan 30 2024 - 10:18:57 EST


On 30/01/2024 16.07, Lee Jones wrote:
> On Mon, 29 Jan 2024, Lee Jones wrote:
>
>> On Mon, 29 Jan 2024, David Laight wrote:
>>
>>> ...
>>>>> I'm sure that the safest return for 'truncated' is the buffer length.
>>>>> The a series of statements like:
>>>>> buf += xxx(buf, buf_end - buf, .....);
>>>>> can all be called with a single overflow check at the end.
>>>>>
>>>>> Forget the check, and the length just contains a trailing '\0'
>>>>> which might cause confusion but isn't going to immediately
>>>>> break the world.
>>>>
>>>> snprintf() does this and has been proven to cause buffer-overflows.
>>>> There have been multiple articles authored describing why using
>>>> snprintf() is not generally a good idea for the masses including the 2
>>>> linked in the commit message:
>>>
>>> snprintf() returns the number of bytes that would have been output [1].
>>> I'm not suggesting that, or not terminating the buffer.
>>> Just returning the length including the '\0' (unless length was zero).
>>> This still lets the code check for overflow but isn't going to
>>> generate a pointer outside the buffer if used to update a pointer.
>>
>> I see. Well I'm not married to my solution. However, I am convinced
>> that the 2 solutions currently offered can be improved upon. If you or
>> anyone else has a better solution, I'd be more than happy to implement
>> and switch to it.
>>
>> Let me have a think about the solution you suggest and get back to you.
>
> Okay, I've written a bunch of simple test cases and results are
> positive. It seems to achieve my aim whilst minimising any potential
> pitfalls.
>
> - Success returns Bytes actually written - no functional change
> - Overflow returns the size of the buffer - which makes the result
> a) testable for overflow
> b) non-catastrophic if accidentally used to manipulate later sizes

You are describing scnprintf(), which almost does exactly that. The last
thing we need is another interface with almost identical semantics.

> int size = 10;
> char buf[size];
> char *b = buf;
>
> ret = spprintf(b, size, "1234");
> size -= ret;
> b += ret;
> // ret = 4 size = 6 buf = "1234\0"
>
> ret = spprintf(b, size, "5678");
> size -= ret;
> b += ret;
> // ret = 4 size = 2 buf = "12345678\0"
>
> ret = spprintf(b, size, "9***");
> size -= ret;
> b += ret;
> // ret = 2 size = 0 buf = "123456789\0"

So here scnprint() would have returned 1, leaving size at 1. scnprintf()
has the invariant that, for non-zero size, the return value is strictly
less than that size, so when passed a size of 1, all subsequent calls
return 0 (corresponding to the fact that all it could do was to write
the '\0' terminator).

This pattern already exists, and is really the reason scnprint exists.
Yes, scnprintf() cannot distinguish overflow from
it-just-exactly-fitted. Maybe it would have been better to make it work
like this, but I don't think there's a real use - and we do have
seq_buf() if one really wants an interface that can build a string
piece-meal while keeping track of whether it ever caused overflow.

Rasmus