Re: [GIT PULL] Zstd fixes for v6.7

From: Nick Terrell
Date: Wed Nov 15 2023 - 17:54:07 EST




> On Nov 14, 2023, at 8:31 PM, Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, 14 Nov 2023 at 17:17, Nick Terrell <terrelln@xxxxxxxx> wrote:
>>
>> Only a single line change to fix a benign UBSAN warning that has been
>> baking in linux-next for a month. I just missed the merge window, but I
>> think it is worthwhile to include this fix in the v6.7 kernel. If you
>> would like me to wait for v6.8 please let me know.
>
> Hmm. You claim it's been in linux-next for a month, but why the hell
> was it then rebased *minutes* before you sent the pull request?
>
> That's really not ok. Rebasing literally removes the test coverage you
> had. What possible reason was there for rebasing? And why didn't you
> mention it?
>
> So stop doing these dodgy things.

I’m sorry, I will do better. Thanks for taking the time to point out my
mistakes.

> I have pulled this, but despite this being a "trivial" one-liner, I
> think there is a bug in there somewhere.
>
> In particular, we *used* to have
>
> typedef struct {
> short ncount[FSE_MAX_SYMBOL_VALUE + 1];
> FSE_DTable dtable[1]; /* Dynamically sized */
> } FSE_DecompressWksp;
>
> and in FSE_decompress_wksp_body() we have
>
> FSE_DecompressWksp* const wksp = (FSE_DecompressWksp*)workSpace;
> ...
> if (wkspSize < sizeof(*wksp)) return ERROR(GENERIC);
> ...
> wkspSize -= sizeof(*wksp) + FSE_DTABLE_SIZE(tableLog);
>
> and note that "sizeof(*wksp)".
>
> Because it has *changed* with that one-liner fix, since now we have an
> unsized array there:
>
> typedef struct {
> short ncount[FSE_MAX_SYMBOL_VALUE + 1];
> FSE_DTable dtable[]; /* Dynamically sized */
> } FSE_DecompressWksp;
>
> so while the logic actually looks better to me with the change (no
> more off-by-one errors), the fact that it used to work with what looks
> like an off-by-one error in the sizeof() calculation just makes me go
> "Hmm".
>
> In particular, that
>
> wkspSize -= sizeof(*wksp) + FSE_DTABLE_SIZE(tableLog);
>
> seems to have removed too much from the wkspSize variable, but it
> still ended up not triggering any limit checks. Hmm?
>
> End result: this may be a one-liner change, but honestly, I think it
> was done HORRIBLY BADLY. That one-liner has serious implications and
> just a trivial check of mine seems to say this code is or was seriosly
> buggy exactlky when it comes to that one-liner.

You’re right, the code previously had an off-by-one error where it consumed
4 bytes too much of the `wkspSize`. This workspace is a shared buffer that
is sized to accommodate the largest use of it, which is not this function.
So there was enough slack that the extra 4 bytes wasn’t noticed.

But I absolutely should’ve mentioned why it is safe in the commit message.

> And no, rebasing minutes before sending a pull request is not ok.
>
> Linus