Re: [GIT PULL] tracing: Prevent trace_marker being bigger than unsigned short

From: Steven Rostedt
Date: Mon Mar 04 2024 - 17:12:13 EST


On Mon, 4 Mar 2024 13:50:13 -0800
Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, 4 Mar 2024 at 13:40, Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
> >
> > As I mentioned that the design is based on that the allocated buffer size is
> > the string length rounded up to the word size, all I need to do is to make
> > sure that there's a nul terminating byte within the last word of the
> > allocated buffer. Then "%s" is all I need.
>
> Please don't add pointless code that helps nothing.
>
> > Would this work for you?
>
> No. This code only adds debug code, and doesn't actually improve anything.
>
> We *have* debug code already. Things like KASAN already find array
> overruns, and your ex-tempore debug code adds zero actual value.

Sorry, I thought debug code was OK. But I guess I was mistaken. KASAN isn't
run in the field, where this would trigger. But I get your point. If it's
passing my tests (which I do run with KASAN), I guess that's good enough
for you.

>
> That, btw, is why your old stupid precision code was not only
> triggering warnings, but was ACTIVELY DETRIMENTAL.
>
> All that precision code could ever do was to potentially hide bugs if
> the string wasn't NUL-terminated.
>
> So no. I absolutely do NOT want you to write more code to hide bugs or
> do half-arsed checking.

Well, it wouldn't hide it. It would trigger a big fat warning if it was
missing a nul terminator.

>
> I want you to *simplify* the code, and put proper limits in place for strings.
>
> I want to see the code that actually notices when somebody generates a
> crazy string, and stops that garbage in its tracks.
>
> What I do *not* want to see is more ad-hoc code that tries to deal
> with the symptoms of you not having done so.

This warning is just making sure the string is nul terminated. It has nothing
to do with size. The bug that triggered when I was working on other code
was a miscalculation of the input. I didn't write the entire string into
the ring buffer which meant that the terminating nul was also missing. On
reading the string, it crashed the kernel.
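
To make the check concrete, here is a rough user-space sketch (not the
kernel patch itself). It assumes the buffer is allocated as strlen + 1
rounded up to sizeof(long); the helper names round_up_word() and
last_word_has_nul() are made up for illustration:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: "word size" means sizeof(long), as on typical kernel builds. */
#define WORD_SIZE sizeof(long)

/* Round len up to the next multiple of the word size. */
static size_t round_up_word(size_t len)
{
	return (len + WORD_SIZE - 1) / WORD_SIZE * WORD_SIZE;
}

/*
 * Return true if a nul byte exists somewhere in the last word of the
 * allocated buffer. If the whole string (including its terminator) was
 * written, the nul must land in that last word.
 */
static bool last_word_has_nul(const char *buf, size_t alloc_size)
{
	if (alloc_size < WORD_SIZE)
		return false;
	return memchr(buf + alloc_size - WORD_SIZE, '\0', WORD_SIZE) != NULL;
}
```

A short write that drops the terminator, like the miscalculation described
above, would make last_word_has_nul() return false before "%s" ever walks
off the end of the buffer.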

I put in the precision when debugging the code, and that's where I found the
mismatch in string size vs writing to the buffer. I then kept the precision
just in case I hit a similar bug. Which is what you have issues with.
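
For reference, this is roughly what the precision does, shown as a
user-space sketch with snprintf() rather than the actual trace event
code. format_bounded() is a hypothetical name:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Format at most max_len bytes of src into out. With a precision,
 * "%.*s" stops after max_len bytes even if src has no nul terminator,
 * which is why it can mask a missing nul instead of exposing it.
 */
static int format_bounded(char *out, size_t out_size,
			  const char *src, int max_len)
{
	return snprintf(out, out_size, "%.*s", max_len, src);
}
```

With a plain "%s" the same unterminated source would be read past its
end, which is the crash I hit.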

Fine, I'll just remove the precision, as that's not needed. There were no
other overflows involved here.

-- Steve