Re: [PATCH] tracing: Stop FORTIFY_SOURCE complaining about stack trace caller

From: Kees Cook
Date: Thu Jul 13 2023 - 01:22:43 EST


On Wed, Jul 12, 2023 at 08:43:58PM -0400, Steven Rostedt wrote:
> On Wed, 12 Jul 2023 16:36:30 -0700
> Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > [...]
> > +union stack_entry_dynamic {
> > + struct stack_entry entry;
> > + struct {
> > + int size;
> > + unsigned long caller[] __counted_by(size);
> > + };
> > +};
>
> This actually makes it more likely to cause a bug in the future (and the
> present!). Now we need to know that if stack_entry were to ever change, we
> have to change this too. And it can change (although highly unlikely).

Yeah, I should have included a bit more, like:
BUILD_BUG_ON(offsetof(struct stack_entry, caller) == offsetof(struct stack_entry_dynamic, caller))

But anyway, I think we can still do better. :)

> [...]
> Now, this event is unique, as it's the only event that has a dynamic array
> that is not done by the way other dynamic arrays are done. Which is to
> insert a field that has an offset and length to it. That is, other events
> would look like this:
>
> struct stack_entry {
> int size;
> short offset; length;
> [ more fields ]
> int dynamic[];
> }

Yeah, it won't be a "true" trace dynamic array. stack_entry is the
sockaddr of trace. :) (i.e. we want to change the size of the trailing
array without actually changing the struct, but this way leads to
madness.)

> Where offset would be the ((void *)(struct stack_entry *)data) + offset. As
> all the dynamic size portions of an event are at the end of the event, with
> these offset/length tuples to tell user space and the kernel where to look
> in the event binary data for those fields.
>
> But to keep backward compatibility, as this event had code specific for
> parsing it in libtraceevent that doesn't expect that offset/length tuple,
> and instead just looked at the "caller[8]" portion to see that it had 8
> fields (this is static for all events). New code uses the size to know, and
> ignores the [8]. The event meta data gives the actual size of the stored
> data so the parser knows not to go beyond that.

Right -- so, it would have been much nicer to use a new struct when
caller[8] wasn't enough. :) But that ship has sailed. Now we're in
a state where some tools expect caller[8] and some expect "caller[]
__counted_by(size)". In other places where we've had a "minimum sized
flexible array" requirements, we would've used a union, effectively like:

int size;
union {
some_type legacy_padding[8];
some_type caller[] __counted_by(size);
};

Old code ends up with the same sized struct, and new code gets a proper
flexible array with the original struct member name. (And with
__counted_by, we'll have the capacity to do run-time bounds checking on
the flexible array.)

>
> Note, this event existed before normal trace events that had the dynamic
> arrays, which is why it didn't do the same.
>
> The only thing this code is doing is filling in the ring buffer. The entry
> structure is just a helper to know where to place the data in the ring
> buffer.
>
> So my question to you is, what's the purpose of hardening? To prevent
> future bugs, right? By making a shadow struct, we are now burdened to
> remember to modify the shadow stack if we ever modify this current
> structure.

Right, I'd rather avoid a shadow struct -- it's especially weird here
given the special way trace creates its structures. :P The hardening
efforts are mainly holistic protections, in the sense that much of the
kernel isn't doing this kind of careful length management, etc. But that
means we can't abuse fixed-sized arrays anymore -- the compiler needs
to believe what it's told about sizes. ;)

> [...]
> This is why I much rather prefer the simple:
>
> ptr += offsetof(typeof(*entry), caller);
> memcpy(ptr, fstack->calls, size);
>
> Which doesn't need to care about synchronizing with the macro magic of the
> structure, which may change in the future, and this will lead to one more
> location that would need to be updated, but likely forgotten.

I am just thinking this will kick the can down the road. The compiler
may get smart enough to see through this or who know what next.

> C is fun, let's go shopping!

Yup. I'm just trying to take away all the foot-guns. :) Let me try
another way to solve this that is less of a bypass. For now, sure, let's
take the work-around, but I think it'll need to be fixed up soon.

-Kees

--
Kees Cook