Re: [PATCH] tracing: Stop FORTIFY_SOURCE complaining about stack trace caller

From: Steven Rostedt
Date: Wed Jul 12 2023 - 20:44:09 EST


On Wed, 12 Jul 2023 16:36:30 -0700
Kees Cook <keescook@xxxxxxxxxxxx> wrote:

> >
> > To hide this from the FORTIFY_SOURCE logic, pointer arithmetic is used:
> >
> > ptr = ring_buffer_event_data(event);
> > entry = ptr;
> > ptr += offsetof(typeof(*entry), caller);
> > memcpy(ptr, fstack->calls, size);
>
> But... Why are you lying to the compiler? Why not just make this
> dynamically sized for real? It's not a "struct stack_entry" if it might
> be bigger.

I was waiting for some controversy from this patch ;-)

>
> Just create a new struct that isn't lying? (Dealing with the "minimum
> size" issue for a dynamic array is usually done with unions, but
> ftrace's structs are ... different. As such, I just added a one-off
> union.) Here, and you can be the first user of __counted_by too:
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 4529e264cb86..40935578c365 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -3108,6 +3108,14 @@ struct ftrace_stacks {
> static DEFINE_PER_CPU(struct ftrace_stacks, ftrace_stacks);
> static DEFINE_PER_CPU(int, ftrace_stack_reserve);
>
> +union stack_entry_dynamic {
> +        struct stack_entry entry;
> +        struct {
> +                int size;
> +                unsigned long caller[] __counted_by(size);
> +        };
> +};

This actually makes it more likely to cause a bug in the future (and the
present!). Now we need to know that if stack_entry were to ever change, we
have to change this too. And it can change (although highly unlikely).

The problem is that this structure is created by a macro, and that macro
defines both the structure and what is exported to user space.

From kernel/trace/trace_entries.h, the macro that creates this structure:

FTRACE_ENTRY(kernel_stack, stack_entry,

        TRACE_STACK,

        F_STRUCT(
                __field( int, size )
                __array( unsigned long, caller, FTRACE_STACK_ENTRIES )
        ),

        F_printk("\t=> %ps\n\t=> %ps\n\t=> %ps\n"
                 "\t=> %ps\n\t=> %ps\n\t=> %ps\n"
                 "\t=> %ps\n\t=> %ps\n",
                 (void *)__entry->caller[0], (void *)__entry->caller[1],
                 (void *)__entry->caller[2], (void *)__entry->caller[3],
                 (void *)__entry->caller[4], (void *)__entry->caller[5],
                 (void *)__entry->caller[6], (void *)__entry->caller[7])
);

Now, this event is unique: it's the only event with a dynamic array that is
not handled the way other dynamic arrays are, which is to insert a field
holding an offset and a length. That is, other events would look like this:

struct stack_entry {
        int size;
        short offset, length;
        [ more fields ]
        int dynamic[];
};

Where the dynamic data would live at ((void *)(struct stack_entry *)data) +
offset. All the dynamically sized portions of an event are placed at the end
of the event, and these offset/length tuples tell user space and the kernel
where to look in the event's binary data for those fields.
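
For illustration, here is a minimal user-space sketch of that offset scheme.
The struct and names are made up for the example (they are not the real
event layout), but it shows how a parser resolves the dynamic data from the
tuple:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical event using the offset/length scheme described above.
 * This is NOT the real ftrace layout, just an illustration.
 */
struct demo_entry {
        int size;       /* number of dynamic elements */
        short offset;   /* byte offset of the dynamic data from the event start */
        short length;   /* byte length of the dynamic data */
        /* the dynamic data itself lives at ((void *)entry) + offset */
};

static const unsigned long *demo_callers(const void *event)
{
        const struct demo_entry *entry = event;

        /* ((void *)entry) + offset, as described above */
        return (const unsigned long *)((const char *)event + entry->offset);
}

int main(void)
{
        unsigned long calls[3] = { 0x1111, 0x2222, 0x3333 };
        char *buf = malloc(sizeof(struct demo_entry) + sizeof(calls));
        struct demo_entry *entry = (struct demo_entry *)buf;
        const unsigned long *caller;
        int i;

        if (!buf)
                return 1;

        entry->size = 3;
        entry->offset = sizeof(*entry);
        entry->length = sizeof(calls);
        memcpy(buf + entry->offset, calls, entry->length);

        caller = demo_callers(buf);
        for (i = 0; i < entry->size; i++)
                printf("caller[%d] = 0x%lx\n", i, caller[i]);

        free(buf);
        return 0;
}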

But this event keeps its old layout for backward compatibility: the code in
libtraceevent that parses it doesn't expect an offset/length tuple, and
instead just looked at the "caller[8]" portion and assumed 8 entries (this
is static for all events). New code uses the size field instead and ignores
the [8]. The event meta data gives the actual size of the stored data, so
the parser knows not to go beyond that.
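
As a sketch of that bound (illustrative only; this is not libtraceevent's
actual code, and the struct below is a simplified stand-in): a parser can
trust the recorded size only up to the number of entries that actually fit
in the stored data.

#include <stddef.h>

/* Simplified stand-in for the generated struct stack_entry. */
struct stack_entry_demo {
        int size;
        unsigned long caller[8];        /* the static "[8]" old parsers relied on */
};

/*
 * How many caller entries may safely be read: use the recorded size,
 * but never walk past the data that was actually stored.  'event_len'
 * is the byte length of the stored event data, from the meta data.
 */
int demo_nr_callers(const struct stack_entry_demo *entry, size_t event_len)
{
        size_t avail;

        if (entry->size <= 0 ||
            event_len <= offsetof(struct stack_entry_demo, caller))
                return 0;

        avail = (event_len - offsetof(struct stack_entry_demo, caller)) /
                sizeof(entry->caller[0]);

        return (size_t)entry->size < avail ? entry->size : (int)avail;
}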

Note, this event existed before normal trace events that had the dynamic
arrays, which is why it didn't do the same.

The only thing this code is doing is filling in the ring buffer. The entry
structure is just a helper to know where to place the data in the ring
buffer.

So my question to you is: what's the purpose of hardening? To prevent
future bugs, right? By making a shadow struct, we are now burdened with
remembering to modify that shadow struct whenever we modify the current
structure.

Oh, to further my point, your change is buggy too (and I almost didn't even
notice it!)

> +
> static void __ftrace_trace_stack(struct trace_buffer *buffer,
>                                  unsigned int trace_ctx,
>                                  int skip, struct pt_regs *regs)
> @@ -3116,7 +3124,7 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
>         struct ring_buffer_event *event;
>         unsigned int size, nr_entries;
>         struct ftrace_stack *fstack;
> -       struct stack_entry *entry;
> +       union stack_entry_dynamic *entry;
>         int stackidx;
>
>         /*
> @@ -3155,16 +3163,15 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
>                 nr_entries = stack_trace_save(fstack->calls, size, skip);
>         }
>
> -       size = nr_entries * sizeof(unsigned long);
>         event = __trace_buffer_lock_reserve(buffer, TRACE_STACK,
> -                                   (sizeof(*entry) - sizeof(entry->caller)) + size,
> +                                   struct_size(entry, caller, nr_entries),
>                                     trace_ctx);

Your definition of stack_entry_dynamic is wrong, because stack_entry is
really (as created by the macro, hence why this is error prone):

struct stack_entry {
        struct trace_entry ent;
        int size;
        unsigned long caller[8];
};

So you didn't allocate enough, and are writing into the wrong part of the
data, corrupting it.
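
To actually match the macro-generated layout, the shadow union would have to
mirror the trace_entry header as well. Something like the sketch below
(illustrative only), which is exactly the kind of by-hand duplication that
then has to be kept in sync forever:

union stack_entry_dynamic {
        struct stack_entry entry;
        struct {
                struct trace_entry ent;         /* the header the posted union leaves out */
                int size;
                unsigned long caller[] __counted_by(size);
        };
};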

This is why I much prefer the simple:

        ptr += offsetof(typeof(*entry), caller);
        memcpy(ptr, fstack->calls, size);

which doesn't need to care about staying in sync with the macro magic behind
the structure. That structure may change in the future, and a shadow union
would be one more location that needs to be updated, but would likely be
forgotten.
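
For reference, the write path with the pointer-arithmetic approach looks
roughly like this (reassembled from the pieces quoted in this thread, not a
verbatim copy of the patch):

        /* Reserve room for the header, the size field, and only nr_entries callers. */
        size = nr_entries * sizeof(unsigned long);
        event = __trace_buffer_lock_reserve(buffer, TRACE_STACK,
                                    (sizeof(*entry) - sizeof(entry->caller)) + size,
                                    trace_ctx);
        if (!event)
                goto out;

        ptr = ring_buffer_event_data(event);
        entry = ptr;

        /*
         * Copy via plain pointer arithmetic so that FORTIFY_SOURCE has no
         * fixed caller[8] field size to check the length against.
         */
        ptr += offsetof(typeof(*entry), caller);
        memcpy(ptr, fstack->calls, size);
        entry->size = nr_entries;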

C is fun, let's go shopping!

-- Steve


>         if (!event)
>                 goto out;
>         entry = ring_buffer_event_data(event);
>
> -       memcpy(&entry->caller, fstack->calls, size);
>         entry->size = nr_entries;
> +       memcpy(entry->caller, fstack->calls, flex_array_size(entry, caller, nr_entries));
>
>         if (!call_filter_check_discard(call, entry, buffer, event))
>                 __buffer_unlock_commit(buffer, event);
>