Re: [PATCH v11 2/5] ring-buffer: Introducing ring-buffer mapping functions

From: Steven Rostedt
Date: Thu Jan 11 2024 - 18:22:26 EST


On Thu, 11 Jan 2024 11:34:58 -0500
Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:


> The LTTng kernel tracer has supported mmap'd buffers for nearly 15 years [1],
> and has a lot of similarities with this patch series.
>
> LTTng has the notion of "subbuffer id" to allow atomically exchanging a
> "reader" extra subbuffer with the subbuffer to be read. It implements
> "get subbuffer" / "put subbuffer" ioctls to allow the consumer (reader)
> to move the currently read subbuffer position. [2]
>
> It would not hurt to compare your approach to LTTng and highlight
> similarities/differences, and the rationale for the differences.
>
> Especially when it comes to designing kernel ABIs, it's good to make sure
> that all bases are covered, because those choices will have lasting impacts.
>
> Thanks,
>
> Mathieu
>
> [1] https://git.lttng.org/?p=lttng-modules.git;a=blob;f=src/lib/ringbuffer/ring_buffer_mmap.c
> [2] https://git.lttng.org/?p=lttng-modules.git;a=blob;f=src/lib/ringbuffer/ring_buffer_vfs.c
>

Hi Mathieu,

Thanks for sharing!

As we discussed a little bit in the tracing meeting we do somethings
differently but very similar too ;-)

The similarities as that all the sub-buffers are mapped. You have a
reader-sub-buffer as well.

The main difference is that you use an ioctl() that returns where to find
the reader-sub-buffer, where our ioctl() is just "I'm done, get me a new
reader-sub-buffer". Ours will update the meta page.

Our meta page looks like this:

> +struct trace_buffer_meta {
> + unsigned long entries;
> + unsigned long overrun;
> + unsigned long read;

If start tracing: trace-cmd start -e sched_switch and do:

~ cat /sys/kernel/tracing/per_cpu/cpu0/stats
entries: 14
overrun: 0
commit overrun: 0
bytes: 992
oldest event ts: 84844.825372
now ts: 84847.102075
dropped events: 0
read events: 0

You'll see similar to the above.

entries = entries
overrun = overrun
read = read

The above "read" is total number of events read.

Pretty staight forward ;-)


> +
> + unsigned long subbufs_touched;
> + unsigned long subbufs_lost;
> + unsigned long subbufs_read;

Now I'm thinking we may not want this exported, as I'm not sure it's useful.

Vincent, talking with Mathieu, he was suggesting that we only export what
we really need, and I don't think we need the above. Especially since they
are not exposed in the stats file.


> +
> + struct {
> + unsigned long lost_events;
> + __u32 id;
> + __u32 read;
> + } reader;

The above is definitely needed, as all of it is used to read the
reader-page of the sub-buffer.

lost_events is the number of lost events that happened before this
sub-buffer was swapped out.

Hmm, Vincent, I just notice that you update the lost_events as:

> +static void rb_update_meta_page(struct ring_buffer_per_cpu *cpu_buffer)
> +{
> + struct trace_buffer_meta *meta = cpu_buffer->meta_page;
> +
> + WRITE_ONCE(meta->entries, local_read(&cpu_buffer->entries));
> + WRITE_ONCE(meta->overrun, local_read(&cpu_buffer->overrun));
> + WRITE_ONCE(meta->read, cpu_buffer->read);
> +
> + WRITE_ONCE(meta->subbufs_touched, local_read(&cpu_buffer->pages_touched));
> + WRITE_ONCE(meta->subbufs_lost, local_read(&cpu_buffer->pages_lost));
> + WRITE_ONCE(meta->subbufs_read, local_read(&cpu_buffer->pages_read));
> +
> + WRITE_ONCE(meta->reader.read, cpu_buffer->reader_page->read);
> + WRITE_ONCE(meta->reader.id, cpu_buffer->reader_page->id);
> + WRITE_ONCE(meta->reader.lost_events, cpu_buffer->lost_events);
> +}

The lost_events may need to be handled differently, as it doesn't always
get cleared. So it may be stale data.


> +
> + __u32 subbuf_size;
> + __u32 nr_subbufs;

This gets is the information needed to read the mapped ring buffer.

> +
> + __u32 meta_page_size;
> + __u32 meta_struct_len;

The ring buffer gets mapped after "meta_page_size" and this structure is
"meta_struct_len" which will change if we add new data to it.

> +};

-- Steve