Re: [PATCH] tracing: Fix race when concurrently splice_read trace_pipe

From: Steven Rostedt
Date: Wed Aug 16 2023 - 15:23:55 EST


On Sat, 12 Aug 2023 10:22:43 +0800
Zheng Yejian <zhengyejian1@xxxxxxxxxx> wrote:

>
> >>>> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> >>>> index b8870078ef58..f169d33b948f 100644
> >>>> --- a/kernel/trace/trace.c
> >>>> +++ b/kernel/trace/trace.c
> >>>> @@ -7054,14 +7054,16 @@ static ssize_t tracing_splice_read_pipe(struct file *filp,
> >>>> if (ret <= 0)
> >>>> goto out_err;
> >>>>
> >>>> - if (!iter->ent && !trace_find_next_entry_inc(iter)) {
> >>>> + trace_event_read_lock();
> >>>> + trace_access_lock(iter->cpu_file);
> >>>> +
> >>>> + if (!trace_find_next_entry_inc(iter)) {
> >>>
> >>> It seems you skips '!iter->ent' check. Is there any reason for this change?
> >>
> >> IIUC, 'iter->ent' may be the entry that was found but not consumed
> >> in last call tracing_splice_read_pipe(), and in this call, 'iter->ent'
> >> may have being consumed, so we may should find a new 'iter->ent' before
> >> printing it in tracing_fill_pipe_page(), see following reduced codes:
> >
> > And if it wasn't consumed? We just lost it?
>
> If 'iter->ent' was not consumed, trace_find_next_entry_inc() will find
> it again, will it?
>
> -- Zheng Yejian
>
> >
> >>
> >> tracing_splice_read_pipe() {
> >> if (!iter->ent && !trace_find_next_entry_inc(iter)) { // 1. find
> >> entry here
> >> ... ...
> >> }
> >> tracing_fill_pipe_page() {
> >> for (;;) {
> >> ... ...
> >> ret = print_trace_line(iter); // 2. print entry
> >> ... ...
> >
> > You missed:
> >
> > count = trace_seq_used(&iter->seq) - save_len;
> > if (rem < count) {
> > rem = 0;
> > iter->seq.seq.len = save_len;
> >
> > Where the above just threw away what was printed in the above
> > "print_trace_line()", and it never went to console.
> >
> > break;
> > }
> >
>
> Thanks for pointing this out!

Just to get this moving again, I believe we should add a ref count to
trace_pipe and the per_cpu trace_pipes, where if they are opened, nothing else can read it.

Opening trace_pipe locks all per_cpu ref counts, if any of them are open,
then the trace_pipe open will fail (and releases any ref counts it had
taken).

Opening a per_cpu trace_pipe will up the ref count for just that CPU
buffer. This will allow multiple tasks to read different per_cpu trace_pipe
files, but will prevent the main trace_pipe file from being opened.

Does that work for this?

-- Steve