Re: [RFC] convert ftrace syscall tracer to TRACE_EVENT()

From: Ingo Molnar
Date: Sat May 09 2009 - 04:38:24 EST



* Jason Baron <jbaron@xxxxxxxxxx> wrote:

> Hi,
>
> I've been thinking about converting the current ftrace syscall
> tracer to the TRACE_EVENT() macros. There are a few issues with
> the current syscall tracer approach:
>
> 1) It has to be enabled for all processes and all syscalls. By
> moving to TRACE_EVENT(), it can be enabled/disabled per tracepoint
> and can also make use of the generic tracing filters, such as
> "trace all process for pid x"
>
> 2) Other tracers cannot tie into it, since it's not tracepoint
> based. TRACE_EVENT() fixes this.
>
> 3) Data formatting. I don't believe the syscall tracer understands
> all the various types for output formatting. By moving to
> TRACE_EVENT(), we can print out a more readable syscall trace.
>
> 4) The ftrace syscall tracer needs new arch-specific code for
> each architecture. By converting to TRACE_EVENT() we don't need
> any architecture-specific code.
>
> Other issues to consider:
>
> * Maintenance. The current syscall tracer automatically picks up
> new syscalls. The TRACE_EVENT() approach will be harder to set up
> initially, but once it's done, syscalls are obviously not added
> often. So I don't think this will be too bad.
>
> * Performance. The current syscall tracer adds a
> 'test_thread_flag()' to syscall entry/exit. The TRACE_EVENT()
> would add a per-syscall global to check. So they are going to have
> different cache profiles...however, the tracepoint infrastructure
> is hopefully moving to the 'immediate' value work, which will make
> this more highly optimized.
>
> I've also tested the patch shown below (which uses
> DECLARE_TRACE() as a preliminary proof of concept), using
> getpid() in a loop, and tbench, and saw very small performance
> differences. Obviously we would have to do more extensive testing
> before deciding.
>
> Patch is pretty rough, but should give a rough sense of what the
> DECLARE_TRACE() type patch might look like...

Yeah, I very much agree with the direction. (I've Cc:-ed Tom Zanussi
who also has expressed interest in this.)

I'm not sure about the implementation as you've posted it though:

Firstly, it adds two new tracepoints to every system call. That is
unnecessary - we already have the TIF flag based callbacks, and we
can use the existing syscall attributes table to get to tracepoints
- without slowing down (or otherwise impacting) the fast path in any
way.
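
To make the first point concrete, here is a minimal userspace sketch
of the idea, not kernel code. Every name in it (demo_syscall_entry,
demo_attr, DEMO_TIF_SYSCALL_TRACE and so on) is hypothetical and only
stands in for the thread flag and the per-syscall attributes table. It
assumes the entry path tests a single per-thread flag, and only when
that flag is set does it look up the per-syscall entry to decide
whether an event fires.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
#define NR_DEMO_SYSCALLS        3
#define DEMO_TIF_SYSCALL_TRACE  0x1u

/* One attributes entry per syscall number. */
struct demo_attr {
    const char *name;
    bool enabled;          /* per-syscall event switched on? */
    unsigned long hits;    /* how many times the event fired */
};

static struct demo_attr demo_attr[NR_DEMO_SYSCALLS] = {
    { "getpid", false, 0 },
    { "read",   false, 0 },
    { "write",  false, 0 },
};

struct demo_thread {
    unsigned int flags;    /* stands in for the thread-info flags */
};

/* Slow path: reached only when the thread flag is set. */
static void demo_trace_enter(int nr)
{
    if (nr < 0 || nr >= NR_DEMO_SYSCALLS)
        return;
    if (demo_attr[nr].enabled)
        demo_attr[nr].hits++;
}

/* Syscall entry: untraced threads pay for one flag test and nothing else. */
static void demo_syscall_entry(const struct demo_thread *t, int nr)
{
    if (t->flags & DEMO_TIF_SYSCALL_TRACE)
        demo_trace_enter(nr);
}
```

With the flag clear, demo_syscall_entry() never touches the table, so
per-syscall enabling costs nothing extra on the untraced path in this
model.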

Secondly, we should reuse the information we get in SYSCALL_DEFINE
to construct the TRACE_EVENT tracepoints directly - without having
to list all syscalls again in a separate file.
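
As a rough illustration of the second point, the sketch below shows
how a single definition macro could emit both the handler and a
description of its arguments. It is plain userspace C, and the macro
DEMO_SYSCALL_DEFINE2 and struct demo_sc_info are made up for this
example; the real SYSCALL_DEFINE macros are not reproduced here.

```c
/* Hypothetical per-syscall description generated by the macro below. */
struct demo_sc_info {
    const char *name;
    int nargs;
    const char *arg_types[2];
    const char *arg_names[2];
};

/*
 * One definition site yields two things: a metadata object built by
 * stringifying the name, types and argument names, and the handler
 * function itself. Nothing has to be listed a second time elsewhere.
 */
#define DEMO_SYSCALL_DEFINE2(sname, t1, a1, t2, a2)             \
    static const struct demo_sc_info demo_info_##sname = {       \
        #sname, 2, { #t1, #t2 }, { #a1, #a2 }                     \
    };                                                            \
    static long demo_sys_##sname(t1 a1, t2 a2)

/* A toy "syscall" defined through the macro. */
DEMO_SYSCALL_DEFINE2(add, int, a, int, b)
{
    return (long)a + b;
}
```

A tracer could then read demo_info_add to learn that the call takes
two int arguments named a and b, which is the kind of information an
event definition needs for output formatting.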

Ingo