Re: [RFC PATCH -tip 0/2] kprobes: A trial to reuse graph-tracer's return stack for kretprobe

From: Masami Hiramatsu
Date: Wed Jan 24 2018 - 20:49:28 EST


On Wed, 24 Jan 2018 12:04:55 -0500
Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:

>
> Hi Masami,
>
> I just came across this patch set (buried deep in my INBOX). Are you
> still doing anything with this?

Ah, we had a talk on this topic at the last Plumbers, and (IIRC) you
were going to extend the ftrace API so that it can support a return
hook, which kretprobe can piggyback on. So I'm waiting for that
enhancement, or could I try it myself? :)

Thank you,

>
> -- Steve
>
>
> On Tue, 22 Aug 2017 00:40:05 +0900
> Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:
>
> > Hello,
> >
> > Here is a feasibility-study patch that uses the function_graph
> > tracer's per-thread return stack for storing kretprobe
> > return addresses as a fast path.
> >
> > Currently kretprobe has its own instance hash-list for storing
> > return addresses. However, that introduces a spin-lock per
> > hash-list entry and compels users to estimate how many probes
> > run concurrently (and set that number in kretprobe->maxactive,
> > as in the module sketch below).
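> >
> > (For reference, here is a minimal module in the style of
> > samples/kprobes/kretprobe_example.c, showing where maxactive
> > comes in; the probed symbol, the handler, and the value 10 are
> > just examples.)
> >
> > #include <linux/module.h>
> > #include <linux/kprobes.h>
> >
> > /* Called when vfs_write returns. */
> > static int ret_handler(struct kretprobe_instance *ri,
> > 		       struct pt_regs *regs)
> > {
> > 	pr_info("vfs_write returned %ld\n",
> > 		(long)regs_return_value(regs));
> > 	return 0;
> > }
> >
> > static struct kretprobe my_kretprobe = {
> > 	.kp.symbol_name	= "vfs_write",
> > 	.handler	= ret_handler,
> > 	/* Users must guess concurrency up front; if more than 10
> > 	 * instances are live at once, further hits are missed. */
> > 	.maxactive	= 10,
> > };
> >
> > static int __init kretprobe_init(void)
> > {
> > 	return register_kretprobe(&my_kretprobe);
> > }
> >
> > static void __exit kretprobe_exit(void)
> > {
> > 	unregister_kretprobe(&my_kretprobe);
> > 	pr_info("missed %d instances\n", my_kretprobe.nmissed);
> > }
> >
> > module_init(kretprobe_init);
> > module_exit(kretprobe_exit);
> > MODULE_LICENSE("GPL");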
> >
> > To solve this issue, this series reuses function_graph's
> > per-thread ret_stack for kretprobes as a fast path instead of
> > the hash-list where possible. Note that if the kretprobe has a
> > custom entry_handler and stores data in kretprobe_instance, we
> > cannot use the fast path, since the current per-thread return
> > stack is of fixed size. (That feature is used by some
> > systemtap scripts.) The selection logic is roughly as sketched
> > below.
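> >
> > (Illustrative sketch only; push_kretprobe_ret_stack() and
> > kretprobe_slow_path() are made-up names, not the actual
> > functions in the patch.)
> >
> > static int pre_handler_kretprobe(struct kprobe *p,
> > 				 struct pt_regs *regs)
> > {
> > 	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
> >
> > 	/*
> > 	 * entry_handler users keep per-instance data in
> > 	 * kretprobe_instance, which does not fit in the fixed-size
> > 	 * per-thread ret_stack, so they must take the slow path.
> > 	 */
> > 	if (!rp->entry_handler &&
> > 	    push_kretprobe_ret_stack(rp, regs) == 0)
> > 		return 0;			/* fast path */
> >
> > 	return kretprobe_slow_path(rp, regs);	/* hash-list + lock */
> > }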
> >
> > This series also includes a patch showing the missed count of
> > kretprobes via ftrace's kprobe_profile interface, which I had
> > posted this March. That is required for the test case below.
> > (Without it, we cannot see any kretprobe miss count.)
> >
> > Usage
> > =====
> > Note that this is just feasibility-study code, and since the
> > per-thread ret_stack is initialized only when the
> > function_graph tracer is enabled, you have to perform the
> > following operations to enable it.
> >
> > # echo '*' > <tracefs>/set_graph_notrace
> > # echo function_graph > <tracefs>/current_tracer
> >
> > After that, add a kretprobe event with just 1 instance (we
> > don't use it anyway).
> >
> > # echo r1 vfs_write > <tracefs>/kprobe_events
> > # echo 1 > <tracefs>/events/kprobes/enable
> >
> > And run the "yes" command concurrently:
> >
> > # for i in {0..31}; do yes > /dev/null & done
> > # cat <tracefs>/kprobe_profile
> > r_vfs_write_0 4756473 0
> >
> > Then you will see that the miss count (the last column) is
> > zero. Currently, this feature is disabled when the
> > function_graph tracer is stopped, so if you set the nop tracer
> > as below,
> >
> > # echo nop > <tracefs>/current_tracer
> >
> > Then you'll see the miss count increase.
> >
> > # cat <tracefs>/kprobe_profile
> > r_vfs_write_0 7663462 238537
> >
> > This may improve the performance of kretprobes, but I haven't
> > benchmarked it yet.
> >
> >
> > TODO
> > ====
> > This is just feasibility-study code, and I haven't tested it
> > deeply; it may still have some bugs. Anyway, if it looks good,
> > I would like to split the per-thread return stack code out of
> > ftrace and make it a new generic feature (e.g.
> > CONFIG_THREAD_RETURN_STACK) so that both kprobes and ftrace
> > can share it. It may also be better to make return-stack
> > allocation a direct call instead of an event handler. A rough
> > sketch of such an interface follows.
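> >
> > (Something like the following, with entirely made-up names, is
> > what I have in mind; none of these symbols exist yet.)
> >
> > #ifdef CONFIG_THREAD_RETURN_STACK
> > /* Allocate t->ret_stack on first use (a direct call, not an
> >  * ftrace event handler). */
> > int thread_ret_stack_alloc(struct task_struct *t);
> >
> > /* Push a shadow return address; returns the stack index, or
> >  * -EBUSY if the fixed-size stack is full. */
> > int thread_ret_stack_push(unsigned long ret, unsigned long func,
> > 			  unsigned long frame_pointer);
> >
> > /* Pop the matching entry; returns the original return address. */
> > unsigned long thread_ret_stack_pop(unsigned long frame_pointer);
> > #endif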
> >
> > Any comment?
> >
> > Thank you,
> >
> > ---
> >
> > Masami Hiramatsu (2):
> > trace: kprobes: Show sum of probe/retprobe nmissed count
> > kprobes/x86: Use graph_tracer's per-thread return stack for kretprobe
> >
> >
> > arch/x86/kernel/kprobes/core.c | 95 ++++++++++++++++++++++++++++++++++
> > include/linux/ftrace.h | 3 +
> > kernel/kprobes.c | 11 ++++
> > kernel/trace/trace_functions_graph.c | 5 +-
> > kernel/trace/trace_kprobe.c | 2 -
> > 5 files changed, 112 insertions(+), 4 deletions(-)
> >
> > --
> > Masami Hiramatsu (Linaro) <mhiramat@xxxxxxxxxx>
>


--
Masami Hiramatsu <mhiramat@xxxxxxxxxx>