Re: [RFC][PATCH 0/3] ftrace: Add dynamically allocated trampolines

From: Oleg Nesterov
Date: Wed Jul 23 2014 - 08:10:17 EST


On 07/22, Steven Rostedt wrote:
>
> On Tue, 22 Jul 2014 18:47:07 +0200
> Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>
> > On 07/03, Steven Rostedt wrote:
> >
> > > The way the function callback mechanism works in ftrace is that if there's
> > > only one function callback registered, it will set the mcount/fentry
> > > trampoline to call that function directly. But as soon as you register
> > > another callback, the mcount trampoline calls a loop function that iterates
> > > over all the registered callbacks (ftrace_ops) checking their hash tables
> > > to see if the called function matches the ops before calling its callback.
> > > This happens even if the two registered functions are not even tracing
> > > the same function!
> > >
> > > This really sucks if you are tracing all functions, and then add a kprobe
> > > or perf event that traces a single function. That will cause all the
> > > other functions being traced to perform the loop test.
> >
> > But this is even worse or I missed something? I mean, currently even
> > if you trace nothing and then add a KPROBE_FLAG_FTRACE kprobe, then
> > kprobe_ftrace_handler() is called by ftrace_ops_list_func() ?
>
> It shouldn't be. It should get called directly from the trampoline. The
> allocated trampoline should never call the list op. Well, it might
> during the conversion for safety, but after that, trampolines should
> only call the registered ftrace_ops->func directly.

I meant the current code (I am reading 3.16-rc2). Even if we have a single
KPROBE_FLAG_FTRACE kprobe, kprobe_ftrace_handler() won't be called directly.

Or I misunderstood your reply? Just in case, let me check...

With this stupid patch

--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -4464,6 +4464,7 @@ __ftrace_ops_list_func(unsigned long ip, unsigned long parent_ip,
printk("op=%p %pS\n", op, op);
goto out;
}
+ pr_crit("LIST_FUNC -> %pf()\n", op->func);
op->func(ip, parent_ip, op, regs);
}
} while_for_each_ftrace_op(op);

I do
# cd /sys/kernel/debug/tracing/
# echo "p:xx SyS_prctl+0x1c" >| kprobe_events
# cat ../kprobes/list
ffffffff81056c4c k SyS_prctl+0x1c [DISABLED][FTRACE]
# echo 1 >| events/kprobes/xx/enable
#
# perl -e 'syscall 157,-1'
# dmesg
LIST_FUNC -> kprobe_ftrace_handler()

so it is really called by the loop test code.

And I guess that after your patches kprobe_ftrace_handler() should be called
from the trampoline in this case.

> > ftrace_save_ops_tramp_hash():
> >
> > do_for_each_ftrace_rec(pg, rec) {
> > if (ftrace_rec_count(rec) == 1 &&
> > ftrace_ops_test(ops, rec->ip, rec)) {
> >
> > /* This record had better have a trampoline */
> > if (FTRACE_WARN_ON(!(rec->flags & FTRACE_FL_TRAMP_EN)))
> > return -1;
> >
> > Yes, but I can't understand how this can work.
>
> I wanted the back to 1 case to happen after we get the up to one case
> working. That is, I don't want to worry about it now ;-) As you can
> see, this code has enough things to try to keep straight without adding
> more complexity to the mix.

Yes, I see... but note that this WARN_ON() looks wrong in any case. At
least currently.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/