Re: [PATCH RFC] ftrace: Show all functions with addresses in available_filter_functions_addrs

From: Andrii Nakryiko
Date: Thu Jun 08 2023 - 19:55:59 EST


On Thu, Jun 8, 2023 at 4:27 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Thu, 8 Jun 2023 15:43:03 -0700
> Andrii Nakryiko <andrii.nakryiko@xxxxxxxxx> wrote:
>
> > On Thu, Jun 8, 2023 at 2:26 PM Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> > >
> > >
> > > hi,
> > > when ftrace based tracers we need to cross check available_filter_functions
> > > with /proc/kallsyms. For example for kprobe_multi bpf link (based on fprobe)
> > > we need to make sure that symbol regex resolves to traceable symbols and
> > > that we get proper addresses for them.
>
> I forgot, what was the problem with doing the above?

More code, more memory, more CPU to parse all the text files. Parsing
kallsyms is quite expensive, so avoiding this would be great.

>
> > >
> > > Looks like on the last last LSF/MM/BPF there was an agreement to add new
> > > file that will have available_filter_functions symbols plus addresses.
> > >
> > > This RFC is to kick off the discussion, I'm not sure Steven wants to do
> > > that differently ;-)
>
> I'm not totally against this, but I'd like to know the full issue its
> solving. Perhaps I need to know more about what is being done, and what is
> needed too.

There are BPF tools that allow user to specify regex/glob of kernel
functions to attach to. This regex/glob is checked against
available_filter_functions to check which functions are traceable. All
good. But then also it's important to have corresponding memory
addresses for selected functions (for many reasons, e.g., to have
non-ambiguous and fast attachment by address instead of by name, or
for some post-processing based on captured IP addresses, etc). And
that means that now we need to also parse /proc/kallsyms and
cross-join it with data fetched from available_filter_functions.

All this is unnecessary if avalable_filter_functions would just
provide function address in the first place. It's a huge
simplification. And saves memory and CPU.

>
> > >
> > > thanks,
> > > jirka
> > >
> > >
> > > ---
> > > Adding new available_filter_functions_addrs file that shows all available
> > > functions (same as available_filter_functions) together with addresses,
> > > like:
> > >
> > > # cat available_filter_functions_addrs | head
> >
> > nit: can we have some more succinct name, like "traceable_funcs" or
>
>
> It's to match avaliable_filter_functions

it's minor, I'm fine with whatever name, I'm searching for it in my
history every single time anyways :)

>
> Another way is to add a tracing option to make the address show up in the
> available_filter_functions file. That would be my preferred choice.
>
> echo 1 > options/available_filter_addrs
>
> Or something like that.

This would modify behavior for entire system, right? I think this is
very bad. Just because one application is aware of this option and
wants to turn this on, doesn't mean that all other applications that
might also use available_filter_functions should immediately break on
that machine.

Please, let's have a separate file. There is no downside to that.

>
>
>
> > something? And btw, does this have to be part of tracefs/debugfs
>
> Because it's part of ftrace, and that belongs in tracefs.

I can use ftrace (through BPF) without mounting tracefs, right? So it
would be good to have a list of attachable kprobes without having to
worry whether tracefs was mounted or not. It's no big deal, I was just
curious if there has to be a tie to tracefs.

>
> > (never knew the difference, sorry). E.g., can it be instead exposed
> > through sysfs?
>
> tracefs is not debugfs, as debugfs includes all things debuggy (and
> considered not secure). tracefs is its own file system dedicated to the
> tracing code in the kernel. It exists with CONFIG_DEBUG not defined, and
> lives in /sys/kernel/tracing. The only reason /sys/kernel/debug/tracing
> (which is a duplicate mount point) exists is for backward compatibility for
> before tracefs existed. But that path really should be deprecated.

cool, thanks for explaining!

>
> >
> > Either than these minor things, yep, I think this is something that
> > would be extremely useful, thanks, Jiri, for taking a stab at it!
> >
> > > ffffffff81000770 __traceiter_initcall_level
> > > ffffffff810007c0 __traceiter_initcall_start
> > > ffffffff81000810 __traceiter_initcall_finish
> > > ffffffff81000860 trace_initcall_finish_cb
> > > ...
> > >
> > > It's useful to have address avilable for traceable symbols, so we don't
> > > need to allways cross check kallsyms with available_filter_functions
> > > (or the other way around) and have all the data in single file.
>
> Is it really that big of an issue? Again, I'm not against this change, but
> I'm just wondering how much of a burden is it relieving?

Quite a lot, especially when you have to do all that in pure C.

>
> > >
> > > For backwards compatibility reasons we can't change the existing
> > > available_filter_functions file output, but we need to add new file.
>
> Or we could add an option to change it ;-)
>
> > >
> > > Suggested-by: Steven Rostedt (Google) <rostedt@xxxxxxxxxxx>
> > > Suggested-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
> > > Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> > > ---
> > > include/linux/ftrace.h | 1 +
> > > kernel/trace/ftrace.c | 52 ++++++++++++++++++++++++++++++++++++++----
> > > 2 files changed, 48 insertions(+), 5 deletions(-)
> > >

[...]