Re: [PATCH v10 perf, bpf-next 1/9] perf, bpf: Introduce PERF_RECORD_KSYMBOL

From: Arnaldo Carvalho de Melo
Date: Thu Jan 17 2019 - 09:59:00 EST


Em Thu, Jan 17, 2019 at 02:49:10PM +0000, Song Liu escreveu:
>
>
> > On Jan 17, 2019, at 4:56 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> >
> > On Wed, Jan 16, 2019 at 08:29:23AM -0800, Song Liu wrote:
> >> For better performance analysis of dynamically JITed and loaded kernel
> >> functions, such as BPF programs, this patch introduces
> >> PERF_RECORD_KSYMBOL, a new perf_event_type that exposes kernel symbol
> >> register/unregister information to user space.
> >>
> >> The following data structure is used for PERF_RECORD_KSYMBOL.
> >>
> >> /*
> >> * struct {
> >> * struct perf_event_header header;
> >> * u64 addr;
> >> * u32 len;
> >> * u16 ksym_type;
> >> * u16 flags;
> >> * char name[];
> >> * struct sample_id sample_id;
> >> * };
> >> */
> >
> > So I've cobbled together the attached patches to see how it would work
> > out..
> >
> > I didn't convert ftrace trampolines; because ftrace code has this
> > uncanny ability to make my head hurt. But I don't think it should be
> > hard, once one figures out the right structure to stick that
> > kallsym_node thing in (ftrace_ops ?!).
> >
> > It is compiled only, so no testing what so ever (also, no changelogs).
> >
> > I didn't wire up the KSYM_TYPE thing; I'm wondering if we really need
> > that, OTOH it really doesn't hurt having it either.
> >
> > One weird thing I noticed, wth does bpf_prog_kallsyms_add() check
> > CAP_ADMIN ?! Surely even a non-priv JIT'ed program generates symbols,
> > why hide those?
> >
> > Anyway; with the one nit about the get_names() thing sorted:
> >
> > Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> >
> > (thanks for sticking with this)
> > <peterz-latch-next.patch><peterz-kallsym.patch><peterz-kallsym-bpf.patch>
>
> Aha, now I get the point on perf_event_ksymbol(). Yeah this approach is
> definitely better.
>
> While I run more tests with these patches, could we get current in
> perf/core? This will enable the development of user space tools like
> bcc.
>
> Also, I current base this set on bpf-next tree, as tip/perf/core is
> 4 week old. Shall I rebase the set on Linus' tree? Or shall I wait for
> tip/perf/core?

So, can you post one last set, this time with PeterZ's Acked-by,
assuming you're sorting out the get_names() thing, and I can try merging
this into my perf/core branch, then pushing it out to Ingo, my perf/core
starts from tip/perf/urgent, so should be new enough.

I'd then right after testing it send a pull request to Ingo, synching
everything.

- Arnaldo