Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

From: Namhyung Kim
Date: Mon Jul 13 2015 - 10:12:02 EST


On Mon, Jul 13, 2015 at 10:01:26PM +0800, pi3orama wrote:
>
>
> Sent from my iPhone
>
> > On Jul 13, 2015, at 9:52 PM, Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> >
> > Hi,
> >
> >> On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
> >> hi, Alexei
> >>
> >>> On 2015/7/11 6:10, Alexei Starovoitov wrote:
> >>>> On 7/10/15 3:03 AM, He Kuang wrote:
> >>>> There are scenarios where we need an eBPF program to record not only
> >>>> kprobe point args, but also the PMU counters, time latencies or the
> >>>> number of cache misses between two probe points and other information
> >>>> when the probe point is entered.
> >>>>
> >>>> This patch adds a new trace event to establish infrastructure for bpf to
> >>>> output data to perf. Userspace perf tools can detect and use this event
> >>>> in the same way as the existing tracepoint events.
> >>>>
> >>>> New bpf trace event entry in debugfs:
> >>>>
> >>>> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> >>>>
> >>>> Userspace perf tools detect the new tracepoint event as:
> >>>>
> >>>> bpf:bpf_output_data [Tracepoint event]
> >>>
> >>> Nice! This approach looks cleanest so far.
> >>>
> >>>> +TRACE_EVENT(bpf_output_data,
> >>>> +
> >>>> + TP_PROTO(u64 *src, int len),
> >>>> +
> >>>> + TP_ARGS(src, len),
> >>>> +
> >>>> + TP_STRUCT__entry(
> >>>> + __dynamic_array(u64, buf, len)
> >>>> + ),
> >>>> +
> >>>> + TP_fast_assign(
> >>>> + memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
> >>>
> >>> Maybe make it a 'u8' array? The extra multiply and...
> >>
> >> OK
> >>
> >> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
> >> 0x623eb6d) will be this:
> >>
> >> dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
> >> f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
> >>
> >> And users are not restricted to u64 type elements. I'll change that.
> >
> > While this general event format works well, I think it might be hard
> > to know which output came from which program when more than one bpf
> > program is used.
> >
> > I was thinking about providing custom event formats for each bpf
> > program (if needed). The event format definitions might be in a
> > specific directory or a bpf object itself. Then perf can read those
> > formats and print the output data according to the formats. Maybe we
> > need to add some dynamic event id to match format and data.
> >
>
> I think we can do it on the perf side. Let BPF programs themselves
> encode format information into the array, and have perf read and
> decode it. On the kernel side, simply supporting raw data should be
> enough, so we can keep the kernel code as simple as possible.

Yes, of course. I also meant doing all of that work on the perf side. :)
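
To illustrate what perf-side decoding of the raw u8 payload could look
like, here is a minimal sketch. It assumes the payload was written on a
little-endian machine, as the dump quoted above suggests. The helper names
decode_le_u64() and decode_u64_payload() are hypothetical and do not exist
in perf or in the patch.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of perf): read one u64 stored in
 * little-endian byte order, independent of the host's endianness.
 */
static uint64_t decode_le_u64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	/* Start at the most significant byte and shift downwards. */
	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Hypothetical helper: decode up to 'max' u64 values from a raw payload
 * of 'len' bytes. Returns the number of values written to 'out'.
 * Trailing bytes that do not fill a whole u64 are ignored.
 */
static size_t decode_u64_payload(const uint8_t *buf, size_t len,
				 uint64_t *out, size_t max)
{
	size_t n = len / sizeof(uint64_t);
	size_t i;

	if (n > max)
		n = max;
	for (i = 0; i < n; i++)
		out[i] = decode_le_u64(buf + i * sizeof(uint64_t));
	return n;
}
```

Feeding the 24 bytes from the dump above into decode_u64_payload() should
give back the three original integers. A real decoder would first have to
learn from some format information that the payload holds u64 values; that
part is left out here.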

Thanks,
Namhyung