Re: [RFC PATCH v4 3/3] bpf: Introduce function for outputing data to perf event

From: Namhyung Kim
Date: Mon Jul 13 2015 - 09:54:41 EST


Hi,

On Mon, Jul 13, 2015 at 12:36:27PM +0800, He Kuang wrote:
> hi, Alexei
>
> On 2015/7/11 6:10, Alexei Starovoitov wrote:
> >On 7/10/15 3:03 AM, He Kuang wrote:
> >>There're scenarios that we need an eBPF program to record not only
> >>kprobe point args, but also the PMU counters, time latencies or the
> >>number of cache misses between two probe points and other information
> >>when the probe point is entered.
> >>
> >>This patch adds a new trace event to establish infrastructure for bpf to
> >>output data to perf. Userspace perf tools can detect and use this event
> >>in the same way as the existing tracepoint events.
> >>
> >>New bpf trace event entry in debugfs:
> >>
> >> /sys/kernel/debug/tracing/events/bpf/bpf_output_data
> >>
> >>Userspace perf tools detect the new tracepoint event as:
> >>
> >> bpf:bpf_output_data [Tracepoint event]
> >
> >Nice! This approach looks cleanest so far.
> >
> >>+TRACE_EVENT(bpf_output_data,
> >>+
> >>+ TP_PROTO(u64 *src, int len),
> >>+
> >>+ TP_ARGS(src, len),
> >>+
> >>+ TP_STRUCT__entry(
> >>+ __dynamic_array(u64, buf, len)
> >>+ ),
> >>+
> >>+ TP_fast_assign(
> >>+ memcpy(__get_dynamic_array(buf), src, len * sizeof(u64));
> >
> >may be make it 'u8' array? The extra multiply and...
>
> OK
>
> So the output of three u64 integers (e.g. 0x2060572485, 0x20667b0ff2,
> 0x623eb6d) will be this:
>
> dd 994 [000] 139.158180: bpf:bpf_output_data: 85 24 57 60 20 00 00 00
> f2 0f 7b 66 20 00 00 00 6d eb 23 06 00 00 00 00
>
> And users are not restricted to u64 type elements. I'll change that.

While this general event format works well, I think it might be hard
to tell which output came from which program when more than one bpf
program is in use.
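
To make the concern concrete, here is a minimal sketch that decodes the
sample payload quoted above back into its three u64 values. It assumes a
little-endian byte layout (which the quoted dump matches); decode_le64 and
sample_payload are hypothetical names used only for illustration. Nothing
in the decoded bytes says which bpf program produced them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rebuild one u64 from 8 little-endian bytes. */
static uint64_t decode_le64(const uint8_t *p)
{
	uint64_t v = 0;

	/* Walk from the most significant byte (index 7) down to index 0. */
	for (int i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* The 24 payload bytes from the sample bpf:bpf_output_data line. */
static const uint8_t sample_payload[24] = {
	0x85, 0x24, 0x57, 0x60, 0x20, 0x00, 0x00, 0x00,
	0xf2, 0x0f, 0x7b, 0x66, 0x20, 0x00, 0x00, 0x00,
	0x6d, 0xeb, 0x23, 0x06, 0x00, 0x00, 0x00, 0x00,
};
```

Calling decode_le64() at offsets 0, 8 and 16 of sample_payload gives back
the three integers from the example, but only because the reader already
knows the payload is three u64 values.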

I was thinking about providing custom event formats for each bpf
program (if needed). The event format definitions might live in a
specific directory or in the bpf object itself. Then perf can read those
formats and print the output data according to them. Maybe we
need to add some dynamic event id to match each format to its data.
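
As a rough sketch of that matching step, assuming each output record
starts with a 32-bit dynamic event id: struct bpf_output_format,
record_id() and find_format() below are all hypothetical and do not
correspond to any existing perf or kernel interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-program format descriptor (not an existing structure). */
struct bpf_output_format {
	uint32_t id;        /* dynamic event id, assumed unique per program */
	const char *prog;   /* name of the bpf program owning this format */
	const char *fields; /* illustrative description of the payload layout */
};

/* Assumption: every record begins with the 32-bit id, then the payload. */
static uint32_t record_id(const uint8_t *rec)
{
	uint32_t id;

	memcpy(&id, rec, sizeof(id)); /* memcpy avoids an unaligned load */
	return id;
}

/* Linear lookup of the format whose id matches; NULL when none is known. */
static const struct bpf_output_format *
find_format(const struct bpf_output_format *tbl, size_t n, uint32_t id)
{
	for (size_t i = 0; i < n; i++)
		if (tbl[i].id == id)
			return &tbl[i];
	return NULL;
}
```

A reader of the data would call record_id() on each record and pass the
result to find_format(); a NULL return could fall back to the raw hex
dump that the generic event prints today.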

Thanks,
Namhyung