Re: [PATCH bpf-next 1/2] bpf/perf: Call perf_prepare_sample() before bpf_prog_run()

From: Namhyung Kim
Date: Thu Dec 22 2022 - 17:26:07 EST


On Thu, Dec 22, 2022 at 12:16 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Thu, Dec 22, 2022 at 09:34:42AM -0800, Namhyung Kim wrote:
>
> > Sorry about that. Let me rephrase it like below:
> >
> > With bpf_cast_to_kern_ctx(), BPF programs attached to a perf event
> > can access perf sample data directly from the ctx.
>
> This is the bpf_prog_run() in bpf_overflow_handler(), right?

Yes.

>
> > But the perf sample
> > data is not fully prepared at this point, and some fields can have invalid
> > uninitialized values. So it needs to call perf_prepare_sample() before
> > calling the BPF overflow handler.
>
> It never was, why is it a problem now?

BPF used to allow selected fields only like period and addr, and they
are initialized always by perf_sample_data_init(). This is relaxed
by the bpf_cast_to_kern_ctx() and it can easily access arbitrary
fields of perf_sample_data now.

The background of this change is to use BPF as a filter for perf
event samples. The code is there already and returning 0 from
BPF can drop perf samples. With access to more sample data,
it'd make more educated decisions.

For example, I got some requests to limit perf samples in a
selected region of address (code or data). Or it can collect
samples only if some hardware specific information is set in
the raw data like in AMD IBS. We can easily extend it to other
sample info based on users' needs.

>
> > But just calling perf_prepare_sample() can be costly when the BPF
>
> So you potentially call it twice now, how's that useful?

Right. I think we can check data->sample_flags in
perf_prepare_sample() to minimize the duplicate work.
It already does it for some fields, but misses others.

Thanks,
Namhyung