RE: [PATCH v5 bpf-next 1/3] perf: enable branch record for software events

From: John Fastabend
Date: Thu Sep 02 2021 - 16:49:53 EST


Song Liu wrote:
> The typical way to access branch record (e.g. Intel LBR) is via hardware
> perf_event. For CPUs with FREEZE_LBRS_ON_PMI support, PMI could capture
> reliable LBR. On the other hand, LBR could also be useful in non-PMI
> scenario. For example, in kretprobe or bpf fexit program, LBR could
> provide a lot of information on what happened with the function. Add API
> to use branch record for software use.
>
> Note that, when the software event triggers, it is necessary to stop the
> branch record hardware asap. Therefore, static_call is used to remove some
> branch instructions in this process.
>
> Signed-off-by: Song Liu <songliubraving@xxxxxx>
> ---

[...]

> void intel_pmu_auto_reload_read(struct perf_event *event);
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index fe156a8170aa3..4fe11f4f896b1 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -57,6 +57,7 @@ struct perf_guest_info_callbacks {
> #include <linux/cgroup.h>
> #include <linux/refcount.h>
> #include <linux/security.h>
> +#include <linux/static_call.h>
> #include <asm/local.h>
>
> struct perf_callchain_entry {
> @@ -1612,4 +1613,26 @@ extern void __weak arch_perf_update_userpage(struct perf_event *event,
> extern __weak u64 arch_perf_get_page_size(struct mm_struct *mm, unsigned long addr);
> #endif
>
> +/*
> + * Snapshot branch stack on software events.
> + *
> + * Branch stack can be very useful in understanding software events. For
> + * example, when a long function, e.g. sys_perf_event_open, returns an
> + * errno, it is not obvious why the function failed. Branch stack could
> + * provide very helpful information in this type of scenarios.
> + *
> + * On software event, it is necessary to stop the hardware branch recorder
> + * fast. Otherwise, the hardware register/buffer will be flushed with
> + * entries af the triggering event. Therefore, static call is used to
              ^^
              nit: af -> of

> + * stop the hardware recorder.
> + */
> +
> +/*
> + * cnt is the number of entries allocated for entries.
> + * Return number of entries copied to .
> + */

A bit out of scope, but LGTM.
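
For anyone reading along, the contract in the quoted comment (cnt is the
capacity of the buffer, the return value is the number of entries copied)
can be sketched in plain userspace C. This is only an illustration under
stated assumptions, not the code from the patch: struct branch_entry,
snapshot_nop, fake_snapshot, fake_hw and take_snapshot are hypothetical
names, and an ordinary function pointer stands in for the static call,
which in the kernel would avoid the indirect branch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's branch record type. */
struct branch_entry {
	uint64_t from;
	uint64_t to;
};

/*
 * Shape described in the quoted comment: cnt is the number of slots in
 * entries, the return value is the number of entries copied.
 */
typedef int (*snapshot_fn_t)(struct branch_entry *entries, unsigned int cnt);

/* Default target when no backend is registered: copies nothing. */
static int snapshot_nop(struct branch_entry *entries, unsigned int cnt)
{
	(void)entries;
	(void)cnt;
	return 0;
}

/* Assumption: a plain function pointer in place of a static call. */
static snapshot_fn_t snapshot_branch_stack = snapshot_nop;

/* Hypothetical backend that pretends the hardware holds three records. */
static const struct branch_entry fake_hw[] = {
	{ 0x1000, 0x2000 },
	{ 0x2010, 0x3000 },
	{ 0x3020, 0x1008 },
};

static int fake_snapshot(struct branch_entry *entries, unsigned int cnt)
{
	unsigned int n = sizeof(fake_hw) / sizeof(fake_hw[0]);
	unsigned int i;

	assert(entries != NULL);

	/* Never copy more than the caller allocated. */
	if (cnt < n)
		n = cnt;
	for (i = 0; i < n; i++)
		entries[i] = fake_hw[i];
	return (int)n;
}

/* Caller side: a software event handler would snapshot as early as it can. */
static int take_snapshot(struct branch_entry *buf, unsigned int cnt)
{
	return snapshot_branch_stack(buf, cnt);
}
```

With the default target, take_snapshot() returns 0. After pointing
snapshot_branch_stack at fake_snapshot, a two-slot buffer gets two entries
and an eight-slot buffer gets all three.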

Acked-by: John Fastabend <john.fastabend@xxxxxxxxx>