Re: [PATCH V2 01/13] perf/core: Add new branch sample type for LBR TOS

From: Liang, Kan
Date: Tue Oct 22 2019 - 11:39:33 EST




On 10/22/2019 5:39 AM, Peter Zijlstra wrote:
On Mon, Oct 21, 2019 at 01:03:02PM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
From: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>

In LBR call stack mode, the depth of reconstructed LBR call stack limits
to the number of LBR registers. With LBR Top-of-Stack (TOS) information,
perf tool may stitch the stacks of two samples. The reconstructed LBR
call stack can break the HW limitation.

Add a new branch sample type to retrieve LBR TOS.

Only when the new branch sample type is set, the TOS information is
dumped into the PERF_SAMPLE_BRANCH_STACK output.
Perf tool should check the attr.branch_sample_type, and apply the
corresponding format for PERF_SAMPLE_BRANCH_STACK samples.
Otherwise, some user case may be broken. For example, users may parse a
perf.data, which include the new branch sample type, with an old version
perf tool (without the check). Users probably get incorrect information
without any warning.

Signed-off-by: Kan Liang <kan.liang@xxxxxxxxxxxxxxx>
---
include/linux/perf_event.h | 4 ++++
include/uapi/linux/perf_event.h | 10 +++++++++-
kernel/events/core.c | 10 ++++++++++
3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 61448c19a132..0cebc8ec44fa 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -972,6 +972,10 @@ struct perf_sample_data {
u64 stack_user_size;
u64 phys_addr;
+
+ /* PMU specific data */
+ u64 lbr_tos;
+
} ____cacheline_aligned;

Last time you put this in perf_branch_stack, that was a much better
place. Can't this work now?

It should still work.
I just thought perf_branch_stack is a generic structure for branches.
TOS is Intel specific for LBR call stack. Maybe it's better to move it out.
Also, I wanted to make it consistent as perf tool struct branch_stack.
But those should not be big deals.

I will move tos to perf_branch_stack in V3 as below.

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 0cebc8ec44fa..c8bf40e608b6 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -92,6 +92,7 @@ struct perf_raw_record {
/*
* branch stack layout:
* nr: number of taken branches stored in entries[]
+ * tos: Top-of-Stack (TOS) information. PMU specific data.
*
* Note that nr can vary from sample to sample
* branches (to, from) are stored from most recent
@@ -100,6 +101,7 @@ struct perf_raw_record {
*/
struct perf_branch_stack {
__u64 nr;
+ __u64 tos; /* PMU specific data */
struct perf_branch_entry entries[0];
};


Thanks,
Kan