Re: [RESEND PATCH V3 1/6] perf: Add branch stack extra

From: Peter Zijlstra
Date: Mon Oct 02 2023 - 17:38:25 EST


On Mon, Oct 02, 2023 at 03:19:04PM -0400, Liang, Kan wrote:

> >> Also, add a new branch sample type, PERF_SAMPLE_BRANCH_EVT_CNTRS, to
> >> indicate whether to include occurrences of events in branch info. The
> >> information will be stored in the extra space.
> >
> > This... why do we need two flags?
>
> Users may only collect the occurrences of some events in a group. The
> EVT_CNTRS flag is used to indicate those events. E.g.,
> perf record -e "{cpu/branch-instructions,branch_type=call/,
> cpu/branch-misses,branch_type=event/}"
>
> Only the occurrences of the branch-misses event are collected in LBR and
> finally dumped into the extra buffer.
>
> While the first flag, PERF_SAMPLE_BRANCH_EXTRA, only tells that the
> extra space is required.

Or have it implicit, I really don't see the point of having two bits
here.

> > Also, I can't find this in the SDM, how wide are these counter deltas?
> > ISTR they're saturating, but not how wide they are.
>
> Now, it's documented in the Intel® Architecture Instruction Set
> Extensions and Future Features, Chapter 8, 8.6 LBR ENHANCEMENTS. It
> should be moved to SDM later.
> https://cdrdv2.intel.com/v1/dl/getContent/671368
>
> Only 2 bits for each counter. Saturating at a value of 3.

Urgh, this ISE document is shite; it doesn't say how many
IA32_LBR_INFO.PMCx_CNT fields there are. I think your later patch says
4, right? And is this for arch LBR or the other thing?

(Also, what is IA32_LER_x_INFO ?)

This is then a grand total of 8 bits.

And we still have 31 spare bits in perf_branch_entry.

Why again do we need the extra u64 ?!?
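
To make the arithmetic concrete, here is a minimal userspace sketch, not
the patch's code. It assumes 4 counters of 2 bits each (the count of 4 is
my reading of the later patch, not something the ISE states), each
saturating at 3. The names pack_cntrs() and unpack_cntr() are made up for
illustration:

```c
#include <assert.h>
#include <stdint.h>

#define NR_CNTRS	4	/* assumption: four PMCx_CNT fields */
#define CNTR_BITS	2	/* 2 bits per counter */
#define CNTR_MAX	((1u << CNTR_BITS) - 1)	/* saturates at 3 */

/* Pack per-counter deltas into 8 bits, saturating each at CNTR_MAX. */
static uint8_t pack_cntrs(const unsigned int delta[NR_CNTRS])
{
	uint8_t packed = 0;
	int i;

	for (i = 0; i < NR_CNTRS; i++) {
		unsigned int v = delta[i] > CNTR_MAX ? CNTR_MAX : delta[i];

		packed |= (uint8_t)(v << (i * CNTR_BITS));
	}
	return packed;
}

/* Extract the 2-bit delta for counter idx from the packed byte. */
static unsigned int unpack_cntr(uint8_t packed, int idx)
{
	assert(idx >= 0 && idx < NR_CNTRS);
	return (packed >> (idx * CNTR_BITS)) & CNTR_MAX;
}
```

Under those assumptions everything fits in a single byte, which is well
within the spare bits mentioned above.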

More specifically, this interface is pretty crap -- suppose the next
generation of things feels that 2 bits ain't enough and goes and gives
us 4. Then what do we do?
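
A rough illustration of the problem, as a sketch only: the same raw bits
decode to different values depending on the field width, so a consumer
has to learn the width from somewhere. cntr_delta() is a hypothetical
helper, not an existing kernel or perf tool function:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: extract the delta for counter idx from raw,
 * given the per-counter field width in bits.
 */
static unsigned int cntr_delta(uint64_t raw, unsigned int width,
			       unsigned int idx)
{
	assert(width > 0 && width < 64 && (idx + 1) * width <= 64);
	return (unsigned int)((raw >> (idx * width)) & ((1ULL << width) - 1));
}
```

With raw = 0xF4, counter 1 reads as 1 when the width is 2 but as 15 when
the width is 4, so a fixed 2-bit layout in the ABI cannot simply be
reinterpreted later.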

Did I already say that the ISE document raises more questions than it
provides answers?