Re: [PATCH V13 - RESEND 06/10] arm64/perf: Enable branch stack events via FEAT_BRBE

From: Anshuman Khandual
Date: Wed Jul 26 2023 - 02:26:26 EST



On 7/11/23 13:54, Anshuman Khandual wrote:
> +static void capture_brbe_flags(struct perf_branch_entry *entry, struct perf_event *event,
> + u64 brbinf)
> +{
> + if (branch_sample_type(event))
> + brbe_set_perf_entry_type(entry, brbinf);
> +
> + if (!branch_sample_no_cycles(event))
> + entry->cycles = brbe_get_cycles(brbinf);

Just revisiting the branch records cycle capture process in BRBE.

'cycles' field is a 16 bit field in struct perf_branch_entry.

struct perf_branch_entry {
__u64 from;
__u64 to;
__u64 mispred:1, /* target mispredicted */
predicted:1,/* target predicted */
in_tx:1, /* in transaction */
abort:1, /* transaction abort */
cycles:16, /* cycle count to last branch */
type:4, /* branch type */
spec:2, /* branch speculation info */
new_type:4, /* additional branch type */
priv:3, /* privilege level */
reserved:31;
};

static inline int brbe_get_cycles(u64 brbinf)
{
/*
* Captured cycle count is unknown and hence
* should not be passed on to the user space.
*/
if (brbinf & BRBINFx_EL1_CCU)
return 0;

return FIELD_GET(BRBINFx_EL1_CC_MASK, brbinf);
}

capture_brbe_flags()
{

.....
if (!branch_sample_no_cycles(event))
entry->cycles = brbe_get_cycles(brbinf);
.....

}

BRBINF<N>_EL1.CC[45:32] is a 14 bits field which gets assigned into
perf_branch_entry->cycles field which is 16 bits wide. Although the
cycle count representation here is mantissa and exponent based one.

CC bits[7:0] indicate the mantissa M
CC bits[13:8] indicate the exponent E

The actual cycle count is based on the following formula as the per
the spec [1], which needs to be derived from BRBINF<N>_EL1.CC field.

-------------------------------------------------------------------
The cycle count is expressed using the following function:

if IsZero(E) then UInt(M) else UInt('1':M:Zeros(UInt(E)-1))

If required, the cycle count is rounded to a multiple of 2(E-1) towards zero before being encoded.

A value of all ones in both the mantissa and exponent indicates the cycle count value exceeded the size of the cycle counter.
-------------------------------------------------------------------

Given that there is only 16 bits in perf_branch_entry structure for
cycles count, I assume that it needs to be derived in the user perf
tool (per the above calculation) before being interpreted ?

[1] https://developer.arm.com/documentation/ddi0601/2023-06/AArch64-Registers/BRBINF-n--EL1--Branch-Record-Buffer-Information-Register--n-?lang=en

- Anshuman