Re: [PATCH V4 4/7] perf/x86/intel: Support LBR event logging
From: Peter Zijlstra
Date: Thu Oct 19 2023 - 06:52:55 EST
On Wed, Oct 04, 2023 at 11:40:41AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
> +#define ARCH_LBR_EVENT_LOG_WIDTH 2
> +#define ARCH_LBR_EVENT_LOG_MASK 0x3
event log ?
> +static __always_inline void intel_pmu_update_lbr_event(u64 *lbr_events, int idx, int pos)
> +{
> +	u64 logs = *lbr_events >> (LBR_INFO_EVENTS_OFFSET +
> +				   idx * ARCH_LBR_EVENT_LOG_WIDTH);
> +
> +	logs &= ARCH_LBR_EVENT_LOG_MASK;
> +	*lbr_events |= logs << (pos * ARCH_LBR_EVENT_LOG_WIDTH);
> +}
> +
> +/*
> + * The enabled order may be different from the counter order.
> + * Update the lbr_events with the enabled order.
> + */
> +static void intel_pmu_lbr_event_reorder(struct cpu_hw_events *cpuc,
> +					struct perf_event *event)
> +{
> +	int i, j, pos = 0, enabled[X86_PMC_IDX_MAX];
> +	struct perf_event *leader, *sibling;
> +
> +	leader = event->group_leader;
> +	if (branch_sample_counters(leader))
> +		enabled[pos++] = leader->hw.idx;
> +
> +	for_each_sibling_event(sibling, leader) {
> +		if (!branch_sample_counters(sibling))
> +			continue;
> +		enabled[pos++] = sibling->hw.idx;
> +	}
Ok, so far so good: enabled[x] = y maps group order (x) to
hardware index (y).
Although I would perhaps name that order[] instead of enabled[].
> +
> +	if (!pos)
> +		return;
How would we ever get here if this is the case?
> +
> +	for (i = 0; i < cpuc->lbr_stack.nr; i++) {
> +		for (j = 0; j < pos; j++)
> +			intel_pmu_update_lbr_event(&cpuc->lbr_events[i], enabled[j], j);
But this confuses me... per that function it:
- extracts counter value for enabled[j] and,
- or's it into the same variable at j
But what if j is already taken by something else?
That is, suppose enabled[] = {3,2,1,0}, and lbr_events = 11 10 01 00
Then: for (j) intel_pmu_update_lbr_event(&lbr_event, enabled[j], j);
0: 3->0, 11 10 01 00 -> 11 10 01 11
1: 2->1, 11 10 01 11 -> 11 10 11 11
2: 1->2, 11 10 11 11 -> 11 11 11 11
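The trace above can be reproduced with a small user-space sketch. This is not kernel code: the helper names are made up for illustration, and it assumes, as the trace does, that the source and destination fields occupy the same bits (the LBR_INFO_EVENTS_OFFSET shift from the patch is taken as 0).

```c
#include <stdint.h>

#define LOG_WIDTH 2	/* mirrors ARCH_LBR_EVENT_LOG_WIDTH */
#define LOG_MASK  0x3	/* mirrors ARCH_LBR_EVENT_LOG_MASK */

/*
 * Hypothetical stand-in for the in-place update: read the 2-bit field
 * at idx and OR it into the field at pos of the same variable.
 * Assumption: no offset between source and destination fields.
 */
static void update_in_place(uint64_t *events, int idx, int pos)
{
	uint64_t logs = (*events >> (idx * LOG_WIDTH)) & LOG_MASK;

	*events |= logs << (pos * LOG_WIDTH);
}

/* Run the update for every j, like the inner loop of the patch. */
static uint64_t reorder_in_place(uint64_t events, const int *order, int nr)
{
	int j;

	for (j = 0; j < nr; j++)
		update_in_place(&events, order[j], j);

	return events;
}
```

Under those assumptions, order = {3,2,1,0} and events = 0xe4 (11 10 01 00) yield 0xff (11 11 11 11), whereas the intended result is 0x1b (00 01 10 11).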
> +
> +		/* Clear the original counter order */
> +		cpuc->lbr_events[i] &= ~LBR_INFO_EVENTS;
> +	}
> +}
Would not something like:
	src = lbr_events[i];
	dst = 0;
	for (j = 0; j < pos; j++) {
		cnt = (src >> (enabled[j] * 2)) & 3;
		dst |= cnt << (j * 2);
	}
	lbr_events[i] = dst;
be *FAR* clearer, and actually work?
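For reference, a stand-alone user-space sketch of the separate source/destination approach. The function name and signature are hypothetical, and it assumes 2-bit fields packed from bit 0, with order[j] holding the counter index for group slot j; any offset the real register layout needs is left out.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build the reordered value in a fresh variable,
 * so earlier writes can never clobber fields that are read later.
 * Assumption: 2-bit fields packed from bit 0, no offset applied.
 */
static uint64_t reorder_events(uint64_t src, const int *order, int nr)
{
	uint64_t dst = 0;
	int j;

	for (j = 0; j < nr; j++) {
		/* 2-bit count logged for the counter at order[j] */
		uint64_t cnt = (src >> (order[j] * 2)) & 0x3;

		/* place it at group position j */
		dst |= cnt << (j * 2);
	}

	return dst;
}
```

With order = {3,2,1,0} and src = 0xe4 (11 10 01 00) this returns 0x1b (00 01 10 11); with the identity order {0,1,2,3} the value comes back unchanged.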