Re: [PATCH V11 05/10] arm64/perf: Add branch stack support in ARMV8 PMU

From: Mark Rutland
Date: Tue Jun 06 2023 - 06:43:50 EST


On Tue, Jun 06, 2023 at 04:04:25PM +0530, Anshuman Khandual wrote:
> On 6/5/23 17:35, Mark Rutland wrote:
> > On Wed, May 31, 2023 at 09:34:23AM +0530, Anshuman Khandual wrote:
> >> static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
> >> @@ -1145,12 +1162,24 @@ static int armv8pmu_probe_pmu(struct arm_pmu *cpu_pmu)
> >>  	};
> >>  	int ret;
> >>
> >> +	ret = armv8pmu_private_alloc(cpu_pmu);
> >> +	if (ret)
> >> +		return ret;
> >> +
> >>  	ret = smp_call_function_any(&cpu_pmu->supported_cpus,
> >>  				    __armv8pmu_probe_pmu,
> >>  				    &probe, 1);
> >>  	if (ret)
> >>  		return ret;
> >>
> >> +	if (arm_pmu_branch_stack_supported(cpu_pmu)) {
> >> +		ret = branch_records_alloc(cpu_pmu);
> >> +		if (ret)
> >> +			return ret;
> >> +	} else {
> >> +		armv8pmu_private_free(cpu_pmu);
> >> +	}
> >
> > I see from the next patch that "private" is four ints, so please just add that
> > to struct arm_pmu under an ifdef CONFIG_ARM64_BRBE. That'll simplify this, and
> > if we end up needing more space in future we can consider factoring it out.
>
> struct arm_pmu {
> 	........................................
> 	/* Implementation specific attributes */
> 	void *private;
> };
>
> The private pointer here creates an abstraction that lets a given PMU
> implementation hide its attribute details from the core arm_pmu layer.
> Although adding an ifdef CONFIG_ARM64_BRBE solves the problem mentioned
> above, it does break that abstraction. Currently the arm_pmu layer is aware
> of 'branch records' but not of BRBE in particular, which the driver
> adds later on. I suggest we should not break that abstraction.

I understand the rationale, but I think it's simpler for now to break that
abstraction. We can always refactor it later.
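
As a rough userspace sketch of what embedding the fields could look like:
the thread only says "private" is four ints, so the array name brbe_attr,
the cut-down struct arm_pmu_sketch and the helper below are all
hypothetical, and CONFIG_ARM64_BRBE is defined by hand here rather than
coming from Kconfig.

```c
#include <assert.h>
#include <stddef.h>

/* In the kernel this would come from Kconfig; defined by hand for the sketch. */
#define CONFIG_ARM64_BRBE 1

/* Cut-down stand-in for struct arm_pmu; this is NOT the real layout. */
struct arm_pmu_sketch {
	const char *name;
	int has_branch_stack;
#ifdef CONFIG_ARM64_BRBE
	/* Hypothetical name for the four ints currently held in "private". */
	int brbe_attr[4];
#endif
};

/*
 * Hypothetical probe-time helper: copies the four values straight into the
 * PMU structure, so there is nothing to allocate and nothing to free.
 */
static inline void arm_pmu_sketch_set_brbe(struct arm_pmu_sketch *pmu,
					   const int vals[4])
{
	assert(pmu != NULL);
#ifdef CONFIG_ARM64_BRBE
	for (size_t i = 0; i < 4; i++)
		pmu->brbe_attr[i] = vals[i];
	pmu->has_branch_stack = 1;
#else
	(void)vals;
	pmu->has_branch_stack = 0;
#endif
}
```

With the storage embedded, the alloc before the cross-call and the free in
the else branch of the quoted hunk would have no counterpart in this sketch.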

> Instead, a global 'static struct brbe_hw_attr' in drivers/perf/arm_brbe.c
> can be initialized into arm_pmu->private during armv8pmu_branch_probe(),
> which will also solve the alloc/free problem.

IIUC that's not going to work for big.LITTLE systems where the BRBE support
varies, as we need this data per arm_pmu.
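
A toy userspace sketch of that concern, assuming the two clusters would
report different attributes: the struct names, the nr_records field and the
values 64 and 32 are all made up for illustration and are not taken from the
series.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in; the real struct brbe_hw_attr is defined by the series. */
struct brbe_hw_attr_sketch {
	int nr_records;		/* hypothetical field */
};

/* Variant 1: every PMU points at one file-scope instance. */
struct arm_pmu_shared {
	const char *name;
	void *private;
};

static struct brbe_hw_attr_sketch global_attr;

static inline void probe_shared(struct arm_pmu_shared *pmu, int nr_records)
{
	global_attr.nr_records = nr_records;	/* clobbers any earlier probe */
	pmu->private = &global_attr;
}

static inline int shared_nr_records(const struct arm_pmu_shared *pmu)
{
	const struct brbe_hw_attr_sketch *attr = pmu->private;

	return attr->nr_records;
}

/* Variant 2: each PMU carries its own copy of the attributes. */
struct arm_pmu_embedded {
	const char *name;
	struct brbe_hw_attr_sketch brbe;
};

static inline void probe_embedded(struct arm_pmu_embedded *pmu, int nr_records)
{
	pmu->brbe.nr_records = nr_records;
}
```

Probing a "big" PMU with 64 and then a "LITTLE" PMU with 32 leaves both
shared PMUs reading 32, while the embedded variant keeps 64 and 32 apart.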

> Also, similar helpers armv8pmu_task_ctx_alloc()/free() could be defined to
> manage the task context cache, i.e. arm_pmu->pmu.task_ctx_cache, independently.
>
> But now armv8pmu_task_ctx_alloc() can be called after the PMU probe confirms
> arm_pmu->has_branch_stack.

I think those are different, and should be kept.

Thanks,
Mark.