RE: [PATCH v6 1/4] perf: starfive: Add StarLink PMU support

From: JiSheng Teoh
Date: Fri Feb 23 2024 - 20:27:36 EST


> On Mon, Jan 29, 2024 at 05:51:38PM +0800, Ji Sheng Teoh wrote:
> > This patch adds support for StarFive's StarLink PMU (Performance
> > Monitor Unit). StarLink PMU integrates one or more CPU cores with a
> > shared L3 memory system. The PMU supports overflow interrupt, up to 16
> > programmable 64bit event counters, and an independent 64bit cycle
> > counter. StarLink PMU is accessed via MMIO.
>
> Since Palmer acked this (thanks!), I queued it locally but then ran into a few small issues with my build testing. Comments below.
>
> > diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
> > index 273d67ecf6d2..41278742ef88 100644
> > --- a/drivers/perf/Kconfig
> > +++ b/drivers/perf/Kconfig
> > @@ -86,6 +86,15 @@ config RISCV_PMU_SBI
> > full perf feature support i.e. counter overflow, privilege mode
> > filtering, counter configuration.
> >
> > +config STARFIVE_STARLINK_PMU
> > + depends on ARCH_STARFIVE
>
> Please can you add "|| COMPILE_TEST" to this dependency so that you get build coverage from other architectures?
>
Sure, will add it in the next revision.

> > + bool "StarFive StarLink PMU"
> > + help
> > + Provide support for StarLink Performance Monitor Unit.
> > + StarLink Performance Monitor Unit integrates one or more cores with
> > + an L3 memory system. The L3 cache events are added into perf event
> > + subsystem, allowing monitoring of various L3 cache perf events.
> > +
> > config ARM_PMU_ACPI
> > depends on ARM_PMU && ACPI
> > def_bool y
>
> [...]
>
> > diff --git a/drivers/perf/starfive_starlink_pmu.c b/drivers/perf/starfive_starlink_pmu.c
> > new file mode 100644
> > index 000000000000..2447ca09a471
> > --- /dev/null
> > +++ b/drivers/perf/starfive_starlink_pmu.c
> > @@ -0,0 +1,643 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * StarFive's StarLink PMU driver
> > + *
> > + * Copyright (C) 2023 StarFive Technology Co., Ltd.
> > + *
> > + * Author: Ji Sheng Teoh <jisheng.teoh@xxxxxxxxxxxxxxxx>
> > + *
> > + */
>
> [...]
>
> > +static void starlink_pmu_counter_start(struct perf_event *event,
> > + struct starlink_pmu *starlink_pmu)
> > +{
> > + struct hw_perf_event *hwc = &event->hw;
> > + int idx = event->hw.idx;
> > + u64 val;
> > +
> > + /*
> > + * Enable counter overflow interrupt[63:0],
> > + * which is mapped as follows:
> > + *
> > + * event counter 0 - Bit [0]
> > + * event counter 1 - Bit [1]
> > + * ...
> > + * cycle counter - Bit [63]
> > + */
> > + val = readq(starlink_pmu->pmu_base + STARLINK_PMU_INTERRUPT_ENABLE);
> > +
> > + if (hwc->config == STARLINK_CYCLES) {
> > + /*
> > + * Cycle count has its dedicated register, and it starts
> > + * counting as soon as STARLINK_PMU_GLOBAL_ENABLE is set.
> > + */
> > + val |= STARLINK_PMU_CYCLE_OVERFLOW_MASK;
> > + } else {
> > + writeq(event->hw.config, starlink_pmu->pmu_base +
> > + STARLINK_PMU_EVENT_SELECT + idx * sizeof(u64));
> > +
> > + val |= (1 << idx);
> > + }
>
> I think this needs to be a u64 on the right hand side, or just use the
> BIT_ULL() macro.
>
Ahh ok, will switch it to the BIT_ULL() macro.
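
For reference, a minimal userspace sketch of the intended change. It assumes a local stand-in for the kernel's BIT_ULL() macro, and the helper name starlink_irq_enable_bit() is made up for illustration; it is not part of the driver. The point is that the mask is built as a 64-bit value, so bits 32 to 63 (including the cycle counter at bit 63) can be set, which a plain int "1 << idx" cannot do.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT_ULL() macro. */
#define BIT_ULL(nr) (1ULL << (nr))

/*
 * Hypothetical helper: return val with the interrupt-enable bit for
 * counter idx set. The shift is done on an unsigned long long, so the
 * result is well defined for every idx in [0, 63].
 */
static inline uint64_t starlink_irq_enable_bit(uint64_t val, unsigned int idx)
{
	assert(idx < 64);	/* shifting by 64 or more is undefined */
	return val | BIT_ULL(idx);
}
```

In the driver itself this would presumably reduce to "val |= BIT_ULL(idx);" in place of "val |= (1 << idx);".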

> > +
> > + writeq(val, starlink_pmu->pmu_base + STARLINK_PMU_INTERRUPT_ENABLE);
> > +
> > + writeq(STARLINK_PMU_GLOBAL_ENABLE, starlink_pmu->pmu_base +
> > + STARLINK_PMU_CONTROL);
> > +}
>
> [...]
>
> > +static irqreturn_t starlink_pmu_handle_irq(int irq_num, void *data)
> > +{
> > + struct starlink_pmu *starlink_pmu = data;
> > + struct starlink_hw_events *hw_events =
> > + this_cpu_ptr(starlink_pmu->hw_events);
> > + bool handled = false;
> > + int idx;
> > + u64 overflow_status;
> > +
> > + for (idx = 0; idx < STARLINK_PMU_MAX_COUNTERS; idx++) {
> > + struct perf_event *event = hw_events->events[idx];
> > +
> > + if (!event)
> > + continue;
> > +
> > + overflow_status = readq(starlink_pmu->pmu_base +
> > + STARLINK_PMU_COUNTER_OVERFLOW_STATUS);
> > + if (!(overflow_status & BIT(idx)))
> > + continue;
> > +
> > + writeq(1 << idx, starlink_pmu->pmu_base +
> > + STARLINK_PMU_COUNTER_OVERFLOW_STATUS);
>
> Same shifting problem here.
>
Got it.
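
A rough sketch of the same fix on the status side, modelled in userspace. The struct and function names are hypothetical, and the write-1-to-clear behaviour is an assumption inferred from the quoted handler writing the counter's bit back to the status register; BIT_ULL() is again a local stand-in for the kernel macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT_ULL() macro. */
#define BIT_ULL(nr) (1ULL << (nr))

/* Toy model of the overflow status register (assumed write-1-to-clear). */
struct toy_status_reg {
	uint64_t bits;
};

/* Test the overflow bit for counter idx using a 64-bit mask. */
static bool toy_counter_overflowed(const struct toy_status_reg *reg,
				   unsigned int idx)
{
	assert(idx < 64);
	return (reg->bits & BIT_ULL(idx)) != 0;
}

/* Model a write of BIT_ULL(idx) to the register: only that bit is cleared. */
static void toy_ack_overflow(struct toy_status_reg *reg, unsigned int idx)
{
	assert(idx < 64);
	reg->bits &= ~BIT_ULL(idx);
}
```

With a 64-bit mask, acknowledging the cycle counter at bit 63 leaves the other pending bits untouched.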

> > +static int starlink_pmu_probe(struct platform_device *pdev)
> > +{
> > + struct starlink_pmu *starlink_pmu;
> > + struct starlink_hw_events *hw_events;
> > + struct resource *res;
> > + int cpuid, i, ret;
> > +
> > + starlink_pmu = devm_kzalloc(&pdev->dev, sizeof(*starlink_pmu), GFP_KERNEL);
> > + if (!starlink_pmu)
> > + return -ENOMEM;
> > +
> > + starlink_pmu->pmu_base =
> > + devm_platform_get_and_ioremap_resource(pdev, 0, &res);
> > + if (IS_ERR(starlink_pmu->pmu_base))
> > + return PTR_ERR(starlink_pmu->pmu_base);
> > +
> > + starlink_pmu->hw_events = alloc_percpu_gfp(struct starlink_hw_events,
> > + GFP_KERNEL);
> > + if (!starlink_pmu->hw_events) {
> > + dev_err(&pdev->dev, "Failed to allocate per-cpu PMU data\n");
> > + kfree(starlink_pmu);
>
> You shouldn't call kfree() on a device-managed object (i.e. allocated with devm_kzalloc()).
>
You are right, I will drop it.
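
To illustrate the ownership rule, here is a toy userspace model. It is not the kernel's devres implementation, and every name in it (toy_device, toy_devm_zalloc(), toy_device_release(), toy_probe()) is invented for this sketch. Memory handed out by the device-managed allocator is released by the device itself, so an error path that also frees it by hand would end up freeing it twice.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define TOY_MAX_MANAGED 8

/* Toy stand-in for struct device: it owns its managed allocations. */
struct toy_device {
	void *managed[TOY_MAX_MANAGED];
	int nr_managed;
	int nr_released;
};

/* Toy analogue of devm_kzalloc(): zeroed memory owned by the device. */
static void *toy_devm_zalloc(struct toy_device *dev, size_t size)
{
	void *p;

	if (dev->nr_managed >= TOY_MAX_MANAGED)
		return NULL;
	p = calloc(1, size);
	if (p)
		dev->managed[dev->nr_managed++] = p;
	return p;
}

/* Toy analogue of the release done on probe failure or unbind. */
static void toy_device_release(struct toy_device *dev)
{
	while (dev->nr_managed > 0) {
		free(dev->managed[--dev->nr_managed]);
		dev->nr_released++;
	}
	assert(dev->nr_managed == 0);
}

/* Hypothetical probe: a later step fails after the managed allocation. */
static int toy_probe(struct toy_device *dev, int later_step_fails)
{
	void *pmu = toy_devm_zalloc(dev, 64);

	if (!pmu)
		return -ENOMEM;
	if (later_step_fails)
		return -ENOMEM;	/* no free(pmu): the device still owns it */
	return 0;
}
```

In the driver, the equivalent is simply to return the error after dev_err() and let the device-managed cleanup handle starlink_pmu.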

Thanks for the review, Will.

JiSheng