RE: [PATCH v2] perf: arm_cspmu: Separate Arm and vendor module

From: Besar Wicaksono
Date: Wed Apr 26 2023 - 20:45:20 EST


Hi Suzuki,

> -----Original Message-----
> From: Besar Wicaksono <bwicaksono@xxxxxxxxxx>
> Sent: Wednesday, April 19, 2023 7:22 PM
> To: Suzuki K Poulose <suzuki.poulose@xxxxxxx>; catalin.marinas@xxxxxxx;
> will@xxxxxxxxxx; mark.rutland@xxxxxxx
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> tegra@xxxxxxxxxxxxxxx; Thierry Reding <treding@xxxxxxxxxx>; Jonathan
> Hunter <jonathanh@xxxxxxxxxx>; Vikram Sethi <vsethi@xxxxxxxxxx>; Richard
> Wiley <rwiley@xxxxxxxxxx>; Eric Funsten <efunsten@xxxxxxxxxx>
> Subject: RE: [PATCH v2] perf: arm_cspmu: Separate Arm and vendor module
>
> Hi Suzuki,
>
> > -----Original Message-----
> > From: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> > Sent: Wednesday, April 19, 2023 4:32 PM
> > To: Besar Wicaksono <bwicaksono@xxxxxxxxxx>; catalin.marinas@xxxxxxx;
> > will@xxxxxxxxxx; mark.rutland@xxxxxxx
> > Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> > tegra@xxxxxxxxxxxxxxx; Thierry Reding <treding@xxxxxxxxxx>; Jonathan
> > Hunter <jonathanh@xxxxxxxxxx>; Vikram Sethi <vsethi@xxxxxxxxxx>;
> Richard
> > Wiley <rwiley@xxxxxxxxxx>; Eric Funsten <efunsten@xxxxxxxxxx>
> > Subject: Re: [PATCH v2] perf: arm_cspmu: Separate Arm and vendor module
> >
> > External email: Use caution opening links or attachments
> >
> >
> > On 18/04/2023 20:33, Besar Wicaksono wrote:
> > > Hi Suzuki,
> > >
> > >> -----Original Message-----
> > >> From: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> > >> Sent: Tuesday, April 18, 2023 6:07 AM
> > >> To: Besar Wicaksono <bwicaksono@xxxxxxxxxx>;
> > catalin.marinas@xxxxxxx;
> > >> will@xxxxxxxxxx; mark.rutland@xxxxxxx
> > >> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx;
> > linux-
> > >> tegra@xxxxxxxxxxxxxxx; Thierry Reding <treding@xxxxxxxxxx>; Jonathan
> > >> Hunter <jonathanh@xxxxxxxxxx>; Vikram Sethi <vsethi@xxxxxxxxxx>;
> > Richard
> > >> Wiley <rwiley@xxxxxxxxxx>; Eric Funsten <efunsten@xxxxxxxxxx>
> > >> Subject: Re: [PATCH v2] perf: arm_cspmu: Separate Arm and vendor
> > module
> > >>
> > >> External email: Use caution opening links or attachments
> > >>
> > >>
> > >> On 18/04/2023 07:20, Besar Wicaksono wrote:
> > >>> Arm Coresight PMU driver consists of main standard code and vendor
> > >>> backend code. Both are currently built as a single module.
> > >>> This patch adds vendor registration API to separate the two to
> > >>> keep things modular. Vendor module shall register to the main
> > >>> module on loading and trigger device reprobe.
> > >>>
> > >>> Signed-off-by: Besar Wicaksono <bwicaksono@xxxxxxxxxx>
> > >>> ---
> > >>>
> > >>> Changes from v1:
> > >>> * Added separate Kconfig entry for nvidia backend
> > >>> * Added lock to protect accesses to the lists
> > >>> * Added support for matching subset devices from a vendor
> > >>> * Added state tracking to avoid reprobe when a device is in use
> > >>> v1: ttps://lore.kernel.org/linux-arm-kernel/20230403163905.20354-1-
> > >> bwicaksono@xxxxxxxxxx/T/#u
> > >>>
> > >>> ---
> > >>> drivers/perf/arm_cspmu/Kconfig | 9 +-
> > >>> drivers/perf/arm_cspmu/Makefile | 6 +-
> > >>> drivers/perf/arm_cspmu/arm_cspmu.c | 280
> > >> +++++++++++++++++++++++---
> > >>> drivers/perf/arm_cspmu/arm_cspmu.h | 32 ++-
> > >>> drivers/perf/arm_cspmu/nvidia_cspmu.c | 39 +++-
> > >>> drivers/perf/arm_cspmu/nvidia_cspmu.h | 17 --
> > >>> 6 files changed, 325 insertions(+), 58 deletions(-)
> > >>> delete mode 100644 drivers/perf/arm_cspmu/nvidia_cspmu.h
> > >>>
> > >>> diff --git a/drivers/perf/arm_cspmu/Kconfig
> > >> b/drivers/perf/arm_cspmu/Kconfig
> > >>> index 0b316fe69a45..8ce7b45a0075 100644
> > >>> --- a/drivers/perf/arm_cspmu/Kconfig
> > >>> +++ b/drivers/perf/arm_cspmu/Kconfig
> > >>> @@ -1,6 +1,6 @@
> > >>> # SPDX-License-Identifier: GPL-2.0
> > >>> #
> > >>> -# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> > >> reserved.
> > >>> +# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All
> > rights
> > >> reserved.
> > >>>
> > >>> config ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > >>> tristate "ARM Coresight Architecture PMU"
> > >>> @@ -11,3 +11,10 @@ config
> > ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > >>> based on ARM CoreSight PMU architecture. Note that this PMU
> > >>> architecture does not have relationship with the ARM CoreSight
> > >>> Self-Hosted Tracing.
> > >>> +
> > >>> +config NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > >>> + tristate "NVIDIA Coresight Architecture PMU"
> > >>> + depends on ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > >>> + help
> > >>> + Provides NVIDIA specific attributes for performance monitoring unit
> > >>> + (PMU) devices based on ARM CoreSight PMU architecture.
> > >>> diff --git a/drivers/perf/arm_cspmu/Makefile
> > >> b/drivers/perf/arm_cspmu/Makefile
> > >>> index fedb17df982d..f8ae22411d59 100644
> > >>> --- a/drivers/perf/arm_cspmu/Makefile
> > >>> +++ b/drivers/perf/arm_cspmu/Makefile
> > >>> @@ -1,6 +1,6 @@
> > >>> -# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> > >> reserved.
> > >>> +# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All
> > rights
> > >> reserved.
> > >>> #
> > >>> # SPDX-License-Identifier: GPL-2.0
> > >>>
> > >>> -obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> > >> arm_cspmu_module.o
> > >>> -arm_cspmu_module-y := arm_cspmu.o nvidia_cspmu.o
> > >>> +obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> > >> arm_cspmu.o
> > >>> +obj-$(CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> > >> nvidia_cspmu.o
> > >>> diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c
> > >> b/drivers/perf/arm_cspmu/arm_cspmu.c
> > >>> index e31302ab7e37..c55ea2b74454 100644
> > >>> --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> > >>> +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> > >>> @@ -16,7 +16,7 @@
> > >>> * The user should refer to the vendor technical documentation to get
> > >> details
> > >>> * about the supported events.
> > >>> *
> > >>> - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> > >> reserved.
> > >>> + * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All
> > rights
> > >> reserved.
> > >>> *
> > >>> */
> > >>>
> > >>> @@ -25,13 +25,14 @@
> > >>> #include <linux/ctype.h>
> > >>> #include <linux/interrupt.h>
> > >>> #include <linux/io-64-nonatomic-lo-hi.h>
> > >>> +#include <linux/list.h>
> > >>> #include <linux/module.h>
> > >>> +#include <linux/mutex.h>
> > >>> #include <linux/perf_event.h>
> > >>> #include <linux/platform_device.h>
> > >>> #include <acpi/processor.h>
> > >>>
> > >>> #include "arm_cspmu.h"
> > >>> -#include "nvidia_cspmu.h"
> > >>>
> > >>> #define PMUNAME "arm_cspmu"
> > >>> #define DRVNAME "arm-cs-arch-pmu"
> > >>> @@ -117,11 +118,52 @@
> > >>> */
> > >>> #define HILOHI_MAX_POLL 1000
> > >>>
> > >>> -/* JEDEC-assigned JEP106 identification code */
> > >>> -#define ARM_CSPMU_IMPL_ID_NVIDIA 0x36B
> > >>> -
> > >>> static unsigned long arm_cspmu_cpuhp_state;
> > >>>
> > >>> +/* List of Coresight PMU instances in the system. */
> > >>> +static LIST_HEAD(arm_cspmus);
> > >>> +
> > >>> +/* List of registered vendor backends. */
> > >>> +static LIST_HEAD(arm_cspmu_impls);
> > >>> +
> > >>> +static DEFINE_MUTEX(arm_cspmu_lock);
> > >>> +
> > >>> +/*
> > >>> + * State of the generic driver.
> > >>> + * 0 => registering backend.
> > >>> + * 1 => ready to use.
> > >>> + * 2 or more => in use.
> > >>> + */
> > >>> +#define ARM_CSPMU_STATE_REG 0
> > >>> +#define ARM_CSPMU_STATE_READY 1
> > >>> +static atomic_t arm_cspmu_state;
> > >>> +
> > >>> +static void arm_cspmu_state_ready(void)
> > >>> +{
> > >>> + atomic_set(&arm_cspmu_state, ARM_CSPMU_STATE_READY);
> > >>> +}
> > >>> +
> > >>> +static bool try_arm_cspmu_state_reg(void)
> > >>> +{
> > >>> + const int old = ARM_CSPMU_STATE_READY;
> > >>> + const int new = ARM_CSPMU_STATE_REG;
> > >>> +
> > >>> + return atomic_cmpxchg(&arm_cspmu_state, old, new) == old;
> > >>> +}
> > >>> +
> > >>> +static bool try_arm_cspmu_state_get(void)
> > >>> +{
> > >>> + return atomic_inc_not_zero(&arm_cspmu_state);
> > >>> +}
> > >>> +
> > >>> +static void arm_cspmu_state_put(void)
> > >>> +{
> > >>> + int ret;
> > >>> +
> > >>> + ret = atomic_dec_if_positive(&arm_cspmu_state);
> > >>> + WARN_ON(ret < 0);
> > >>> +}
> > >>> +
> > >>
> > >> As long as the vendor module is set for the PMU instance, it won't be
> > >> unloaded as long as there are any perf events and thus the specific
> > >> driver cannot be unloaded. So, you don't need explicit refcount
> > >> maintenance for each pmu callbacks.
> > >>
> > >
> > > The arm_cspmu_state mainly protects during new backend registration
> > when
> > > the device is attached to standard implementation. All devices are attached
> > to
> > > standard implementation initially when arm_cspmu module is loaded,
> since
> > there
> > > is no backend yet. On backend registration, the standard impl is replaced by
> > > backend impl. However, the module unloading mechanism doesn't provide
> > > protection because standard impl is owned by arm_cspmu module, which
> > > is not unloaded during registration.
> > >
> > > The refcount may not be required if the devices are not attached to
> standard
> > > Implementation by default, and the unreg doesn't fallback to it. But that
> > makes
> > > the devices usable only when there is a vendor backend available.
> >
> > Ok, thanks for the explanation. But I still think we :
> >
> > - Don't need a single global refcount for all the PMUs. Instead this
> > could be per PMU instance (arm_cspmu), only when it is backed by
>
> Ok, we can add refcount per PMU.
>
> > "generic" backend, others get module refcount. If the refcount of
> > "generic" PMU is positive, during the registration of a matching
> > backend driver, we could simply mark that as pending reprobe.
> >
> > - And only do the refcount for the following call backs:
> >
> > pmu:: event_init -> hold the refcount
> > pmu:: destroy -> drop the refcount and trigger a reprobe if one was
> > pending (see above)
>
> Is it safe to reprobe on destroy ? The reprobe will free and reallocate
> the memory managed by the device.

I checked the _free_event function from kernel/events/core.c. There are other
cleanups after the event->destroy is called, which may touch a stale data after a
reprobe occurs.
This would suggest reprobing right after event->destroy may not be safe, whether
it's a deferred reprobe or immediate reprobe from arm_cspmu_impl_register.

The alternative of not attaching the device to standard implementation by default
seems to be the better one so far.

Do you have other suggestion ?

>
> Thanks,
> Besar