RE: [PATCH v3] perf: arm_cspmu: Separate Arm and vendor module

From: Besar Wicaksono
Date: Mon May 08 2023 - 13:05:55 EST


Hi Robin,

> -----Original Message-----
> From: Robin Murphy <robin.murphy@xxxxxxx>
> Sent: Friday, May 5, 2023 12:24 PM
> To: Besar Wicaksono <bwicaksono@xxxxxxxxxx>; suzuki.poulose@xxxxxxx;
> catalin.marinas@xxxxxxx; will@xxxxxxxxxx; mark.rutland@xxxxxxx
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> tegra@xxxxxxxxxxxxxxx; Thierry Reding <treding@xxxxxxxxxx>; Jonathan
> Hunter <jonathanh@xxxxxxxxxx>; Vikram Sethi <vsethi@xxxxxxxxxx>; Richard
> Wiley <rwiley@xxxxxxxxxx>; Eric Funsten <efunsten@xxxxxxxxxx>
> Subject: Re: [PATCH v3] perf: arm_cspmu: Separate Arm and vendor module
>
> External email: Use caution opening links or attachments
>
>
> On 2023-05-05 01:59, Besar Wicaksono wrote:
> > Arm Coresight PMU driver consists of main standard code and vendor
> > backend code. Both are currently built as a single module.
> > This patch adds vendor registration API to separate the two to
> > keep things modular. Main driver maintains a list of backend module
> > and will request it when probing the device.
> >
> > Signed-off-by: Besar Wicaksono <bwicaksono@xxxxxxxxxx>
> > ---
> >
> > Changes from v2:
> > * Move sysfs_event/format_show definition to arm_cspmu.h and move
> impl
> > registration API definition to a separate module so main driver and vendor
> > module are independent.
> > * The registration API now just sets the impl_init_ops callback, no reprobe.
> > * Add PMIIDR table that maps to the vendor module name. During device
> probe,
> > main driver requests the vendor module if PMIIDR is matching.
> > * Keeping the name of the main driver module as arm_cspmu_module.
> > Thanks to Robin and Suzuki for the feedback.
> > v2: https://lore.kernel.org/linux-arm-kernel/20230418062030.45620-1-
> bwicaksono@xxxxxxxxxx/T/#u
> >
> > Changes from v1:
> > * Added separate Kconfig entry for nvidia backend
> > * Added lock to protect accesses to the lists
> > * Added support for matching subset devices from a vendor
> > * Added state tracking to avoid reprobe when a device is in use
> > Thanks to Suzuki for the feedback.
> > v1: https://lore.kernel.org/linux-arm-kernel/20230403163905.20354-1-
> bwicaksono@xxxxxxxxxx/T/#u
> >
> > ---
> > drivers/perf/arm_cspmu/Kconfig | 9 +-
> > drivers/perf/arm_cspmu/Makefile | 7 +-
> > drivers/perf/arm_cspmu/arm_cspmu.c | 86 ++++++------------
> > drivers/perf/arm_cspmu/arm_cspmu.h | 55 ++++++++++--
> > drivers/perf/arm_cspmu/arm_cspmu_impl.c | 114
> ++++++++++++++++++++++++
> > drivers/perf/arm_cspmu/nvidia_cspmu.c | 35 +++++++-
> > drivers/perf/arm_cspmu/nvidia_cspmu.h | 17 ----
> > 7 files changed, 235 insertions(+), 88 deletions(-)
> > create mode 100644 drivers/perf/arm_cspmu/arm_cspmu_impl.c
> > delete mode 100644 drivers/perf/arm_cspmu/nvidia_cspmu.h
> >
> > diff --git a/drivers/perf/arm_cspmu/Kconfig
> b/drivers/perf/arm_cspmu/Kconfig
> > index 0b316fe69a45..8ce7b45a0075 100644
> > --- a/drivers/perf/arm_cspmu/Kconfig
> > +++ b/drivers/perf/arm_cspmu/Kconfig
> > @@ -1,6 +1,6 @@
> > # SPDX-License-Identifier: GPL-2.0
> > #
> > -# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > +# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> >
> > config ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > tristate "ARM Coresight Architecture PMU"
> > @@ -11,3 +11,10 @@ config ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > based on ARM CoreSight PMU architecture. Note that this PMU
> > architecture does not have relationship with the ARM CoreSight
> > Self-Hosted Tracing.
> > +
> > +config NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > + tristate "NVIDIA Coresight Architecture PMU"
> > + depends on ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU
> > + help
> > + Provides NVIDIA specific attributes for performance monitoring unit
> > + (PMU) devices based on ARM CoreSight PMU architecture.
> > diff --git a/drivers/perf/arm_cspmu/Makefile
> b/drivers/perf/arm_cspmu/Makefile
> > index fedb17df982d..602ecb99dc57 100644
> > --- a/drivers/perf/arm_cspmu/Makefile
> > +++ b/drivers/perf/arm_cspmu/Makefile
> > @@ -1,6 +1,9 @@
> > -# Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > +# Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > #
> > # SPDX-License-Identifier: GPL-2.0
> >
> > +obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> arm_cspmu_impl.o
>
> Not sure what's up with this... I have no complaint with keeping the
> impl infrastucture together in its own source file, but it still wants
> to end up as part of arm_cspmu_module. Doing otherwise just adds
> unnecessary overhead at many levels and invites more problems.

My intention is to separate arm_cspmu_impl, arm_cspmu, and
vendor backend into different modules. Here is the dependency I have in mind:

arm_cspmu_impl
____|____
| |
arm_cspmu nvidia_cspmu

This helps during device probe that the call to request_module can be made
as a blocking call and the backend init_impl_ops will always be ready to use after
request_module returns. The code seems simpler this way. Could you please
elaborate the potential issue that might arise with this approach?

After reading your other comments on built-in kernel, can we use late_initcall
for arm_cspmu module to assume that the main driver will always be init'ed after
backend module ?

>
> > obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> arm_cspmu_module.o
> > -arm_cspmu_module-y := arm_cspmu.o nvidia_cspmu.o
> > +arm_cspmu_module-y := arm_cspmu.o
> > +
> > +obj-$(CONFIG_NVIDIA_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
> nvidia_cspmu.o
> > diff --git a/drivers/perf/arm_cspmu/arm_cspmu.c
> b/drivers/perf/arm_cspmu/arm_cspmu.c
> > index a3f1c410b417..04c318744f55 100644
> > --- a/drivers/perf/arm_cspmu/arm_cspmu.c
> > +++ b/drivers/perf/arm_cspmu/arm_cspmu.c
> > @@ -16,7 +16,7 @@
> > * The user should refer to the vendor technical documentation to get
> details
> > * about the supported events.
> > *
> > - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > + * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > *
> > */
> >
> > @@ -31,7 +31,6 @@
> > #include <acpi/processor.h>
> >
> > #include "arm_cspmu.h"
> > -#include "nvidia_cspmu.h"
> >
> > #define PMUNAME "arm_cspmu"
> > #define DRVNAME "arm-cs-arch-pmu"
> > @@ -117,9 +116,6 @@
> > */
> > #define HILOHI_MAX_POLL 1000
> >
> > -/* JEDEC-assigned JEP106 identification code */
> > -#define ARM_CSPMU_IMPL_ID_NVIDIA 0x36B
> > -
> > static unsigned long arm_cspmu_cpuhp_state;
> >
> > /*
> > @@ -186,16 +182,6 @@ static inline bool use_64b_counter_reg(const
> struct arm_cspmu *cspmu)
> > return (counter_size(cspmu) > 32);
> > }
> >
> > -ssize_t arm_cspmu_sysfs_event_show(struct device *dev,
> > - struct device_attribute *attr, char *buf)
> > -{
> > - struct dev_ext_attribute *eattr =
> > - container_of(attr, struct dev_ext_attribute, attr);
> > - return sysfs_emit(buf, "event=0x%llx\n",
> > - (unsigned long long)eattr->var);
> > -}
> > -EXPORT_SYMBOL_GPL(arm_cspmu_sysfs_event_show);
> > -
> > /* Default event list. */
> > static struct attribute *arm_cspmu_event_attrs[] = {
> > ARM_CSPMU_EVENT_ATTR(cycles,
> ARM_CSPMU_EVT_CYCLES_DEFAULT),
> > @@ -231,16 +217,6 @@ arm_cspmu_event_attr_is_visible(struct kobject
> *kobj,
> > return attr->mode;
> > }
> >
> > -ssize_t arm_cspmu_sysfs_format_show(struct device *dev,
> > - struct device_attribute *attr,
> > - char *buf)
> > -{
> > - struct dev_ext_attribute *eattr =
> > - container_of(attr, struct dev_ext_attribute, attr);
> > - return sysfs_emit(buf, "%s\n", (char *)eattr->var);
> > -}
> > -EXPORT_SYMBOL_GPL(arm_cspmu_sysfs_format_show);
> > -
>
> Is there a reason for moving these (other than bodging around issues
> caused by the Makefile mishap above)?
>

The main reason is to remove backend module (nvidia_cspmu)
dependency to main driver.

> (also, I'm now wondering why they're exported in the first place, since
> a backend module is hardly going to need to override the default
> implementations with the default implementations...)

My intention is to make the event and format attribute macro
on the header file to be reusable for the backend module. The event/format
attribute on the other PMUs is pretty generic, so I thought it would be
harmless.

>
> > static struct attribute *arm_cspmu_format_attrs[] = {
> > ARM_CSPMU_FORMAT_EVENT_ATTR,
> > ARM_CSPMU_FORMAT_FILTER_ATTR,
> > @@ -379,27 +355,12 @@ static struct attribute_group
> arm_cspmu_cpumask_attr_group = {
> > .attrs = arm_cspmu_cpumask_attrs,
> > };
> >
> > -struct impl_match {
> > - u32 pmiidr;
> > - u32 mask;
> > - int (*impl_init_ops)(struct arm_cspmu *cspmu);
> > -};
> > -
> > -static const struct impl_match impl_match[] = {
> > - {
> > - .pmiidr = ARM_CSPMU_IMPL_ID_NVIDIA,
> > - .mask = ARM_CSPMU_PMIIDR_IMPLEMENTER,
> > - .impl_init_ops = nv_cspmu_init_ops
> > - },
> > - {}
> > -};
> > -
> > static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu)
> > {
> > - int ret;
> > + int ret = 0;
> > struct acpi_apmt_node *apmt_node = cspmu->apmt_node;
> > struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops;
> > - const struct impl_match *match = impl_match;
> > + const struct arm_cspmu_impl_module *match;
> >
> > /*
> > * Get PMU implementer and product id from APMT node.
> > @@ -411,18 +372,21 @@ static int arm_cspmu_init_impl_ops(struct
> arm_cspmu *cspmu)
> > readl(cspmu->base0 + PMIIDR);
> >
> > /* Find implementer specific attribute ops. */
> > - for (; match->pmiidr; match++) {
> > - const u32 mask = match->mask;
> > -
> > - if ((match->pmiidr & mask) == (cspmu->impl.pmiidr & mask)) {
> > - ret = match->impl_init_ops(cspmu);
> > - if (ret)
> > - return ret;
> > -
> > - break;
> > + match = arm_cspmu_impl_match_module(cspmu->impl.pmiidr);
> > + if (match) {
> > + request_module(match->name);
>
> Are we confident this can't deadlock when it's already in the middle of
> loading the main module?
>

The backend module does not depend on the main driver module anymore
(please see my top comment). The blocking call to request_module should be
able to return.

> I'm mostly just paranoid thanks to my plentiful experience of
> inadvertently making driver probe routines crash, leaving some lock held
> in the module loader which typically ends up in having to hard-reset the
> machine :(
>
> > +
> > + if (match->param.module && match->param.impl_init_ops) {
>
> The module handles can be NULL if everything is built-in, so
> impl_init_ops is our only useful indicator here.
>

Thanks, I will update it on next version.

> > + if (try_module_get(match->param.module)) {
> > + cspmu->impl.module = match->param.module;
> > + ret = match->param.impl_init_ops(cspmu);
> > + }
>
> As mentioned previously, I think it might be appropriate to warn that
> we're falling back to the generic implementation when we've failed to
> find the ops for a match we *do* know about.
>

Sure, will do.

> > }
> > }
> >
> > + if (ret)
> > + return ret;
> > +
> > /* Use default callbacks if implementer doesn't provide one. */
> > CHECK_DEFAULT_IMPL_OPS(impl_ops, get_event_attrs);
> > CHECK_DEFAULT_IMPL_OPS(impl_ops, get_format_attrs);
> > @@ -484,11 +448,6 @@ arm_cspmu_alloc_attr_group(struct arm_cspmu
> *cspmu)
> > struct attribute_group **attr_groups = NULL;
> > struct device *dev = cspmu->dev;
> > const struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops;
> > - int ret;
> > -
> > - ret = arm_cspmu_init_impl_ops(cspmu);
> > - if (ret)
> > - return NULL;
> >
> > cspmu->identifier = impl_ops->get_identifier(cspmu);
> > cspmu->name = impl_ops->get_name(cspmu);
> > @@ -1203,11 +1162,21 @@ static int arm_cspmu_device_probe(struct
> platform_device *pdev)
> > if (ret)
> > return ret;
> >
> > + ret = arm_cspmu_init_impl_ops(cspmu);
> > + if (ret)
> > + goto module_put;
> > +
> > ret = arm_cspmu_register_pmu(cspmu);
> > if (ret)
> > - return ret;
> > + goto module_put;
> >
> > return 0;
> > +
> > +module_put:
> > + if (cspmu->impl.module)
>
> No need to check, module_put(NULL) is valid.
>

Got it, will make the change on next version.

> > + module_put(cspmu->impl.module);
> > +
> > + return ret;
> > }
> >
> > static int arm_cspmu_device_remove(struct platform_device *pdev)
> > @@ -1217,6 +1186,9 @@ static int arm_cspmu_device_remove(struct
> platform_device *pdev)
> > perf_pmu_unregister(&cspmu->pmu);
> > cpuhp_state_remove_instance(arm_cspmu_cpuhp_state, &cspmu-
> >cpuhp_node);
> >
> > + if (cspmu->impl.module)
>
> Ditto.
>

Got it.

> > + module_put(cspmu->impl.module);
> > +
> > return 0;
> > }
> >
> > diff --git a/drivers/perf/arm_cspmu/arm_cspmu.h
> b/drivers/perf/arm_cspmu/arm_cspmu.h
> > index 51323b175a4a..0b60165f770a 100644
> > --- a/drivers/perf/arm_cspmu/arm_cspmu.h
> > +++ b/drivers/perf/arm_cspmu/arm_cspmu.h
> > @@ -1,7 +1,7 @@
> > /* SPDX-License-Identifier: GPL-2.0
> > *
> > * ARM CoreSight Architecture PMU driver.
> > - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > + * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > *
> > */
> >
> > @@ -68,8 +68,13 @@
> >
> > /* PMIIDR register field */
> > #define ARM_CSPMU_PMIIDR_IMPLEMENTER GENMASK(11, 0)
> > +#define ARM_CSPMU_PMIIDR_REVISION GENMASK(15, 12)
> > +#define ARM_CSPMU_PMIIDR_VARIANT GENMASK(19, 16)
> > #define ARM_CSPMU_PMIIDR_PRODUCTID GENMASK(31, 20)
> >
> > +/* JEDEC-assigned JEP106 identification code */
> > +#define ARM_CSPMU_IMPL_ID_NVIDIA 0x36B
> > +
> > struct arm_cspmu;
> >
> > /* This tracks the events assigned to each counter in the PMU. */
> > @@ -107,10 +112,28 @@ struct arm_cspmu_impl_ops {
> > struct attribute *attr, int unused);
> > };
> >
> > +/* Vendor/implementer registration parameter. */
> > +struct arm_cspmu_impl_param {
> > + /* Backend module. */
> > + struct module *module;
> > + /* PMIIDR value/mask. */
> > + u32 pmiidr_val;
> > + u32 pmiidr_mask;
> > + /* Callback to vendor backend to init arm_cspmu_impl::ops. */
> > + int (*impl_init_ops)(struct arm_cspmu *cspmu);
> > +};
> > +
> > +/* Vendor/implementer module. */
> > +struct arm_cspmu_impl_module {
> > + const char *name;
> > + struct arm_cspmu_impl_param param;
> > +};
>
> Nit: FWIW I'd just have a single flat structure and call it something
> like arm_cspmu_impl_match.
>

Sure.

> > +
> > /* Vendor/implementer descriptor. */
> > struct arm_cspmu_impl {
> > u32 pmiidr;
> > struct arm_cspmu_impl_ops ops;
> > + struct module *module;
> > void *ctx;
> > };
> >
> > @@ -139,13 +162,31 @@ struct arm_cspmu {
> > };
> >
> > /* Default function to show event attribute in sysfs. */
> > -ssize_t arm_cspmu_sysfs_event_show(struct device *dev,
> > - struct device_attribute *attr,
> > - char *buf);
> > +static inline ssize_t arm_cspmu_sysfs_event_show(struct device *dev,
> > + struct device_attribute *attr, char *buf)
> > +{
> > + struct dev_ext_attribute *eattr =
> > + container_of(attr, struct dev_ext_attribute, attr);
> > + return sysfs_emit(buf, "event=0x%llx\n",
> > + (unsigned long long)eattr->var);
> > +}
> >
> > /* Default function to show format attribute in sysfs. */
> > -ssize_t arm_cspmu_sysfs_format_show(struct device *dev,
> > - struct device_attribute *attr,
> > - char *buf);
> > +static inline ssize_t arm_cspmu_sysfs_format_show(struct device *dev,
> > + struct device_attribute *attr, char *buf)
> > +{
> > + struct dev_ext_attribute *eattr =
> > + container_of(attr, struct dev_ext_attribute, attr);
> > + return sysfs_emit(buf, "%s\n", (char *)eattr->var);
> > +}
> > +
> > +/* Register vendor backend. */
> > +int arm_cspmu_impl_register(const struct arm_cspmu_impl_param
> *impl_param);
> > +
> > +/* Unregister vendor backend. */
> > +void arm_cspmu_impl_unregister(const struct arm_cspmu_impl_param
> *impl_param);
> > +
> > +/* Get matching vendor module compatible with /p pmiidr. */
> > +const struct arm_cspmu_impl_module
> *arm_cspmu_impl_match_module(u32 pmiidr);
> >
> > #endif /* __ARM_CSPMU_H__ */
> > diff --git a/drivers/perf/arm_cspmu/arm_cspmu_impl.c
> b/drivers/perf/arm_cspmu/arm_cspmu_impl.c
> > new file mode 100644
> > index 000000000000..35e0f4c2410a
> > --- /dev/null
> > +++ b/drivers/perf/arm_cspmu/arm_cspmu_impl.c
> > @@ -0,0 +1,114 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Implementation specific backend registration.
> > + *
> > + * Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > + *
> > + */
> > +
> > +#include <linux/mutex.h>
> > +
> > +#include "arm_cspmu.h"
> > +
> > +static DEFINE_MUTEX(arm_cspmu_lock);
> > +
> > +static struct arm_cspmu_impl_module module_list[] = {
> > + {
> > + .name = "nvidia_cspmu",
> > + .param = {
> > + .pmiidr_val = ARM_CSPMU_IMPL_ID_NVIDIA,
> > + .pmiidr_mask = ARM_CSPMU_PMIIDR_IMPLEMENTER,
> > + },
> > + },
> > + {}
> > +};
> > +
> > +static struct arm_cspmu_impl_module *arm_cspmu_impl_find_module(
> > + const struct arm_cspmu_impl_param *impl_param)
> > +{
> > + struct arm_cspmu_impl_module *module;
> > +
> > + for (module = module_list; module->name; module++) {
> > + if (!strcmp(module->name, impl_param->module->name) &&
> > + module->param.pmiidr_val == impl_param->pmiidr_val &&
> > + module->param.pmiidr_mask == impl_param->pmiidr_mask)
> > + return module;
> > + }
> > +
> > + return NULL;
> > +}
>
> I don't think we want different logic for this - probe has to be able to
> get an unambiguous match from a PMIIDR value alone, therefore it must be
> sufficient for backend registration to do the same, e.g.
> arm_cspmu_impl_match_module(ARM_CSPMU_IMPL_ID_NVIDIA). Haveing
> different
> match conditions makes it all to easy to let subtle bugs in.
>

Ok, it's reasonable I think. One reason I also did the name check is to make sure
that it is the "correct" module. But I guess I can add a WARN_ON in case another
module register an entry that was already registered.

> > +
> > +const struct arm_cspmu_impl_module
> *arm_cspmu_impl_match_module(u32 pmiidr)
> > +{
> > + struct arm_cspmu_impl_module *module;
>
> Nit: just initialise to NULL...

Sure.

>
> > + mutex_lock(&arm_cspmu_lock);
> > +
> > + for (module = module_list; module->name; module++) {
> > + u32 mask = module->param.pmiidr_mask;
> > +
> > + if ((module->param.pmiidr_val & mask) == (pmiidr & mask)) {
>
> ...and break here.

Sure.

>
> > + mutex_unlock(&arm_cspmu_lock);
> > + return module;
> > + }
> > + }
> > +
> > + mutex_unlock(&arm_cspmu_lock);
>
> However, this locking doesn't work anyway - module_list and the
> pmiidr_{mask,val} values are static, so there is no need to protect the
> match lookup itself. The potential race which does exist is between
> arm_cspmu_impl_register() writing to module/impl_init_ops and
> arm_cspmu_init_impl_ops() reading them, which is not covered.
>

Ah yes, you are right, thanks for catching this.

> > +
> > + return NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(arm_cspmu_impl_match_module);
> > +
> > +int arm_cspmu_impl_register(const struct arm_cspmu_impl_param
> *impl_param)
>
> Ah, I see, this explains the structure split. I reckon it would be fine
> to either just pass the PMIIDR value, module handle and impl_init_ops as
> explicit arguments here, or a whole dummy match template with only those
> fields filled in.
>

Got it.

> > +{
> > + struct arm_cspmu_impl_module *module;
> > + int ret = 0;
> > +
> > + if (!impl_param->module || !impl_param->impl_init_ops) {
>
> Again, this module check prevents it from working with the whole driver
> + backend built-in.
>
> > + pr_err("arm_cspmu reg failed, invalid module or init_ops\n");
>
> If anything, probably just WARN_ON() impl_init_ops being NULL here,
> since that would mean the backend module is fundamentally broken.
> However it would effectvely already be covered if we had a more general
> user warning based on not finding the ops for a known match at probe, as
> suggested earlier. The corner case of loading a broken module and *not*
> trying to use it doesn't seem worth bothering about.
>

Sure, I will add the WARN_ON and warning on falling back to generic path.

> > + return -EINVAL;
> > + }
> > +
> > + mutex_lock(&arm_cspmu_lock);
> > +
> > + module = arm_cspmu_impl_find_module(impl_param);
> > + if (module) {
> > + module->param.module = impl_param->module;
> > + module->param.impl_init_ops = impl_param->impl_init_ops;
> > + } else {
>
> I do however think it might be worth also checking if a match is found
> but the ops are already set. That would mean our match table is broken
> and contains a non-unique PMIIDR match, which isn't going to be easily
> caught anywhere else, and the implications might be subtle and only show
> up for other users later.
>

Agree, this condition needs to be check.

> > + pr_err("arm_cspmu reg failed, unable to find pmiidr: 0x%x, mask:
> 0x%x, module: %s\n",
> > + impl_param->pmiidr_val,
> > + impl_param->pmiidr_mask,
> > + impl_param->module->name);
> > +
> > + ret = -EINVAL;
> > + }
> > +
> > +
> > + mutex_unlock(&arm_cspmu_lock);
> > +
> > + return ret;
> > +}
> > +EXPORT_SYMBOL_GPL(arm_cspmu_impl_register);
> > +
> > +void arm_cspmu_impl_unregister(const struct arm_cspmu_impl_param
> *impl_param)
> > +{
> > + struct arm_cspmu_impl_module *module;
> > +
> > + mutex_lock(&arm_cspmu_lock);
> > +
> > + module = arm_cspmu_impl_find_module(impl_param);
> > + if (module) {
>
> I think it's reasonable to have a usage model where unregister should
> only be called if register succeeded, and thus we can assume this lookup
> never fails. That certainly fits if the expectation is that
> register/unregister are tied to module_init/module_exit.
>

Yup, that is the expectation. It is still good to validate the module pointer right ?
Or do you think it will hide a bug, if any ?

> Thanks,
> Robin.
>
> > + module->param.module = NULL;
> > + module->param.impl_init_ops = NULL;
> > + } else
> > + pr_err("arm_cspmu unreg failed, unable to find pmiidr: 0x%x, mask:
> 0x%x, module: %s\n",
> > + impl_param->pmiidr_val,
> > + impl_param->pmiidr_mask,
> > + impl_param->module->name);
> > +
> > + mutex_unlock(&arm_cspmu_lock);
> > +}
> > +EXPORT_SYMBOL_GPL(arm_cspmu_impl_unregister);
> > +
> > +MODULE_LICENSE("GPL v2");
> > diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.c
> b/drivers/perf/arm_cspmu/nvidia_cspmu.c
> > index 72ef80caa3c8..7ac8f17de116 100644
> > --- a/drivers/perf/arm_cspmu/nvidia_cspmu.c
> > +++ b/drivers/perf/arm_cspmu/nvidia_cspmu.c
> > @@ -1,14 +1,15 @@
> > // SPDX-License-Identifier: GPL-2.0
> > /*
> > - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > + * Copyright (c) 2022-2023, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > *
> > */
> >
> > /* Support for NVIDIA specific attributes. */
> >
> > +#include <linux/module.h>
> > #include <linux/topology.h>
> >
> > -#include "nvidia_cspmu.h"
> > +#include "arm_cspmu.h"
> >
> > #define NV_PCIE_PORT_COUNT 10ULL
> > #define NV_PCIE_FILTER_ID_MASK
> GENMASK_ULL(NV_PCIE_PORT_COUNT - 1, 0)
> > @@ -351,7 +352,7 @@ static char *nv_cspmu_format_name(const struct
> arm_cspmu *cspmu,
> > return name;
> > }
> >
> > -int nv_cspmu_init_ops(struct arm_cspmu *cspmu)
> > +static int nv_cspmu_init_ops(struct arm_cspmu *cspmu)
> > {
> > u32 prodid;
> > struct nv_cspmu_ctx *ctx;
> > @@ -395,6 +396,32 @@ int nv_cspmu_init_ops(struct arm_cspmu
> *cspmu)
> >
> > return 0;
> > }
> > -EXPORT_SYMBOL_GPL(nv_cspmu_init_ops);
> > +
> > +/* Match all NVIDIA Coresight PMU devices */
> > +static const struct arm_cspmu_impl_param nv_cspmu_param = {
> > + .module = THIS_MODULE,
> > + .pmiidr_val = ARM_CSPMU_IMPL_ID_NVIDIA,
> > + .pmiidr_mask = ARM_CSPMU_PMIIDR_IMPLEMENTER,
> > + .impl_init_ops = nv_cspmu_init_ops
> > +};
> > +
> > +static int __init nvidia_cspmu_init(void)
> > +{
> > + int ret;
> > +
> > + ret = arm_cspmu_impl_register(&nv_cspmu_param);
> > + if (ret)
> > + pr_err("nvidia_cspmu backend registration error: %d\n", ret);
> > +
> > + return ret;
> > +}
> > +
> > +static void __exit nvidia_cspmu_exit(void)
> > +{
> > + arm_cspmu_impl_unregister(&nv_cspmu_param);
> > +}
> > +
> > +module_init(nvidia_cspmu_init);
> > +module_exit(nvidia_cspmu_exit);
> >
> > MODULE_LICENSE("GPL v2");
> > diff --git a/drivers/perf/arm_cspmu/nvidia_cspmu.h
> b/drivers/perf/arm_cspmu/nvidia_cspmu.h
> > deleted file mode 100644
> > index 71e18f0dc50b..000000000000
> > --- a/drivers/perf/arm_cspmu/nvidia_cspmu.h
> > +++ /dev/null
> > @@ -1,17 +0,0 @@
> > -/* SPDX-License-Identifier: GPL-2.0
> > - *
> > - * Copyright (c) 2022, NVIDIA CORPORATION & AFFILIATES. All rights
> reserved.
> > - *
> > - */
> > -
> > -/* Support for NVIDIA specific attributes. */
> > -
> > -#ifndef __NVIDIA_CSPMU_H__
> > -#define __NVIDIA_CSPMU_H__
> > -
> > -#include "arm_cspmu.h"
> > -
> > -/* Allocate NVIDIA descriptor. */
> > -int nv_cspmu_init_ops(struct arm_cspmu *cspmu);
> > -
> > -#endif /* __NVIDIA_CSPMU_H__ */
> >
> > base-commit: 145e5cddfe8b4bf607510b2dcf630d95f4db420f