Re: [PATCH v3] perf: arm_cspmu: Separate Arm and vendor module

From: Robin Murphy
Date: Thu Jun 01 2023 - 10:34:54 EST


On 2023-05-08 18:04, Besar Wicaksono wrote:
[...]
+obj-$(CONFIG_ARM_CORESIGHT_PMU_ARCH_SYSTEM_PMU) +=
arm_cspmu_impl.o

Not sure what's up with this... I have no complaint with keeping the
impl infrastucture together in its own source file, but it still wants
to end up as part of arm_cspmu_module. Doing otherwise just adds
unnecessary overhead at many levels and invites more problems.

My intention is to separate arm_cspmu_impl, arm_cspmu, and
vendor backend into different modules. Here is the dependency I have in mind:

arm_cspmu_impl
____|____
| |
arm_cspmu nvidia_cspmu

This helps during device probe that the call to request_module can be made
as a blocking call and the backend init_impl_ops will always be ready to use after
request_module returns. The code seems simpler this way. Could you please
elaborate the potential issue that might arise with this approach?

I see the intent; the main issue is that the implementation of it is needlessly fiddly: arm_cspmu_impl is not useful on its own, and probably only represents a few hundred bytes of code, so putting in a distinct .ko which needs to be loaded separately is a relatively massive waste of filesystem space and memory for what it is. Also if anything that dependency is the wrong way round anyway - arm_cspmu could provide generic PMU functionality just fine regardless of arm_cspmu_impl, but arm_cspmu_impl definitely has a logical and functional dependency on arm_cspmu in order to serve any user-visible purpose.

After reading your other comments on built-in kernel, can we use late_initcall
for arm_cspmu module to assume that the main driver will always be init'ed after
backend module ?

Either that or device_initcall_sync should probably be OK.

[...]
-ssize_t arm_cspmu_sysfs_format_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
-{
- struct dev_ext_attribute *eattr =
- container_of(attr, struct dev_ext_attribute, attr);
- return sysfs_emit(buf, "%s\n", (char *)eattr->var);
-}
-EXPORT_SYMBOL_GPL(arm_cspmu_sysfs_format_show);
-

Is there a reason for moving these (other than bodging around issues
caused by the Makefile mishap above)?


The main reason is to remove backend module (nvidia_cspmu)
dependency to main driver.

But it does logically and functionally depend on the main driver, so that still sounds wrong :/

(also, I'm now wondering why they're exported in the first place, since
a backend module is hardly going to need to override the default
implementations with the default implementations...)

My intention is to make the event and format attribute macro
on the header file to be reusable for the backend module. The event/format
attribute on the other PMUs is pretty generic, so I thought it would be
harmless.

Sorry for the confusion, this one's on me - for some reason I started thinking these were used as impl_ops callbacks, since in general they are something an impl may conceivably want to replace, and I'd missed the indirect references hidden in the ARM_CSPMU_{EVENT,FORMAT}_ATTR() macros, which are of course used directly by nvidia_cspmu. So the exports do make sense.

static struct attribute *arm_cspmu_format_attrs[] = {
ARM_CSPMU_FORMAT_EVENT_ATTR,
ARM_CSPMU_FORMAT_FILTER_ATTR,
@@ -379,27 +355,12 @@ static struct attribute_group
arm_cspmu_cpumask_attr_group = {
.attrs = arm_cspmu_cpumask_attrs,
};

-struct impl_match {
- u32 pmiidr;
- u32 mask;
- int (*impl_init_ops)(struct arm_cspmu *cspmu);
-};
-
-static const struct impl_match impl_match[] = {
- {
- .pmiidr = ARM_CSPMU_IMPL_ID_NVIDIA,
- .mask = ARM_CSPMU_PMIIDR_IMPLEMENTER,
- .impl_init_ops = nv_cspmu_init_ops
- },
- {}
-};
-
static int arm_cspmu_init_impl_ops(struct arm_cspmu *cspmu)
{
- int ret;
+ int ret = 0;
struct acpi_apmt_node *apmt_node = cspmu->apmt_node;
struct arm_cspmu_impl_ops *impl_ops = &cspmu->impl.ops;
- const struct impl_match *match = impl_match;
+ const struct arm_cspmu_impl_module *match;

/*
* Get PMU implementer and product id from APMT node.
@@ -411,18 +372,21 @@ static int arm_cspmu_init_impl_ops(struct
arm_cspmu *cspmu)
readl(cspmu->base0 + PMIIDR);

/* Find implementer specific attribute ops. */
- for (; match->pmiidr; match++) {
- const u32 mask = match->mask;
-
- if ((match->pmiidr & mask) == (cspmu->impl.pmiidr & mask)) {
- ret = match->impl_init_ops(cspmu);
- if (ret)
- return ret;
-
- break;
+ match = arm_cspmu_impl_match_module(cspmu->impl.pmiidr);
+ if (match) {
+ request_module(match->name);

Are we confident this can't deadlock when it's already in the middle of
loading the main module?


The backend module does not depend on the main driver module anymore
(please see my top comment). The blocking call to request_module should be
able to return.

Yeah, it just surprises me that loading a module synchronously in the middle of loading another module would actually work. I started trying to test it out under lockdep to reassure myself, but that just found that my dev board already has a locking issue in the UART driver :(

[...]
+void arm_cspmu_impl_unregister(const struct arm_cspmu_impl_param
*impl_param)
+{
+ struct arm_cspmu_impl_module *module;
+
+ mutex_lock(&arm_cspmu_lock);
+
+ module = arm_cspmu_impl_find_module(impl_param);
+ if (module) {

I think it's reasonable to have a usage model where unregister should
only be called if register succeeded, and thus we can assume this lookup
never fails. That certainly fits if the expectation is that
register/unregister are tied to module_init/module_exit.


Yup, that is the expectation. It is still good to validate the module pointer right ?
Or do you think it will hide a bug, if any ?

If it constitutes an egregious programming error to attempt to unregister something which was never registered, such that this pointer could never legitimately be NULL, then it should suffice to validate the pointer naturally by dereferencing it.

Thanks,
Robin.