Re: [PATCH v5 13/13] coresight: Fix CTI module refcount leak by making it a helper device

From: James Clark
Date: Tue Apr 25 2023 - 10:41:39 EST




On 24/04/2023 14:22, Suzuki K Poulose wrote:
> On 24/04/2023 12:09, James Clark wrote:
>>
>>
>> On 24/04/2023 11:43, Suzuki K Poulose wrote:
>>> On 04/04/2023 16:51, James Clark wrote:
>>>> The CTI module has some hard coded refcounting code that has a leak.
>>>> For example running perf and then trying to unload it fails:
>>>>
>>>>     perf record -e cs_etm// -a -- ls
>>>>     rmmod coresight_cti
>>>>
>>>>     rmmod: ERROR: Module coresight_cti is in use
>>>>
>>>> The coresight core already handles references of devices in use, so by
>>>> making CTI a normal helper device, we get working refcounting for free.
>>>>
>>>> Signed-off-by: James Clark <james.clark@xxxxxxx>
>>>> ---
>>>>    drivers/hwtracing/coresight/coresight-core.c  | 104
>>>> ++++++------------
>>>>    .../hwtracing/coresight/coresight-cti-core.c  |  52 +++++----
>>>>    .../hwtracing/coresight/coresight-cti-sysfs.c |   4 +-
>>>>    drivers/hwtracing/coresight/coresight-cti.h   |   4 +-
>>>>    drivers/hwtracing/coresight/coresight-priv.h  |   4 +-
>>>>    drivers/hwtracing/coresight/coresight-sysfs.c |   4 +
>>>>    include/linux/coresight.h                     |  30 +----
>>>>    7 files changed, 75 insertions(+), 127 deletions(-)
>>>>
>>>> diff --git a/drivers/hwtracing/coresight/coresight-core.c
>>>> b/drivers/hwtracing/coresight/coresight-core.c
>>>> index 16689fe4ba98..2af416bba983 100644
>>>> --- a/drivers/hwtracing/coresight/coresight-core.c
>>>> +++ b/drivers/hwtracing/coresight/coresight-core.c
>>>> @@ -236,60 +236,44 @@ void coresight_disclaim_device(struct
>>>> coresight_device *csdev)
>>>>    }
>>>>    EXPORT_SYMBOL_GPL(coresight_disclaim_device);
>>>>    -/* enable or disable an associated CTI device of the supplied CS
>>>> device */
>>>> -static int
>>>> -coresight_control_assoc_ectdev(struct coresight_device *csdev, bool
>>>> enable)
>>>> +/*
>>>> + * Add a helper as an output device. This function takes the
>>>> @coresight_mutex
>>>> + * because it's assumed that it's called from the helper device,
>>>> outside of the
>>>> + * core code where the mutex would already be held. Don't add new
>>>> calls to this
>>>> + * from inside the core code, instead try to add the new helper to
>>>> the DT and
>>>> + * ACPI where it will be picked up and linked automatically.
>>>> + */
>>>> +void coresight_add_helper(struct coresight_device *csdev,
>>>> +              struct coresight_device *helper)
>>>>    {
>>>> -    int ect_ret = 0;
>>>> -    struct coresight_device *ect_csdev = csdev->ect_dev;
>>>> -    struct module *mod;
>>>> +    int i;
>>>> +    struct coresight_connection conn = {};
>>>> +    struct coresight_connection *new_conn;
>>>>    -    if (!ect_csdev)
>>>> -        return 0;
>>>> -    if ((!ect_ops(ect_csdev)->enable) ||
>>>> (!ect_ops(ect_csdev)->disable))
>>>> -        return 0;
>>>> +    mutex_lock(&coresight_mutex);
>>>> +    conn.dest_fwnode = fwnode_handle_get(dev_fwnode(&helper->dev));
>>>> +    conn.dest_dev = helper;
>>>> +    conn.dest_port = conn.src_port = -1;
>>>> +    conn.src_dev = csdev;
>>>>    -    mod = ect_csdev->dev.parent->driver->owner;
>>>> -    if (enable) {
>>>> -        if (try_module_get(mod)) {
>>>> -            ect_ret = ect_ops(ect_csdev)->enable(ect_csdev);
>>>> -            if (ect_ret) {
>>>> -                module_put(mod);
>>>> -            } else {
>>>> -                get_device(ect_csdev->dev.parent);
>>>> -                csdev->ect_enabled = true;
>>>> -            }
>>>> -        } else
>>>> -            ect_ret = -ENODEV;
>>>> -    } else {
>>>> -        if (csdev->ect_enabled) {
>>>> -            ect_ret = ect_ops(ect_csdev)->disable(ect_csdev);
>>>> -            put_device(ect_csdev->dev.parent);
>>>> -            module_put(mod);
>>>> -            csdev->ect_enabled = false;
>>>> -        }
>>>> -    }
>>>> +    /*
>>>> +     * Check for duplicates because this is called every time a helper
>>>> +     * device is re-loaded. Existing connections will get re-linked
>>>> +     * automatically.
>>>> +     */
>>>> +    for (i = 0; i < csdev->pdata->nr_outconns; ++i)
>>>> +        if (csdev->pdata->out_conns[i]->dest_fwnode ==
>>>> conn.dest_fwnode)
>>>> +            goto unlock;
>>>>    -    /* output warning if ECT enable is preventing trace
>>>> operation */
>>>> -    if (ect_ret)
>>>> -        dev_info(&csdev->dev, "Associated ECT device (%s) %s
>>>> failed\n",
>>>> -             dev_name(&ect_csdev->dev),
>>>> -             enable ? "enable" : "disable");
>>>> -    return ect_ret;
>>>> -}
>>>> +    new_conn =
>>>> +        coresight_add_out_conn(csdev->dev.parent, csdev->pdata,
>>>> &conn);
>>>
>>> ultra minor nit:
>>>      new_conn = coresight_add_out_conn(....,
>>>                        .... );
>>
>> This whole patchset is now formatted with the kernel clang-format rules.
>> Are you sure this one is against the conventions?
>
> It is not against convention, but there are no hard line rules for
> these.
>
> The only suggestion is to split the lines sensibly with
> readability stressed.
>
> https://www.kernel.org/doc/html/latest/process/coding-style.html#breaking-long-lines-and-strings
>
> "Statements longer than 80 columns should be broken into sensible
> chunks, unless exceeding 80 columns significantly increases readability
> and does not hide information.
>
> Descendants are always substantially shorter than the parent and are
> placed substantially to the right. A very commonly used style is to
> align descendants to a function open parenthesis."
>
>
> I personally find:
>
>     result = rather_long_function_statement(arg1, arg2,
>                             ........);
>
> far more readable than:
>
>     result =
>         rather_long_function_statement(.....);
>
>>
>> The problem is running the formatter on all changed lines makes it
>> almost impossible to go back and undo indents like this.
>
> Haven't used it, but it does seem to say it may not be perfect ;-).
> That said, I am not too strict about this. You may leave it unchanged
> if it is painful.
>
> Suzuki
>

Upon further inspection, I think it might actually be a bug in
clang-format. When only the ");" falls over the column limit, it doesn't
know that it needs to wrap the previous token to stay within the rules.
Or something like that.

I'll probably leave that debugging rabbit hole for another time. Anyway,
I've fixed this one in v6.
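
For anyone following along, here is a minimal sketch of the two layouts
being compared. rather_long_function_statement() is a hypothetical
stand-in borrowed from the example above, not a kernel function, and the
argument names are made up so that the one-line form lands just past 80
columns (assuming 8-column tabs). It only puts the two layouts side by
side; it does not try to reproduce the exact clang-format behaviour.

```c
#include <assert.h>

/* Hypothetical stand-in for the call being wrapped; not a kernel function. */
static int rather_long_function_statement(int first_argument, int second_value,
					  int third)
{
	return first_argument + second_value + third;
}

/*
 * Layout seen in the patch: break after '=' and put the whole call on
 * its own line. On a single line this statement would be 81 columns.
 */
static int wrap_after_assignment(int first_argument, int second_value)
{
	int result;

	result =
		rather_long_function_statement(first_argument, second_value, 3);
	return result;
}

/*
 * Layout suggested in review: keep the call next to '=' and align the
 * wrapped argument with the open parenthesis.
 */
static int wrap_at_open_parenthesis(int first_argument, int second_value)
{
	int result;

	result = rather_long_function_statement(first_argument, second_value,
						3);
	return result;
}

/* The two layouts are the same statement, so the results must match. */
static void check_layouts_match(void)
{
	assert(wrap_after_assignment(1, 2) == wrap_at_open_parenthesis(1, 2));
}
```

Both functions compile to the same call, so the choice is purely about
readability, which is why the coding-style document leaves it to
judgement rather than a hard rule.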