Re: [PATCH v4 05/14] coresight: get/put module in coresight_build/release_path

From: Suzuki K Poulose
Date: Thu Jun 07 2018 - 05:04:40 EST


Hi Greg,

On 06/07/2018 09:34 AM, Greg Kroah-Hartman wrote:
On Wed, Jun 06, 2018 at 03:55:01PM -0500, Kim Phillips wrote:
On Wed, 6 Jun 2018 10:46:36 +0100
Suzuki K Poulose <suzuki.poulose@xxxxxxx> wrote:

On 06/06/2018 09:24 AM, Greg Kroah-Hartman wrote:
On Tue, Jun 05, 2018 at 04:07:01PM -0500, Kim Phillips wrote:
Increment the refcnt for driver modules in current use by calling
try_module_get in coresight_build_path and module_put in
coresight_release_path.

This prevents driver modules from being unloaded when they are in use,
either in sysfs or perf mode.

Why does it matter? Shouldn't you be allowed to remove any module at
any point in time, much like a networking driver?

The user doesn't have an explicit refcount on the individual components
in a trace session. So, when a trace session is in progress, it is as
good as having a "file" open on each component that is part of the
active trace session. So, we don't want the driver to be removed when
the component is being used in the trace collection. This will be
released as soon as the session is ended. It is just like a PMU driver
where the module refcount is held to ensure the module stays until the
session is over. In this case, we have multiple components, each with
its own driver invisible to the PMU driver. Hence the coresight driver
must hold the reference.
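
To make the argument above concrete, here is a minimal userspace sketch of a trace session pinning every component on its path. It is not the coresight code: struct fake_module and the fake_*/session_* helpers are hypothetical stand-ins that only mimic the contract of the kernel's try_module_get() and module_put().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver module; not the real struct module. */
struct fake_module {
    int refcnt;   /* number of active users */
    bool going;   /* set once unload has started */
};

/* Mimics try_module_get(): refuse new users once unload has begun. */
static bool fake_try_get(struct fake_module *m)
{
    if (m->going)
        return false;
    m->refcnt++;
    return true;
}

/* Mimics module_put(): drop one reference. */
static void fake_put(struct fake_module *m)
{
    assert(m->refcnt > 0);
    m->refcnt--;
}

/* Unload succeeds only when nobody holds a reference. */
static bool fake_unload(struct fake_module *m)
{
    if (m->refcnt > 0)
        return false;
    m->going = true;
    return true;
}

/* Pin every component on the path; undo the pins taken so far on failure. */
static int session_build_path(struct fake_module **path, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (!fake_try_get(path[i])) {
            while (i-- > 0)
                fake_put(path[i]);
            return -1;
        }
    }
    return 0;
}

/* End of session: every component becomes unloadable again. */
static void session_release_path(struct fake_module **path, size_t n)
{
    for (size_t i = 0; i < n; i++)
        fake_put(path[i]);
}
```

In this model an unload attempt on any component fails between session_build_path() and session_release_path(), which is the behaviour the patch description asks for.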




Cc: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
Cc: Leo Yan <leo.yan@xxxxxxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: Randy Dunlap <rdunlap@xxxxxxxxxxxxx>
Cc: Suzuki K Poulose <Suzuki.Poulose@xxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Russell King <linux@xxxxxxxxxxxxxxx>
Signed-off-by: Kim Phillips <kim.phillips@xxxxxxx>
---
drivers/hwtracing/coresight/coresight.c | 9 +++++++++
1 file changed, 9 insertions(+)

diff --git a/drivers/hwtracing/coresight/coresight.c b/drivers/hwtracing/coresight/coresight.c
index 338f1719641c..1c941351f1d1 100644
--- a/drivers/hwtracing/coresight/coresight.c
+++ b/drivers/hwtracing/coresight/coresight.c
@@ -465,6 +465,12 @@ static int _coresight_build_path(struct coresight_device *csdev,
node->csdev = csdev;
list_add(&node->link, path);
+
+ if (!try_module_get(csdev->dev.parent->driver->owner)) {

What is to keep parent->driver from going away right here? What keeps
parent around? This feels very fragile to me, I don't see any locking
anywhere around this code path to try to keep things in place.

You're right. We do have coresight_mutex, which is held across the build
path and the csdev is removed when a device is unregistered. However, I
see that we don't hold the mutex while removing the connections from
coresight_unregister(). Holding the mutex should protect us from the
csdev being removed, while we build the path.

OK, I'll add this for the next version:

diff --git a/drivers/hwtracing/coresight/coresight-core.c b/drivers/hwtracing/coresight/coresight-core.c
index f96258de1e9b..da702507a55c 100644
--- a/drivers/hwtracing/coresight/coresight-core.c
+++ b/drivers/hwtracing/coresight/coresight-core.c
@@ -1040,8 +1040,12 @@ EXPORT_SYMBOL_GPL(coresight_register);
void coresight_unregister(struct coresight_device *csdev)
{
+ mutex_lock(&coresight_mutex);
+

Locks are to protect data, not code, be careful here please.

The mutex here is to protect updates to the device links. We
keep a list of connections from each device to form a trace path.
When we unregister a device, we must remove the references to the
device from all the other connected components to ensure they don't
end up accessing a device which is gone.
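
As a rough illustration of the point about the lock guarding the link data, the following userspace sketch uses a pthread mutex in place of coresight_mutex. The struct fake_csdev layout, MAX_CONNS, and both helper functions are assumptions made for the example, not the real coresight structures.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CONNS 4

/* Hypothetical, simplified device; not the real struct coresight_device. */
struct fake_csdev {
    struct fake_csdev *conns[MAX_CONNS]; /* outgoing links to other devices */
};

/* One lock guards every conns[] array, i.e. the link data itself. */
static pthread_mutex_t links_lock = PTHREAD_MUTEX_INITIALIZER;

/* Clear every link that points at 'gone'; returns how many were cleared. */
static int fake_unregister(struct fake_csdev **all, size_t n,
                           struct fake_csdev *gone)
{
    int cleared = 0;

    assert(gone != NULL);
    pthread_mutex_lock(&links_lock);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < MAX_CONNS; j++) {
            if (all[i]->conns[j] == gone) {
                all[i]->conns[j] = NULL;
                cleared++;
            }
        }
    }
    pthread_mutex_unlock(&links_lock);
    return cleared;
}

/* A reader takes the same lock, so it never sees a half-updated link set. */
static int fake_count_links(struct fake_csdev *dev)
{
    int count = 0;

    pthread_mutex_lock(&links_lock);
    for (size_t j = 0; j < MAX_CONNS; j++)
        if (dev->conns[j])
            count++;
    pthread_mutex_unlock(&links_lock);
    return count;
}
```

Both the writer and the reader touch conns[] only with links_lock held, so the lock is tied to the data rather than to a particular code path.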


That's the big issue with the module reference counting, it "protects"
code, not data. If at all possible, never grab a module reference
count, as you should always be able to unload a module, unless you have
a file handle open, and if you have that, the kernel core will properly
protect you.

So in a nutshell, we have user-invisible components which cannot be
refcounted explicitly by the file handles, and thus the driver must
do it.

Now, one option we could explore is getting the refcount on the
devices themselves, rather than on the drivers, for trace sessions.
Each device could potentially hold a refcount on the driver (which I
assume is already held), which can be dropped when the device is no
longer used. That would let us get rid of the reference on the module
everywhere.
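
A minimal userspace sketch of that option, assuming each device takes a single reference on its driver at registration and drops it when its own count reaches zero. All names here (struct fake_module, struct fake_device, fake_device_*) are hypothetical; in the kernel the device-side calls would be get_device() and put_device().

```c
#include <assert.h>

/* Hypothetical stand-ins; not the real struct module or struct device. */
struct fake_module {
    int refcnt;
};

struct fake_device {
    int refcnt;
    struct fake_module *owner;
};

/* Registration owns one device reference; the device pins its driver once. */
static void fake_device_register(struct fake_device *dev,
                                 struct fake_module *owner)
{
    dev->owner = owner;
    dev->refcnt = 1;
    owner->refcnt++;
}

/* A trace session takes a reference on the device, not on the module. */
static void fake_device_get(struct fake_device *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt++;
}

/* The last put releases the device's single reference on its driver. */
static void fake_device_put(struct fake_device *dev)
{
    assert(dev->refcnt > 0);
    if (--dev->refcnt == 0)
        dev->owner->refcnt--;
}
```

With this arrangement the session code never touches the module count directly: the driver stays pinned for as long as any session still holds the device, even after the registration reference is dropped.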

Thoughts? Suggestions?

Suzuki



thanks,

greg k-h