[PATCH 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

From: Herve Codina
Date: Thu Nov 30 2023 - 12:41:42 EST


In the following sequence:
1) of_platform_depopulate()
2) of_overlay_remove()

During the step 1, devices are destroyed and devlinks are removed.
During the step 2, OF nodes are destroyed but
__of_changeset_entry_destroy() can raise warnings related to missing
of_node_put():
ERROR: memory leak, expected refcount 1 instead of 2 ...

Indeed, during the devlink removals performed at step 1, the removal
itself releasing the device (and the attached of_node) is done by a job
queued in a workqueue and so, it is done asynchronously with respect to
function calls.
When the warning is present, of_node_put() will be called but wrongly
too late from the workqueue job.

In order to be sure that any ongoing devlink removals are done before
the of_node destruction, synchronize the of_overlay_remove() with the
devlink removals.

Signed-off-by: Herve Codina <herve.codina@xxxxxxxxxxx>
---
drivers/of/overlay.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/of/overlay.c b/drivers/of/overlay.c
index a9a292d6d59b..5c5f808b163e 100644
--- a/drivers/of/overlay.c
+++ b/drivers/of/overlay.c
@@ -1202,6 +1202,12 @@ int of_overlay_remove(int *ovcs_id)
goto out;
}

+ /*
+ * Wait for any ongoing device link removals before removing some of
+ * nodes
+ */
+ device_link_wait_removal();
+
mutex_lock(&of_mutex);

ovcs = idr_find(&ovcs_idr, *ovcs_id);
--
2.42.0