Re: [PATCH v6 1/2] driver core: Introduce device_link_wait_removal()

From: Herve Codina
Date: Tue Mar 26 2024 - 03:56:41 EST


On Mon, 25 Mar 2024 19:37:10 +0100
Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:

> On Mon, Mar 25, 2024 at 04:21:25PM +0100, Herve Codina wrote:
> > The commit 80dd33cf72d1 ("drivers: base: Fix device link removal")
> > introduces a workqueue to release the consumer and supplier devices used
> > in the devlink.
> > In the job queued, devices are release and in turn, when all the
> > references to these devices are dropped, the release function of the
> > device itself is called.
> >
> > Nothing is present to provide some synchronisation with this workqueue
> > in order to ensure that all ongoing releasing operations are done and
> > so, some other operations can be started safely.
> >
> > For instance, in the following sequence:
> > 1) of_platform_depopulate()
> > 2) of_overlay_remove()
> >
> > During the step 1, devices are released and related devlinks are removed
> > (jobs pushed in the workqueue).
> > During the step 2, OF nodes are destroyed but, without any
> > synchronisation with devlink removal jobs, of_overlay_remove() can raise
> > warnings related to missing of_node_put():
> > ERROR: memory leak, expected refcount 1 instead of 2
> >
> > Indeed, the missing of_node_put() call is going to be done, too late,
> > from the workqueue job execution.
> >
> > Introduce device_link_wait_removal() to offer a way to synchronize
> > operations waiting for the end of devlink removals (i.e. end of
> > workqueue jobs).
> > Also, as a flushing operation is done on the workqueue, the workqueue
> > used is moved from a system-wide workqueue to a local one.
> >
> > Cc: stable@xxxxxxxxxxxxxxx
>
> Why is this for stable? You are just adding a new api, no one is using
> it.
>
> Or if they are, you didn't send me that patch...

The patch 2 in this current series uses the new api.

Best regards,
Hervé