Re: [PATCH 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals
From: Nuno Sá
Date: Fri Feb 23 2024 - 05:33:08 EST
On Fri, 2024-02-23 at 10:45 +0100, Herve Codina wrote:
> Hi Saravana, Nuno,
>
> On Tue, 20 Feb 2024 16:37:05 -0800
> Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> ...
> > > @@ -1202,6 +1202,12 @@ int of_overlay_remove(int *ovcs_id)
> > > goto out;
> > > }
> > >
> > > + /*
> > > > + * Wait for any ongoing device link removals before removing some of
> > > > + * nodes
> > > + */
> > > + device_link_wait_removal();
> > > +
> >
> > Nuno in his patch[1] had this "wait" happen inside
> > __of_changeset_entry_destroy(). Which seems to be necessary to not hit
> > the issue that Luca reported[2] in this patch series. Is there any
> > problem with doing that?
>
> Is it the right place to wait ?
>
> __of_changeset_entry_destroy() can do some of_node_put() and I am not sure
> that of_node_put() can call device_put() when the of_node refcount reaches
> zero.
>
I don't think of_node_put() can call device_put(). At least by looking at:
https://elixir.bootlin.com/linux/v6.8-rc5/source/drivers/of/dynamic.c#L326
> If of_node_put() cannot call device_put(), I think we can wait in the
> of_changeset_destroy(). I.e. the __of_changeset_entry_destroy() caller.
> https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/of/dynamic.c#L670
>
> What do you think about this ?
> Does it make sense ?
I think it makes sense from a logical point of view. Like, let's flush the queue
right before checking our assumptions...
In my tests, I did not see any issues (hopefully I was not missing any subtlety).
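
To illustrate the "flush the queue right before checking our assumptions" ordering, here is a small userspace model. It is only a sketch and not kernel code: every model_* name is hypothetical. model_wait_removal() stands in for device_link_wait_removal(), and the "refcount == 1" test is my assumption of what the check in __of_changeset_entry_destroy() looks like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device tree node with a reference count. */
struct model_node {
	int refcount;
};

#define MODEL_MAX_PENDING 8

/* Deferred reference drops, modelling device link removals still queued. */
static struct model_node *model_pending[MODEL_MAX_PENDING];
static size_t model_npending;

/* Queue a reference drop instead of performing it immediately. */
static void model_queue_put(struct model_node *n)
{
	assert(model_npending < MODEL_MAX_PENDING);
	model_pending[model_npending++] = n;
}

/* Stand-in for device_link_wait_removal(): run every queued drop now. */
static void model_wait_removal(void)
{
	for (size_t i = 0; i < model_npending; i++)
		model_pending[i]->refcount--;
	model_npending = 0;
}

/* Assumed check at entry destroy time: only our own reference is left. */
static bool model_entry_destroy_ok(const struct model_node *n)
{
	return n->refcount == 1;
}

/* Stand-in for of_changeset_destroy(): optionally flush, then check. */
static bool model_changeset_destroy(const struct model_node *n, bool flush_first)
{
	if (flush_first)
		model_wait_removal();
	return model_entry_destroy_ok(n);
}
```

With a node holding one extra reference whose drop is still queued, model_changeset_destroy(&n, false) fails the check, while model_changeset_destroy(&n, true) passes it, because the flush runs before the refcount is looked at.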
- Nuno Sá