Re: [PATCH 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

From: Herve Codina
Date: Fri Feb 23 2024 - 04:50:12 EST


Hi Saravana, Nuno,

On Tue, 20 Feb 2024 16:37:05 -0800
Saravana Kannan <saravanak@xxxxxxxxxx> wrote:

..
> > @@ -1202,6 +1202,12 @@ int of_overlay_remove(int *ovcs_id)
> > goto out;
> > }
> >
> > + /*
> > + * Wait for any ongoing device link removals before removing some of
> > + * nodes
> > + */
> > + device_link_wait_removal();
> > +
>
> Nuno in his patch[1] had this "wait" happen inside
> __of_changeset_entry_destroy(). Which seems to be necessary to not hit
> the issue that Luca reported[2] in this patch series. Is there any
> problem with doing that?

Is it the right place to wait?

__of_changeset_entry_destroy() can do some of_node_put() calls, and I am not
sure whether of_node_put() can call device_put() when the of_node refcount
reaches zero.

If of_node_put() cannot call device_put(), I think we can wait in
of_changeset_destroy(), i.e. the caller of __of_changeset_entry_destroy().
https://elixir.bootlin.com/linux/v6.8-rc1/source/drivers/of/dynamic.c#L670
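
To illustrate the placement, here is a minimal user-space sketch, not kernel
code. Every *_model name is a hypothetical stand-in for the function named in
its comment, and the sketch assumes the wait simply drains whatever removals
are pending. It walks a plain forward list for simplicity, which is not how
the real changeset entries are stored. The point is only that the wait happens
once in the caller, before any entry is destroyed.

```c
#include <assert.h>
#include <stddef.h>

/* Models the number of device link removals still queued (assumption). */
static int pending_link_removals;
/* Counts how many times the wait was performed. */
static int wait_calls;

/* Hypothetical stand-in for device_link_wait_removal(). */
static void wait_removal_model(void)
{
	wait_calls++;
	pending_link_removals = 0;	/* all queued removals have completed */
}

/* Hypothetical stand-in for a changeset entry. */
struct entry_model {
	struct entry_model *next;
	int destroyed;
};

/* Hypothetical stand-in for __of_changeset_entry_destroy(): no wait here. */
static void entry_destroy_model(struct entry_model *ce)
{
	/* Invariant we want: nothing is pending when an entry is freed. */
	assert(pending_link_removals == 0);
	ce->destroyed = 1;
}

/* Hypothetical stand-in for of_changeset_destroy(): wait once, then free. */
static void changeset_destroy_model(struct entry_model *head)
{
	struct entry_model *ce, *next;

	wait_removal_model();
	for (ce = head; ce; ce = next) {
		next = ce->next;
		entry_destroy_model(ce);
	}
}
```

With this placement the wait is done once per changeset rather than once per
entry, and no entry is destroyed while a removal is still pending.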

What do you think about this?
Does it make sense?

>
> Luca for some reason did a unlock/lock(of_mutex) in his test patch and
> I don't think that's necessary.
>
> Can you move this call to where Nuno did it and see if that works for
> all of you?

I will check.

Best regards,
Hervé