Re: [PATCH v3 2/2] of: overlay: Synchronize of_overlay_remove() with the devlink removals

From: Nuno Sá
Date: Tue Mar 05 2024 - 02:35:16 EST


On Mon, 2024-03-04 at 22:47 -0800, Saravana Kannan wrote:
> On Mon, Mar 4, 2024 at 8:49 AM Herve Codina <herve.codina@bootlin.com> wrote:
> >
> > Hi Rob,
> >
> > On Mon, 4 Mar 2024 09:22:02 -0600
> > Rob Herring <robh@xxxxxxxxxx> wrote:
> >
> > ...
> >
> > > > > @@ -853,6 +854,14 @@ static void free_overlay_changeset(struct
> > > > > overlay_changeset *ovcs)
> > > > >  {
> > > > >   int i;
> > > > >
> > > > > + /*
> > > > > +  * Wait for any ongoing device link removals before removing some of
> > > > > +  * nodes. Drop the global lock while waiting
> > > > > +  */
> > > > > + mutex_unlock(&of_mutex);
> > > > > + device_link_wait_removal();
> > > > > + mutex_lock(&of_mutex);
> > > >
> > > > I'm still not convinced we need to drop the lock. What happens if
> > > > someone else
> > > > grabs the lock while we are in device_link_wait_removal()? Can we
> > > > guarantee that
> > > > we can't screw things badly?
> > >
> > > It is also just ugly because it's the callers of
> > > free_overlay_changeset() that hold the lock and now we're releasing it
> > > behind their back.
> > >
> > > As device_link_wait_removal() is called before we touch anything, can't
> > > it be called before we take the lock? And do we need to call it if
> > > applying the overlay fails?
>
> Rob,
>
> This[1] scenario Luca reported seems like a reason for the
> device_link_wait_removal() to be where Herve put it. That example
> seems reasonable.
>
> [1] - https://lore.kernel.org/all/20231220181627.341e8789@booty/
>

I'm still not totally convinced about that. Why not put the flush right
before the kref check in __of_changeset_entry_destroy()? I'll contradict
myself a bit because this is just theory, but if we look at pci_stop_dev(),
which AFAIU can be reached from a sysfs write(), we have:

device_release_driver(&dev->dev);
..
of_pci_remove_node(dev);
of_changeset_revert(np->data);
of_changeset_destroy(np->data);

So looking at the above, we would hit the same issue if we only flush the queue
in free_overlay_changeset(): this path doesn't go through it, so the queue won't
be flushed at all, and we could still have a pending devlink removal due to
device_release_driver(). Right?

Again, this is completely theoretical, but it seems like a reasonable scenario.
Plus, I don't understand the push against having the flush in
__of_changeset_entry_destroy(). Conceptually, it looks like the best place to
me, but I may be missing some issue with doing it there?
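
To make the ordering argument concrete, here is a tiny userspace model (not
kernel code). All the model_* names are made up for illustration, and the
refcount check is only my reading of what __of_changeset_entry_destroy() does
with the node's kref. It just shows that flushing right before the check makes
the result independent of which caller got us there:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: none of these are the kernel's real types. */
struct model_node {
	int refcount;	/* stands in for the node's kref */
};

#define MODEL_MAX_PENDING 8

/* Deferred "device link removal" work: each entry will drop one ref. */
static struct model_node *model_pending[MODEL_MAX_PENDING];
static size_t model_n_pending;

/* Models a devlink release being deferred to a workqueue. */
static bool model_queue_link_removal(struct model_node *np)
{
	if (model_n_pending == MODEL_MAX_PENDING)
		return false;
	model_pending[model_n_pending++] = np;
	return true;
}

/* Models device_link_wait_removal(): run all deferred work to completion. */
static void model_wait_removal(void)
{
	for (size_t i = 0; i < model_n_pending; i++)
		model_pending[i]->refcount--;
	model_n_pending = 0;
}

/*
 * Models the refcount check done when destroying a changeset entry.
 * Returns true when only the changeset's own reference is left, i.e. the
 * node can be released without complaining about a leak.
 */
static bool model_entry_destroy(struct model_node *np, bool flush_first)
{
	if (flush_first)
		model_wait_removal();
	return np->refcount == 1;
}
```

With a link removal still queued, model_entry_destroy(np, false) reports a
leftover reference while model_entry_destroy(np, true) does not, whatever the
call path was.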

> > >
> >
> > Indeed, having device_link_wait_removal() is not needed when applying the
> > overlay fails.
> >
> > I can call device_link_wait_removal() from the caller of_overlay_remove()
> > but not before the lock is taken.
> > We need to call it between __of_changeset_revert_notify() and
> > free_overlay_changeset() and so, the lock is taken.
> >
> > This leads to the following sequence:
> > --- 8< ---
> > int of_overlay_remove(int *ovcs_id)
> > {
> >         ...
> >         mutex_lock(&of_mutex);
> >         ...
> >
> >         ret = __of_changeset_revert_notify(&ovcs->cset);
> >         ...
> >
> >         ret_tmp = overlay_notify(ovcs, OF_OVERLAY_POST_REMOVE);
> >         ...
> >
> >         mutex_unlock(&of_mutex);
> >         device_link_wait_removal();
> >         mutex_lock(&of_mutex);
> >
> >         free_overlay_changeset(ovcs);
> >         ...
> >         mutex_unlock(&of_mutex);
> >         ...
> > }
> > --- 8< ---
> >
> > In this sequence, the question is:
> > Do we need to release the mutex lock while device_link_wait_removal() is
> > called ?
>
> In general I hate these kinds of sequences that release a lock and
> then grab it again quickly. It's not always a bug, but my personal
> take on that is 90% of these introduce a bug.
>
> Drop the unlock/lock and we'll deal with a deadlock if we actually hit one.
> I'm also fairly certain that device_link_wait_removal() can't trigger
> something else that can cause an OF overlay change while we are in the
> middle of one. And like Rob said, I'm not sure this unlock/lock is a
> good solution for that anyway.

Totally agree. Unless we really see a deadlock, this is a very bad idea (IMHO).
Even in the PCI code, it seems to me that we're never destroying a changeset
from a device/kobj_type release callback. That would be super weird, right?
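
For reference, the generic hazard with the unlock/lock pattern can be shown
with a toy, single-threaded model (again not kernel code; every toy_* name is
hypothetical, and the "window" hook just stands in for another context running
while the lock is dropped). I'm not claiming this can actually happen with
overlays, only that the caller's earlier checks stop being trustworthy:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; nothing here is kernel API. */
struct toy_mutex { bool held; };
struct toy_overlay { bool present; };

static void toy_acquire(struct toy_mutex *m) { assert(!m->held); m->held = true; }
static void toy_release(struct toy_mutex *m) { assert(m->held); m->held = false; }

/* Hook that models another context running while the lock is dropped. */
typedef void (*toy_window_fn)(struct toy_mutex *m, struct toy_overlay *o);

/* Models a concurrent overlay removal sneaking into the unlocked window. */
static void toy_concurrent_remove(struct toy_mutex *m, struct toy_overlay *o)
{
	toy_acquire(m);
	o->present = false;
	toy_release(m);
}

/*
 * The caller holds the lock and has already checked o->present.
 * Returns whether that earlier check is still valid after the wait.
 */
static bool toy_wait_with_lock_dropped(struct toy_mutex *m,
				       struct toy_overlay *o,
				       toy_window_fn window)
{
	toy_release(m);
	if (window)
		window(m, o);	/* the "wait" happens here, unlocked */
	toy_acquire(m);
	return o->present;
}
```

If nothing runs in the window, the state is unchanged; as soon as something
does, whatever was validated before the unlock has to be re-validated.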

- Nuno Sá