Re: [PATCH net-next] net: mdio: get/put device node during (un)registration

From: Luiz Angelo Daros de Luca
Date: Tue Jan 02 2024 - 17:03:31 EST


> On Wed, Dec 20, 2023 at 01:52:29AM -0300, Luiz Angelo Daros de Luca wrote:
> > The __of_mdiobus_register() function was storing the device node in
> > dev.of_node without increasing its reference count. It implicitly relied
> > on the caller to maintain the allocated node until the mdiobus was
> > unregistered.
> >
> > Now, __of_mdiobus_register() will acquire the node before assigning it,
> > and of_mdiobus_unregister_callback() will be called at the end of
> > mdio_unregister().
> >
> > Drivers can now release the node immediately after MDIO registration.
> > Some of them are already doing that even before this patch.
> >
> > Signed-off-by: Luiz Angelo Daros de Luca <luizluca@xxxxxxxxx>
>
> I don't like this, certainly not the use of a method prefixed by a
> double-underscore, and neither the conditional nature of "putting"
> this. That alone seems to point to there being more issues.

Thanks Russel.

At least one driver (bcm_sf2_mdio_register) is writing directly to the
mii_bus->dev.of_node and not using of_mdiobus_register(). We should
not put a node in the MDIO bus if the bus didn't get it before. That's
the reason for the conditional putting the node.

I wasn't sure about the names. What would be an appropriate name? The
same without the prefix? In order to put the node only when the bus
was registered by __of_mdiobus_register, I opted for a callback but it
might be a better approach.

> I also notice that netdev have applied this without *any* review from
> phylib maintainers. Grr.

Some reviews are required. Should we revert it?

> Indeed there are more issues with the refcounting here. If one looks at
> drivers/net/phy/mdio_bus.c::of_mdiobus_link_mdiodev(), we find this:
>
> if (addr == mdiodev->addr) {
> device_set_node(dev, of_fwnode_handle(child));
> /* The refcount on "child" is passed to the mdio
> * device. Do _not_ use of_node_put(child) here.
> */
> return;
>
> but there is nowhere that this refcount is dropped.

The same file where we have the get should also contain the put,
ideally in a reverse function like register/unregister. It is too easy
to miss a put that should happen in a different context.
fixed_phy_unregister seems to be one case where it put that node after
phy_device_remove() but I didn't investigate it further if that was
related to a different of_node_get. mdiobus_unregister_device might be
a nice place to fit that put but I'm not an expert in MDIO API.

> Really, the patch should be addressing the problem rather than putting
> a sticky-plaster over just one instance of it.

I'm trying to address an issue I ran into while modifying a DSA
driver. We have drivers putting the node passed to of_mdiobus_register
just after it returns. In my option, it feels more natural and this
patch fixes that scenario.
Other drivers keep that reference until the driver is removed, which
might still be too soon without this patch. I guess putting the node
should happen between mdiobus_unregister and mdiobus_free. If the
driver uses devm variants, it does not control the code between those
two methods and it should just hope that it is enough to put the node
as its last step.

I issue that the child node you pointed to should also be addressed.
However, I think they are two different but related issues. Any place
we see a device_set_node(), we should see a of_node_get before and a
of_node_put when the device is gone.

Regards,

Luiz