Re: [PATCH] net: phy: mdio_bus: add missing device_del() in mdiobus_register() error handling

From: Andrew Lunn
Date: Thu Feb 14 2019 - 20:23:18 EST


On Mon, Feb 11, 2019 at 03:31:59PM +0100, Thomas Petazzoni wrote:
> Even if DaveM already merged my simple fix, I had a further look at
> whether we should be using device_unregister(), and in fact we should
> not, but not really for a good reason: because the mdio API is not very
> symmetrical.
>
> The typical flow is:
>
> probe() {
> bus = mdiobus_alloc();
> if (!bus)
> return -ENOMEM;
>
> ret = mdiobus_register(&bus);
> if (ret) {
> mdiobus_free(bus);
>
> ...
> }
>
> remove() {
> mdiobus_unregister();
> mdiobus_free();
> }
>
> mdiobus_alloc() only does memory allocation, i.e it has no side effects
> on the device model data structures.
>
> mdiobus_register() does a device_register(). If it fails, it only
> cleans up with a device_del(), i.e it doesn't do the put_device() that
> it should do to fully "undo" its effect.
>
> mdiobus_unregister() does a device_del(), i.e it also doesn't do the
> opposite of mdiobus_register(), which should be device_del() +
> put_device() (device_unregister() is a shortcut for both).
>
> mdiobus_free() does the put_device()

Hi Thomas

You made some simplifications here. There is a state variable involved
as well. So if you do mdiobus_alloc() ; mdiobus_free(), it will not do
a put_device() but actually call free(). In that case, it is
symmetrical.

>
> So:
>
> * mdiobus_alloc() / mdiobus_free() are not symmetrical in terms of
> their interaction with the device model data structures
>
> * On error, mdiobus_register() leaves a non-zero reference count to the
> bus->dev structure, which will be freed up by mdiobus_free()
>
> * mdiobus_unregister() leaves a non-zero reference count to the
> bus->dev structure, which will be freed up by mdiobus_free()
>
> So, if we were to use device_unregister() in the error path of
> mdiobus_register() and in mdiobus_unregister(), it would break how
> mdiobus_free() works.

I compared mdiobus with alloc_netdev(), register_netdev(),
unregister_netdev() and free_netdev(). The code is actually very
similar, both have a state variable indicating UNITITIALIZED,
UNREGISTERED, REGISTERED, RELEASED, etc. Both have asymmetric
operations.

I think the real issue here is that you cannot destroy the memory
until all references to it are released. So free_netdev() /
mdiobus_free() cannot actual use free() if the device has been
registered so there could be references to it. The free() has to
happen as part of the put_device() once the reference count reaches
zero. If you were to do the put_device() in mdiobus_unregister()
call, by the time you called mdiobus_free(), the structure could of
already been freed, so you cannot look at the state variable to know
if it was ever registered. If it was never registered, you do need to
free it.

So the internals are asymmetric. Which is messy. But the usage of the
API by a driver is symmetric. That i think is more important.

Andrew