Re: [PATCH net] net: phy: Warn if phy is attached when removing

From: Sean Anderson
Date: Mon Aug 22 2022 - 13:02:40 EST




On 8/22/22 12:32 PM, Russell King (Oracle) wrote:
> On Mon, Aug 22, 2022 at 12:00:28PM -0400, Sean Anderson wrote:
>> In the last thread I posted this snippet:
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index a74b320f5b27..05894e1c3e59 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -27,6 +27,7 @@
>> #include <linux/phy.h>
>> #include <linux/phy_led_triggers.h>
>> #include <linux/property.h>
>> +#include <linux/rtnetlink.h>
>> #include <linux/sfp.h>
>> #include <linux/skbuff.h>
>> #include <linux/slab.h>
>> @@ -3111,6 +3112,13 @@ static int phy_remove(struct device *dev)
>> {
>> struct phy_device *phydev = to_phy_device(dev);
>>
>> + // I'm pretty sure this races with multiple unbinds...
>> + rtnl_lock();
>> + device_unlock(dev);
>> + dev_close(phydev->attached_dev);
>> + device_lock(dev);
>> + rtnl_unlock();
>> + WARN_ON(phydev->attached_dev);
>> +
>> cancel_delayed_work_sync(&phydev->state_queue);
>>
>> mutex_lock(&phydev->lock);
>> ---
>>
>> Would this be acceptable? Can the locking be fixed?
>
> I can't comment on that.

:l

I feel like doing device_unlock is very wrong, but someone in the dev_close
call calls device_lock and it deadlocks.

>> > Addressing the PCS part of the patch posting and unrelated to what we
>> > do for phylib...
>> >
>> > However, I don't see "we'll do this for phylib, so we can do it for
>> > PCS as well" as much of a sane argument - we don't have bazillions
>> > of network drivers to fix to make this work for PCS. We currently
>> > have no removable PCS (they don't appear with a driver so aren't
>> > even bound to anything.) So we should add the appropriate handling
>> > when we start to do this.
>> >
>> > Phylink has the capability to take the link down when something goes
>> > away, and if the PCS goes away, that's exactly what should happen,
>> > rather than oopsing the kernel.
>>
>> Yeah, but we can't just call phylink_stop; we have to call the function
>> which will call phylink_stop, which is different for MAC drivers and
>> for DSA drivers.
>
> I think that's way too much and breaks the phylink design. phylink_stop
> is designed to be called only from ndo_close() - and to be paired with
> phylink_start().

Well, the driver almost certainly wants to bring the link down (so that
it can stop using the PCS) Maybe we just need to call
phylink_run_resolve_and_disable(pl, PHYLINK_DISABLE_STOPPED)?

> When I talk about "taking the link down" what I mean by that is telling
> the network device that the *link* *is* *down* and no more. In other
> words, having phylink_resolve() know that there should be a PCS but it's
> gone, and therefore the link should not come up in its current
> configuration.

Fundamentally the driver is the one which owns the PCS, not phylink. The
driver is perfectly capable of touching the PCS by accident or on purpose,
including calling PCS operations directly. We already have two in-tree
drivers which do this (mt7530 and mvpp2). I also used this method when
conversion dpaa to phylink. In the patch before the conversion, I
switched to the lynx PCS instead of using the existing (ad-hoc) helpers.

So we can't just handle everything in phylink; the driver has to be
involved too. And of course, what happens when the link is brought back
up again? The driver will once again offer the (now bogus) PCS. Will we
have to track dead PCSs forever? What happens if another (valid) PCS gets
allocated at the same address as the old PCS?

>> I think we'd need something like
>>
>> struct phylink_pcs *pcs_get(struct device *dev, const char *id,
>> void (*detach)(struct phylink_pcs *, void *),
>> void *priv)
>>
>> which would also require that pcs_get is called before phylink_start,
>> instead of in probe (which is what every existing user does).
>
> That would at least allow the MAC driver to take action when the PCS
> gets removed.
>
>> This whole thing has me asking the question: why do we allow unbinding
>> in-use devices in the first place?
>
> The driver model was designed that way from the start, because in most
> cases when something is unplugged from the system, the "remove" driver
> callback is just a notification that the device has already gone.
> Failing it makes no sense, because software can't magic the device
> back.
>
> Take an example of a USB device. The user unplugs it, it's gone from
> the system, but the system briefly still believes the device to be
> present for a while. It eventually notices that the device has gone,
> and the USB layer unregisters the struct device - which has the effect
> of unbinding the device from the driver and eventually cleaning up the
> struct device.
>
> This can and does happen with Ethernet PHYs ever since we started
> supporting SFPs with Ethernet PHYs. The same thing is true there -
> you can pull the module at any moment, it will be gone, and the
> system will catch up with its physical disconnection some point later.
> It's no good trying to make ->remove say "no, the device is still in
> use, I won't let you remove it" because there's nothing software can
> do to prevent the going of the device - the device has already
> physically gone.

OK, that clears things up.

> I don't think that's the case with PCS - do we really have any PCS
> that can be disconnected from the system while it's running? Maybe
> ones in a FPGA and the FPGA can be reprogrammed at runtime (yes,
> people have really done this in the past.)

This is actually a pretty good case for PCS drivers, since the MAC
has no idea what kind of PCS it's hooked up to when there's a PCS on
the FPGA.

> If we don't want to support "hotpluggable" PCS, then there's a
> simple solution to this - the driver model has the facility to suppress
> the bind/unbind driver files in sysfs, which means userspace can't
> unbind the device. If there's also no way to make the struct device go
> away in the kernel, then effectively the driver can then only be
> unbound if the driver is built as a module.
>
> At that point, we always have the option of insisting that PCS drivers
> are never modules - and then we have the situation where a PCS will
> never disappear from the system once the driver has picked up on it.

Well can't we increment the module refcount?

> Of course, those PCS that are tightly bound to their MACs can still
> come and go along with their MACs, but it's up to the MAC driver to
> make that happen safely (unregistering the netdev first, which will
> have the effect of calling ndo_close(), taking the network device
> down and preventing further access to the netdev - standard netdev
> MAC driver stuff.)

And I'm not too worried about that; there's no need to create a separate
device for the PCS if it's always present.

--Sean