Re: [RFC PATCH net-next 1/3] net: phy: don't bind genphy in phy_attach_direct if the specific driver defers probe

From: Florian Fainelli
Date: Thu Sep 02 2021 - 16:34:05 EST




On 9/2/2021 12:51 PM, Andrew Lunn wrote:
On Thu, Sep 02, 2021 at 07:50:16PM +0100, Russell King (Oracle) wrote:
On Thu, Sep 02, 2021 at 01:50:51AM +0300, Vladimir Oltean wrote:
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 52310df121de..2c22a32f0a1c 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1386,8 +1386,16 @@ int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
/* Assume that if there is no driver, that it doesn't
* exist, and we should use the genphy driver.
+ * The exception is during probing, when the PHY driver might have
+ * attempted a probe but has requested deferral. Since there might be
+ * MAC drivers which also attach to the PHY during probe time, try
+ * harder to bind the specific PHY driver, and defer the MAC driver's
+ * probing until then.
*/
if (!d->driver) {
+ if (device_pending_probe(d))
+ return -EPROBE_DEFER;

Something else that concerns me here.

As noted, many network drivers attempt to attach their PHY when the
device is brought up, and not during their probe function.

Yes, this is going to be a problem. I agree it is too late to return
-EPROBE_DEFER. Maybe phy_attach_direct() needs to wait around, if the
device is still on the deferred list, otherwise use genphy. And maybe
a timeout and return -ENODEV, which is not 100% correct, we know the
device exists, we just cannot drive it.

Is it really going to be a problem though? The two cases where this will matter is if we use IP auto-configuration within the kernel, which this patchset ought to be helping with, if we are already in user-space and the PHY is connected at .ndo_open() time, there is a whole lot of things that did happen prior to getting there, such as udevd using modaliases in order to load every possible module we might, so I am debating whether we will really see a probe deferral at all.


Can we tell we are in the context of a driver probe? Or do we need to
add a parameter to the various phy_attach API calls to let the core
know if this is probe or open?

Actually we do the RTNL lock will be held during ndo_open and it won't during driver probe.


This is more likely to be a problem with NFS root, with the kernel
bringing up an interface as soon as its registered. userspace bringing
up interfaces is generally much later, and udev tends to wait around
until there are no more driver load requests before the boot
continues.

See my point above, with Vladimir's change, we should have fw_devlink do its job such that by the time the network interface is needed for IP auto-configuration, all of its depending resources should also be ready, would not they?
--
Florian