[RFC net] Revert "net: phy: Fix race condition on link status change"

From: Serge Semin
Date: Wed Aug 16 2023 - 14:10:39 EST


Protecting the phy_device.drv->handle_interrupt() callback invocation with
the phy_device.lock mutex makes any IRQ-capable PHY driver that calls
phy_error() from its handler take the mutex twice, deadlocking in the
following call chain:
IRQ: phy_interrupt()
     +-> mutex_lock(&phydev->lock); <-------------+
         drv->handle_interrupt()                  | Deadlock due to the
         +-> phy_error()                          + nested PHY-device
             +-> phy_process_error()              | mutex lock
                 +-> mutex_lock(&phydev->lock); <-+
                     phydev->state = PHY_ERROR;
                     mutex_unlock(&phydev->lock);
         mutex_unlock(&phydev->lock);

The problem can be easily reproduced by calling phy_error() from any
PHY-device interrupt handler. Reverting commit 91a7cda1f4b8 ("net: phy:
Fix race condition on link status change") fixes the deadlock.
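
For illustration only, the nested locking in the call chain above can be
modelled in user space. This is a rough sketch, not kernel code: the
model_* names and struct model_phy are hypothetical stand-ins, and a POSIX
error-checking mutex is assumed so that the second lock attempt returns
EDEADLK instead of blocking forever as the kernel mutex would.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for struct phy_device (illustration only). */
struct model_phy {
	pthread_mutex_t lock;
	int state;		/* 0 = running, 1 = error */
	int nested_lock_rc;	/* result of the inner lock attempt */
};

/* Models phy_error() -> phy_process_error(): takes the device lock itself. */
static void model_phy_error(struct model_phy *phy)
{
	phy->nested_lock_rc = pthread_mutex_lock(&phy->lock);
	if (phy->nested_lock_rc != 0)
		return;		/* a kernel mutex would block here instead */
	phy->state = 1;
	pthread_mutex_unlock(&phy->lock);
}

/* Models a driver's handle_interrupt() callback that reports an error. */
static void model_handle_interrupt(struct model_phy *phy)
{
	model_phy_error(phy);
}

/*
 * Models phy_interrupt(). A non-zero lock_in_core mimics the blamed commit
 * (lock held around the callback); zero mimics the reverted code.
 * Returns the result of the inner lock attempt and stores the final state.
 */
static int model_phy_interrupt(int lock_in_core, int *state)
{
	struct model_phy phy = { .state = 0, .nested_lock_rc = 0 };
	pthread_mutexattr_t attr;
	int rc;

	rc = pthread_mutexattr_init(&attr);
	assert(rc == 0);
	/* Error-checking type: relocking by the owner fails with EDEADLK. */
	rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
	assert(rc == 0);
	rc = pthread_mutex_init(&phy.lock, &attr);
	assert(rc == 0);
	pthread_mutexattr_destroy(&attr);

	if (lock_in_core)
		pthread_mutex_lock(&phy.lock);
	model_handle_interrupt(&phy);
	if (lock_in_core)
		pthread_mutex_unlock(&phy.lock);

	pthread_mutex_destroy(&phy.lock);
	*state = phy.state;
	return phy.nested_lock_rc;
}
```

In this model, model_phy_interrupt(1, &st) reports EDEADLK and leaves the
state untouched, while model_phy_interrupt(0, &st) succeeds and records
the error state, which corresponds to the behaviour after the revert.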

This reverts commit 91a7cda1f4b8bdf770000a3b60640576dafe0cec.

Fixes: 91a7cda1f4b8 ("net: phy: Fix race condition on link status change")
Signed-off-by: Serge Semin <fancer.lancer@xxxxxxxxx>

---

Since it would obviously be better to fix both the deadlock and the
problem described in the blamed commit, the patch is marked as RFC. I am
not aware of a better solution for now than reverting the commit that
caused the regression. So let's discuss whether a better fix is possible
here.

---
 drivers/net/phy/phy.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index bdf00b2b2c1d..9483bd57158e 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -1235,7 +1235,6 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
 {
 	struct phy_device *phydev = phy_dat;
 	struct phy_driver *drv = phydev->drv;
-	irqreturn_t ret;
 
 	/* Wakeup interrupts may occur during a system sleep transition.
 	 * Postpone handling until the PHY has resumed.
@@ -1259,11 +1258,7 @@ static irqreturn_t phy_interrupt(int irq, void *phy_dat)
 		return IRQ_HANDLED;
 	}
 
-	mutex_lock(&phydev->lock);
-	ret = drv->handle_interrupt(phydev);
-	mutex_unlock(&phydev->lock);
-
-	return ret;
+	return drv->handle_interrupt(phydev);
 }
 
 /**
--
2.41.0