Re: [PATCH] net: phy: adin1100: Fix nullptr exception for phy interrupts

From: Andrew Lunn
Date: Thu Jan 18 2024 - 15:09:44 EST


On Thu, Jan 18, 2024 at 06:36:16PM +0100, Heiner Kallweit wrote:
> On 18.01.2024 17:53, Andrew Lunn wrote:
> > On Thu, Jan 18, 2024 at 11:43:41AM +0100, Andre Werner wrote:
> >> If using ADIN1100 as an external phy, e.g. in combination with
> >> "smsc95xx", we ran into nullptr exception by creating a link.
> >>
> >> In our case the "smsc95xx" does not check for an available interrupt handler
> >> on external phy driver to use poll instead of interrupts if no handler is
> >> available. So we decide to implement a small handler in the phy driver
> >> to support other MACs as well.
> >>
> >> I update the driver to add an interrupt handler because libphy
> >> does not check if their is a interrupt handler available either.
> >>
> >> There are several interrupts maskable at the phy, but only link change interrupts
> >> are handled by the driver yet.
> >>
> >> We tested the combination "smsc95xx" and "adin1100" with Linux Kernel 6.6.9
> >> and Linux Kernel 6.1.0, respectively.
> >
> > Hi Andre
> >
> > A few different things....
> >
> > Please could you give more details of the null pointer
> > exception. phylib should test if the needed methods have been
> > implemented in the PHY driver, and not tried to use interrupts when
> > they are missing. It should of polled the PHY. So i would like to
> > understand what went wrong. Maybe we have a phylib core bug we should
> > be fixing. Or a bug in the smsc95xx driver.
> >
> Seems to be a bug in smsc95xx. The following is the relevant code piece.
>
> ret = mdiobus_register(pdata->mdiobus);
> if (ret) {
> netdev_err(dev->net, "Could not register MDIO bus\n");
> goto free_mdio;
> }
>
> pdata->phydev = phy_find_first(pdata->mdiobus);
> if (!pdata->phydev) {
> netdev_err(dev->net, "no PHY found\n");
> ret = -ENODEV;
> goto unregister_mdio;
> }
>
> pdata->phydev->irq = phy_irq;
>
> The interrupt is set too late, after phy_probe(), where we have this:
>
> if (!phy_drv_supports_irq(phydrv) && phy_interrupt_is_valid(phydev))
> phydev->irq = PHY_POLL;
>
> So we would have two steps:
> 1. Fix the smsc95xx bug (as the same issue could occur with another PHY type)

Its not so nice to fix.

Normally you would do something like:

pdata->mdiobus->priv = dev;
pdata->mdiobus->read = smsc95xx_mdiobus_read;
pdata->mdiobus->write = smsc95xx_mdiobus_write;
pdata->mdiobus->reset = smsc95xx_mdiobus_reset;
pdata->mdiobus->name = "smsc95xx-mdiobus";
pdata->mdiobus->parent = &dev->udev->dev;

snprintf(pdata->mdiobus->id, ARRAY_SIZE(pdata->mdiobus->id),
"usb-%03d:%03d", dev->udev->bus->busnum, dev->udev->devnum);

pdata->mdiobus->irq[X] = phy_irq;

ret = mdiobus_register(pdata->mdiobus);

By setting pdata->mdiobus->irq[X] before registering the PHY, the irq
number gets passed to the phydev->irq very early on. If everything is
O.K, interrupts are then used.

However, because of the use of phy_find_first(), we have no idea what
address on the bus the PHY is using. So we don't know which member of
irq[] to set. Its not ideal, but one solution is to set them all.

However, a better solution is to perform the validation again in
phy_attach_direct(). Add a second:

if (!phy_drv_supports_irq(phydrv) && phy_interrupt_is_valid(phydev))
phydev->irq = PHY_POLL;

which will force phydev->irq back to polling.

Andrew