Re: [PATCH] net: phy: fix PHY possibly unwork after MDIO bus resume back

From: Florian Fainelli
Date: Tue Apr 06 2021 - 14:32:28 EST




On 4/6/2021 4:42 AM, Heiner Kallweit wrote:
>
> Waiting for ANEG_COMPLETE to be set wouldn't be a good option. Aneg may never
> complete for different reasons, e.g. no physical link. And even if we use a
> timeout this may add unwanted delays.
>
>> Do you have any other insights that can help me further locate the issue? Thanks.
>>
>
> I think current MAC/PHY PM handling isn't perfect. Often we have the following
> scenario:
>
> *suspend*
> 1. PHY is suspended (mdio_bus_phy_suspend)
> 2. MAC suspend callback (typically involving phy_stop())
>
> *resume*
> 1. MAC resume callback (typically involving phy_start())
> 2. PHY is resumed (mdio_bus_phy_resume), incl. calling phy_init_hw()
>
> Calling phy_init_hw() after phy_start() doesn't look right.
> It seems to work in most cases, but there's a certain risk
> that phy_init_hw() overwrites something, e.g. the advertised
> modes.
> I think we have two valid scenarios:
>
> 1. phylib PM callbacks are used, then the MAC driver shouldn't
> touch the PHY in its PM callbacks, especially not call
> phy_stop/phy_start.
>
> 2. MAC PM callbacks take care also of the PHY. Then I think we would
> need a flag at the phy_device telling it to make the PHY PM
> callbacks a no-op.

Maybe part of the problem is that the FEC is calling phy_{stop,start} in
its suspend/resume callbacks instead of phy_{suspend,resume} which would
play nice and tell the MDIO bus PM callbacks that the PHY has already
been suspended.

I am also suspicious about whether Wake-on-LAN actually works with the
FEC, you cannot wake from LAN if the PHY is stopped and powered down.
--
Florian