Re: [PATCH net 1/1] net: stmmac: fix MAC and phylink mismatch issue after resume with STMMAC_FLAG_USE_PHY_WOL enabled

From: Russell King (Oracle)
Date: Thu Nov 09 2023 - 04:46:43 EST


On Thu, Nov 09, 2023 at 09:15:36AM +0000, Russell King (Oracle) wrote:
> On Thu, Nov 09, 2023 at 01:00:27PM +0800, Gan Yi Fang wrote:
> > From: "Gan, Yi Fang" <yi.fang.gan@xxxxxxxxx>
> >
> > The issue happened when flag STMMAC_FLAG_USE_PHY_WOL is enabled.
> > It can be reproduced with steps below:
> > 1. Advertise only one speed on the host
> > 2. Enable the WoL on the host
> > 3. Suspend the host
> > 4. Wake up the host
> >
> > When the WoL is disabled, both the PHY and MAC will suspend and wake up
> > with everything configured well. When WoL is enabled, the PHY needs to be
> > stay awake to receive the signal from remote client but MAC will enter
> > suspend mode.
> >
> > When the MAC resumes from suspend, phylink_resume() will call
> > phylink_start() to start the phylink instance which will trigger the
> > phylink machine to invoke the mac_link_up callback function. The
> > stmmac_mac_link_up() will configure the MAC_CTRL_REG based on the current
> > link state. Then the stmmac_hw_setup() will be called to configure the MAC.
> >
> > This sequence might cause mismatch of the link state between MAC and
> > phylink. This patch moves the phylink_resume() after stmamc_hw_setup() to
> > ensure the MAC is initialized before phylink is being configured.
>
> Isn't this going to cause problems?
>
> stmamc_hw_setup() calls stmmac_init_dma_engine(), which then calls
> stmmac_reset() - and stmmac_reset() can fail if the PHY clock isn't
> running, which is why phylink_resume() gets called before this.

I think these two commits should be reviewed to understand why the code
is the way it is, and why changing it may cause regressions:

90702dcd19c0 ("net: stmmac: fix MAC not working when system resume back
with WoL active")

36d18b5664ef ("net: stmmac: start phylink instance before
stmmac_hw_setup()")

As part of my work on stmmac that got junked, I was looking at a
solution to the "we need the PHY clock to be running for the MAC to
work for things like reset" problem - but those patches got thrown
away when stmmac folk were very nitpicky over %u vs %d in format
strings to print what was a _signed_ value that stmmac code stupidly
converts to an unsigned integer... it's still a signed integer no
matter if code decides to use "unsigned int". I suspect all those
patches (and there was a considerable number of them) have now been
expired from git, so are now totally lost, and honestly I have no
desire to put further work into stmmac stuff.

--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!