Re: [PATCH net v3 2/2] net: fec: Suspend the PHY on probe

From: Florian Fainelli
Date: Mon Mar 25 2024 - 17:27:22 EST

Next message: Chris Li: "Re: [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices"
Previous message: Ingo Molnar: "Re: [PATCH] x86/mm/ident_map: Use full gbpages in identity maps except on UV platform."
In reply to: John Ernberg: "Re: [PATCH net v3 2/2] net: fec: Suspend the PHY on probe"
Next in thread: John Ernberg: "Re: [PATCH net v3 2/2] net: fec: Suspend the PHY on probe"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 3/25/24 05:20, John Ernberg wrote:

Hi Florian,

On 3/21/24 17:13, Florian Fainelli wrote:

On 3/21/24 09:02, John Ernberg wrote:

Hi Russell,

On 3/20/24 20:44, Russell King (Oracle) wrote:

On Wed, Mar 20, 2024 at 10:13:55AM -0700, Florian Fainelli wrote:

On 3/20/2024 9:54 AM, Russell King (Oracle) wrote:

On Wed, Mar 20, 2024 at 03:25:54PM +0000, John Ernberg wrote:

Hi Russel,

Growl. Hi Peter.

What we really want is the PHY to be suspended on suspend to RAM
regardless of us having had an initial link up or not.

So what you're asking is for the PHY to be suspended when the system
is entering suspend, which is a long time after the system booted and
thus phy_probe() was called, and could be some time before the system
resumes.

I'm not sure what the relevance is of phy_probe() that was brought up
previously then.

This worked prior to 4c0d2e96ba05 ("net: phy: consider that
suspend2ram
may cut
off PHY power") which was added in Linux 5.11, and 557d5dc83f68
("net:
fec: use
mac-managed PHY PM") which was added in Linux 5.12.

Looking at the former commit, that looks to me like it is only
affecting the resume paths, not the suspend paths, so wouldn't have
any impact itself on what happens when suspend happens.

The latter commit states that it is a work around for an issue with a
particular PHY. What happens if you revert just this commit, does your
problem then go away?

Our PHY does not begin working again without reverting both.
phy_init_hw()
will remain an issue if it occurs after phy_start().

The commit message in 557d5dc83f68 is not explaining nearly enough, I
spent a
few days on it before I proved that commit to be nearly correct (See
whole
thread at [1]), it happened to just explode with that PHY. The issue is a
sequencing issue that was made more prominent by 4c0d2e96ba05, but it
existed
since around 2008. Because FEC is both MDIO controller and MAC,
meaning the
resume of the link in a link up case runs phy_start() in the FEC resume
function, which will trigger a mdio bus resume when it completes, in turn
calling phy_init_hw() (before 4c0d2e96ba05 it was phy_resume() which
wasn't a
problem but still wrong sequence wise).

Also, please clarify. It seems that you are reporting a regression -
it used to work for you prior to 557d5dc83f68, but 557d5dc83f68 stops
it working for you?

Since FEC requires mac_managed_pm the generic PM suspend-resume paths
are not
taken. The resume sequencing with generic PM has been broken with the
FEC since
generic PM of the mdio bus was added, as the FEC will do phy_start()
(via FEC
resume) and then generic PM runs phy_init_hw() via mdio bus resume
(previously:
less damaging phy_resume()) due to how the FEC IP block works.

That suggests that even with 557d5dc83f68 reverted, it's broken.
Digging into the history, what you're referring to dates from January
2016, so are you reporting a regression that occured 8 _years_ ago,
at which point I'd question why it's taken 8 years.

A revert of those is absolutely wrong. Those commits are fixing bigger
issues.

Given the time that has passed, I don't think reverting commits is
a sane approach. Quite what the right solution is though, I'm not
sure.

From the description and the commits pointed to, I just don't see
that there is anything that could've changed with respect to the first
boot - if that has changed, then I think more research into what
caused
it is needed.

If it's the subsequent state after a suspend-resume cycle, then yes,
I would agree that its possible that these changes broke this for you.
Would clearing ndev->phydev->mac_managed_pm just before
phy_disconnect() in fec_enet_close() fix it for you, so the suspend/
resume paths for the PHY get used when the network interface is down?

Maybe, however, that's something that should happen in any case inside
phylib on phy_disconnect() as a matter of course, since the PHY will
at that point be no longer under the control of the network driver for
PM purposes. Could you give this idea a try please?

On phy_disconnect() we will do a phy_detach() which calls
phy_suspend().
Given that phy_disconnect() is called from fec_enet_close(), we
still have a
MDIO bus registered and we are not trying to suspend the MDIO bus,
so we
should have an effective phy_suspend() call here, what am I missing?

I didn't look there, but if that is the case, then what is John's
problem - I can't figure it out, something isn't adding up here.

I could instead add extra phy_suspend() in the suspend path if the
link is
down and the FEC is up and running. I rejected it originally thinking
it was
a much dirtier fix, but maybe that is the more correct thing to do?

This does not seem like the proper solution, the only time where an
explicit phy_suspend() should be done in the Ethernet MAC's ->suspend()
routine is if the network device was brought up at the time
(netif_runninng() returns true) *and* you set mac_managed_pm = true
because you must precisely control the order in which the MAC and the
PHY get suspended with respect to each other (typically because the PHY
supplies a RX clock back to the MAC, and some of the MAC logic depends
upon it to operate properly, e.g.: perform a proper FIFO flush etc.).

I'm having some trouble understanding your message in context of my most
recent reply to Russell, so please bear with me here as I will
potentially ask a really dumb question:

Do I understand this correctly as what used to work in 5.10 was never
meant to work and the behavior now is the correct one in the FEC case?

I am not sure about that, because I do not believe this had been explicitly accounted for, and that is what I am also trying to figure out.

Meaning that if the link has never been up the PHY must never be handled
from a power management perspective?

Let's stray away from saying "link is UP" but instead say interface administratively brought up (ip link set dev <foo> up) so we are on the same page.

The only PHY examples I have come across (though not many in total) the
PHY has done some initial configuration of itself after POR or release
of the reset line.

From there, I see two distinct cases:

- the network device driver probed, but the network device was never
brought up in the first place in that case, I do not see a path whereby
the PHY would have been suspended, unless the boot firmware already took
care of that (which arguably it should if you are trying to be as power
efficient as possible), although arguably there could be a path within
the kernel where this is also done. It could get really complicated however

Generic PM via mdio_bus_phy_suspend() will suspend the PHY if it has a
.suspend callback and mac_managed_pm isn't set.

Correct.

mdio_bus_phy_may_suspend() will see that netdev is NULL, which means it
returns the inverse of phy->suspended (which is false), meaning the
function returns true. Thus phy_suspend() is called.

OK, so right there is where you were were basically depending upon mdio_bus_phy_may_suspend() being called to suspend your PHY device! I don't think this had ever been envisioned to be working that way, if it did that was a byproduct rather than a contract. Introducing mac_managed_pm certainly did break that contract because now we no longer have that "double" suspend and resume that used to happen.

- the network device driver probed, and the network device was brought
up at least once (regardless of whether a link state was detected or
not), such that the PHY has gone through a phy_start()/phy_stop() cycle,
and upon phy_stop() a phy_suspend() has been called

It is safe to assume you fall in the first case only, or do you also see
a problem in the second case as well?

There is only a problem in the first case. The second case is working as
expected.

Thanks for the clarification.
--
Florian

Next message: Chris Li: "Re: [PATCH] mm: zswap: fix data loss on SWP_SYNCHRONOUS_IO devices"
Previous message: Ingo Molnar: "Re: [PATCH] x86/mm/ident_map: Use full gbpages in identity maps except on UV platform."
In reply to: John Ernberg: "Re: [PATCH net v3 2/2] net: fec: Suspend the PHY on probe"
Next in thread: John Ernberg: "Re: [PATCH net v3 2/2] net: fec: Suspend the PHY on probe"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]