Re: [PATCH 2/2] net: phy: Provide Module 4 KSZ9477 errata (DS80000754C)

From: Lukasz Majewski
Date: Wed Aug 30 2023 - 15:29:37 EST


Hi Oleksij,

> On Wed, Aug 30, 2023 at 02:30:55PM +0100, Russell King (Oracle) wrote:
> > On Wed, Aug 30, 2023 at 03:06:49PM +0200, Oleksij Rempel wrote:
> > > On Wed, Aug 30, 2023 at 01:35:18PM +0100, Russell King (Oracle)
> > > wrote:
> > > > On Wed, Aug 30, 2023 at 02:17:38PM +0200, Oleksij Rempel wrote:
> > > >
> > > > > On Wed, Aug 30, 2023 at 01:51:51PM +0200, Lukasz Majewski
> > > > > wrote:
> > > > > > Hi Oleksij,
> > > > >
> > > > > > > It looks like the best solution would be the one
> > > > > > > proposed by Tristram:
> > > > > > https://www.spinics.net/lists/netdev/msg932044.html
> > > > >
> > > > > > In this case, please explain why it would work on this HW
> > > > > > and would not be broken by any changes in PHYlib or the
> > > > > > micrel.c driver.
> > > > >
> > > > > > If I remember correctly, in KSZ9477 variants, writing to the
> > > > > > EEE advertisement register affects the state of the EEE
> > > > > > capability register. This breaks the IEEE 802.3 specification
> > > > > > and is the reason why ksz9477_get_features() actually exists.
> > > > > > But it can be used as a workaround if it is written early
> > > > > > enough, before PHYlib tries to read the EEE capability register.
> > > > >
> > > > > Please confirm my assumption by applying your workaround and
> > > > > testing it with ethtool --show-eee lanX.
> > > > >
> > > > > > It should be commented in the code with all kinds of warnings:
> > > > > > Don't move!!! We use one bug to work around another bug!!! If
> > > > > > PHYlib starts scanning PHYs before this code is executed, then
> > > > > > things may break!!
> > > >
> > > > Why would phylib's scanning cause breakage?
> > > >
> > > > phylib's scanning for PHYs is about reading the ID registers
> > > > etc. It doesn't do anything until the PHY has been found, and
> > > > then the first thing that happens when the phy_device structure
> > > > is created is an appropriate driver is located, and the
> > > > driver's ->probe function is called.
> > > >
> > > > If that is successful, then the features are read. If the PHY
> > > > driver's ->features member is set, then that initialises the
> > > > "supported" mask and we read the EEE abilities.
> > > >
> > > > If ->features is not set, then we look to see whether the driver
> > > > provides a ->get_features method, and call that.
> > > >
> > > > Otherwise we use the generic genphy_c45_pma_read_abilities() or
> > > > genphy_read_abilities() depending whether the PHY's is_c45 is
> > > > set or not.
> > > >
> > > > So, if you want to do something very early before features are
> > > > read, then either don't set .features, and do it early in
> > > > .get_features before calling anything else, or do it in the
> > > > ->probe function.
> > >
> > > Let me summarize my view on the problem, so maybe you can
> > > suggest a better way to solve it.
> > > - KSZ9477, KSZ8565, KSZ9893 and KSZ9563 seem to have different
> > > quirks behind the same PHY ID. The micrel.c driver does not know
> > > which exact HW is actually in use.
> > > - A set of PHY workarounds was moved from dsa/microchip/ksz9477.c
> > > to micrel.c. One of these workarounds was clearing the EEE
> > > advertisement register, which by accident also cleared the EEE
> > > capability register. Since EEE cap was cleared by the
> > > dsa/microchip/ksz9477.c code before micrel.c was probed, PHYlib
> > > assumed that this PHY does not support EEE and did not try to
> > > use it. After moving this code to micrel.c, it now tries to
> > > change the EEE advertisement state without letting PHYlib know
> > > about it, and PHYlib re-enables it, as actually expected.
> > > - So far, only KSZ9477 seems to be broken beyond repair, so it is
> > > better to disable EEE without offering it as a choice for user
> > > configuration.
> >
> > We do have support in phylib for "broken EEE modes", which DT could
> > set for the broken PHYs, as it is possible to describe the DSA
> > PHYs in DT. This sets phydev->eee_broken_modes.
> >
> > phydev->eee_broken_modes gets looked at when genphy_config_aneg() or
> > genphy_c45_an_config_aneg() gets called - which will happen when the
> > PHY is being "started".
> >
> > So, you could add the DT properties as appropriate to disable all
> > the EEE modes.
> >
> > Alternatively, in your .config_init function, you could detect your
> > flag and force eee_broken_modes to all-ones.
>
> @Lukasz,
>
> can you please try setting eee_broken_modes to all-ones? Something
> like this:
> ksz9477_config_init()
> ...
> 	...quirks...
>
> 	if (phydev->dev_flags & .. NO_EEE...)
> 		phydev->eee_broken_modes = -1;
>
> 	err = genphy_restart_aneg(phydev);
> 	...
>

The implementation as you suggested seems to work :-)

ksz_get_phy_flags(), where MICREL_NO_EEE is set, is executed before
ksz9477_config_init().

The eee_broken_modes value is then taken into account.
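
To make the ordering concrete, here is a minimal user-space sketch of the
idea, not the actual patch. The mock struct, the helper names and the flag
value are assumptions made for illustration only; the real code would use
struct phy_device and the flag definition from the kernel headers.

```c
/*
 * User-space mock, NOT kernel code. The struct, the helper names and the
 * flag value below are illustrative assumptions.
 */
#include <assert.h>
#include <stdint.h>

/* Assumed bit position; the real definition lives in the kernel headers. */
#define MICREL_NO_EEE (1u << 2)

/* Minimal stand-in for the two struct phy_device fields discussed here. */
struct mock_phy_device {
	uint32_t dev_flags;        /* filled in by the DSA driver before config_init */
	uint32_t eee_broken_modes; /* bitmask of EEE modes that must not be advertised */
};

/* Hypothetical helper mirroring the check suggested for ksz9477_config_init(). */
static void mock_apply_no_eee_quirk(struct mock_phy_device *phydev)
{
	if (phydev->dev_flags & MICREL_NO_EEE)
		phydev->eee_broken_modes = UINT32_MAX; /* all-ones: every EEE mode is broken */
}

/* EEE modes left to advertise once the broken ones are masked out. */
static uint32_t mock_eee_advertise(const struct mock_phy_device *phydev,
				   uint32_t supported)
{
	return supported & ~phydev->eee_broken_modes;
}
```

With the flag set, the mask removes every supported mode, so nothing is
left to advertise. Without the flag, the supported modes pass through
unchanged.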

# ethtool --show-eee lan1
EEE Settings for lan1:
	EEE status: disabled
	Tx LPI: 0 (us)
	Supported EEE link modes:  100baseT/Full
	                           1000baseT/Full
	Advertised EEE link modes:  Not reported
	Link partner advertised EEE link modes:  Not reported

I will prepare a proper patch tomorrow.

> @Russell, thx!
>
> Regards,
> Oleksij




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@xxxxxxx
