Re: [PATCH 2/2] net: dsa: microchip: Provide Module 4 KSZ9477 errata (DS80000754C)

From: Oleksij Rempel
Date: Tue Aug 29 2023 - 07:48:27 EST


Hi Lukasz,

On Tue, Aug 29, 2023 at 01:24:29PM +0200, Lukasz Majewski wrote:
> Hi Vladimir,
>
> > Hi Lukasz,
> >
> > On Tue, Aug 29, 2023 at 10:35:33AM +0200, Lukasz Majewski wrote:
> > > Hi Vladimir,
> > >
> > > > On Fri, Aug 25, 2023 at 06:48:41PM +0000,
> > > > Tristram.Ha@xxxxxxxxxxxxx wrote:
> > > > > > > IMHO adding functions to MMD modification would facilitate
> > > > > > > further development (for example LED setup).
> > > > > >
> > > > > > We already have some KSZ9477 specific initialization done in
> > > > > > the Micrel PHY driver under drivers/net/phy/micrel.c, can we
> > > > > > converge on the PHY driver which has a reasonable amount of
> > > > > > infrastructure for dealing with workarounds, indirect or
> > > > > > direct MMD accesses etc.?
> > > > >
> > > > > Actually the internal PHY used in the KSZ9897/KSZ9477/KSZ9893
> > > > > switches are special and only used inside those switches.
> > > > > Putting all the switch related code in Micrel PHY driver does
> > > > > not really help. When the switch is reset all those PHY
> > > > > registers need to be set again, but the PHY driver only
> > > > > executes those code during PHY initialization. I do not know
> > > > > if there is a good way to tell the PHY to re-initialize again.
> > > > >
> > > >
> > > > Suppose there was a method to tell the PHY driver to re-initialize
> > > > itself. What would be the key points in which the DSA switch
> > > > driver would need to trigger that method? Where is the switch
> > > > reset at runtime?
> > >
> > > Tristam has explained why adding the internal switch PHY errata to
> > > generic PHY code is not optimal.
> >
> > Yes, and I didn't understand that explanation, so I asked a
> > clarification question.
>
> Ok. Let's wait for Tristram's answer.
>
> >
> > > If adding MMD generic code is a problem - then I'm fine with just
> > > clearing proper bits with just two indirect writes in the
> > > drivers/net/dsa/microchip/ksz9477.c
> > >
> > > I would also prefer to keep the separate ksz9477_errata() function,
> > > so we could add other errata code there.
> > >
> > > Just informative - without this patch the KSZ9477-EVB board's
> > > network is useless when the other peer has EEE enabled by default
> > > (like almost all non managed ETH switches).
> >
> > No, adding direct PHY MMD access code to the ksz9477 switch driver is
> > not even the biggest problem - even though, IIUC, the "workaround" to
> > disable EEE advertisement could be moved to ksz9477_get_features() in
> > drivers/net/phy/micrel.c, where phydev->supported_eee could be
> > cleared.
>
> To be even more interesting (after looking into the PHY micrel.c code):
> https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/micrel.c#L1804
>
> The errata from this patch is already present.
>
> The issue is that ksz9477_config_init() (drivers/net/phy/micrel.c) is
> executed AFTER generic phy_probe():
> https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy_device.c#L3256
> in which the EEE advertisement registers are read.
>
> Hence, those registers needs to be cleared earlier - as I do in
> ksz9477_setup() in drivers/net/dsa/microchip/ksz9477.
>
> Here the precedence matters ...
> >
> > The biggest problem that I see is that Oleksij Rempel has "just" added
> > EEE support to the KSZ9477 earlier this year, with an ack from Arun
> > Ramadoss: 69d3b36ca045 ("net: dsa: microchip: enable EEE support").
> > I'm not understanding why the erratum wasn't a discussion topic then.
>
> +1

As the erratum states: "this feature _can_ cause link drops".
For example, I did see an EEE-related issue between this switch and a
link partner with an AR8035 PHY. The following patch addresses that
issue:
https://lore.kernel.org/all/20230327142202.3754446-8-o.rempel@xxxxxxxxxxxxxx/
So, in this case the KSZ9477 was not the side at fault.

Since the erratum does not describe the exact cause of the issue or the
specific link partners with which EEE does not work, I would prefer to
give the user the freedom of choice.

We have the same issue with Pause Frame support. It is not always a good
choice, but the user is free to configure it.

Today I want to create a test setup with different EEE-capable link
partners on one side and the KSZ9477 on the other, and let it run for
some days, just to make sure.

Besides, are you able to reproduce this issue?

Regards,
Oleksij
--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |