Re: [PATCH 2/2] net: dsa: microchip: Provide Module 4 KSZ9477 errata (DS80000754C)

From: Lukasz Majewski
Date: Tue Aug 29 2023 - 11:30:04 EST


Hi Oleksij,

> On Tue, Aug 29, 2023 at 02:38:29PM +0200, Lukasz Majewski wrote:
> > Hi Oleksij,
>
> ...
>
> > Hence, I would prefer to apply the Errata and then somebody, who
> > would like to enable EEE can try if it works for him.
>
> ok.
>
> > IMHO, code to fix erratas shall be added unconditionally, without
> > any "freedom of choic
>
> This claim is not consistent with the patch. To make it without
> ability to enable EEE, you will need to clear all eee_supported bits.
> If this HW is really so broken, then it is the we how it should be
> fixed.
>
> > > Beside, are you able to reproduce this issue?
> >
> > Yes, I can reproduce the issue. I do use two Microchip's development
> > boards (KSZ9477-EVB [1]) connected together to test HSR as well as
> > communication with HOST PC.
>
> I use KSZ9477-EVB as well.
>
> > The network on this board without this patch is not usable
> > (continually I do encounter link up/downs).
>
> My test setup runs currently about two hours. It had 4 link drops on
> LAN3 and none on other ports. Swapping cables connected to LAN2 and
> LAN3 still let the LAN3 sometimes drop the connection. So far, for
> example LAN2 works stable and this is probably the reason why I have
> not seen this issue before. After disabling EEE on LAN3 I start
> getting drops on LAN2.
>

Ok.

> > Please be also aware, that this errata fix is (implicitly I think)
> > already present in the kernel:
> > https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/micrel.c#L1804
> >
> > However, the execution order of PHY/DSA functions with newest
> > mainline makes it not working any more (I've described it in
> > details in the earlier mail to Vladimir).
>
> Ok, since it was already not advertised by default, I have nothing
> against having default policy to not advertise EEE for this switch.
>

Ok.

> On other hand, since this functionality is not listed as supported by
> the KSZ9477 datasheet (No word about IEEE 802.3az Energy Efficient
> Ethernet (EEE)) compared to KSZ8565R datasheet (where EEE support is
> listed) and it is confirmed to work not stable enough, then it should
> be disabled properly.

I've described this problem in more details here:
https://lore.kernel.org/lkml/20230829132429.529283be@wsk/

-------->8---------
The issue is that ksz9477_config_init() (drivers/net/phy/micrel.c) is
executed AFTER generic phy_probe():
https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy_device.c#L3256
in which the EEE advertisement registers are read.

Hence, those registers needs to be cleared earlier - as I do in
ksz9477_setup() in drivers/net/dsa/microchip/ksz9477.

Here the precedence matters ...
----------8<-------------

> The phydev->supported_eee should be cleared.
> See ksz9477_get_features().
>

Removing the linkmod_and() from ksz9477_get_features():
https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/micrel.c#L1408

doesn't help.

It looks like it is done too late (please read the above e-mail). We
would need to disable the eee support at all for this switch IC or
apply the original version of this patch (I mean clear in-KSZ9477 EEE
advertisement register early).

>
> Regards,
> Oleksij




Best regards,

Lukasz Majewski

--

DENX Software Engineering GmbH, Managing Director: Erika Unter
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-59 Fax: (+49)-8142-66989-80 Email: lukma@xxxxxxx

Attachment: pgp0fRQ2edd_B.pgp
Description: OpenPGP digital signature