Re: [RFC net-next PATCH 10/16] net: macb: Move PCS settings to PCS callbacks

From: Sean Anderson
Date: Tue Oct 05 2021 - 17:44:23 EST




On 10/5/21 2:53 PM, Russell King (Oracle) wrote:
On Tue, Oct 05, 2021 at 12:03:50PM -0400, Sean Anderson wrote:
Hi Russell,

On 10/5/21 6:06 AM, Russell King (Oracle) wrote:
> On Mon, Oct 04, 2021 at 03:15:21PM -0400, Sean Anderson wrote:
> > +static void macb_pcs_get_state(struct phylink_pcs *pcs,
> > +			       struct phylink_link_state *state)
> > +{
> > +	struct macb *bp = pcs_to_macb(pcs);
> > +
> > +	if (gem_readl(bp, NCFGR) & GEM_BIT(SGMIIEN))
> > +		state->interface = PHY_INTERFACE_MODE_SGMII;
> > +	else
> > +		state->interface = PHY_INTERFACE_MODE_1000BASEX;
>
> There is no requirement to set state->interface here. Phylink doesn't
> cater for interface changes when reading the state. As documented,
> phylink will set state->interface already before calling this function
> to indicate what interface mode it is currently expecting from the
> hardware.

Ok, so instead I should be doing something like

	if (gem_readl(bp, NCFGR) & GEM_BIT(SGMIIEN))
		interface = PHY_INTERFACE_MODE_SGMII;
	else
		interface = PHY_INTERFACE_MODE_1000BASEX;

	if (interface != state->interface) {
		state->link = 0;
		return;
	}

Why would it be different? If we've called the pcs_config method to
set the interface to one mode, why would it change?

config() does not always come before get_state due to (e.g.)
phylink_ethtool_ksettings_get. Though in that instance, state->interface
is not read. And of course this ordering isn't documented.

That being said, I will just do

	if (state->interface != PHY_INTERFACE_MODE_SGMII &&
	    state->interface != PHY_INTERFACE_MODE_1000BASEX) {
		state->link = 0;
		return;
	}

for next time.
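
To make the intent concrete, here is a rough standalone sketch of that
check. It is not the macb code: the enum and struct are cut-down stand-ins
for the kernel definitions, and sketch_pcs_mode_ok(), sketch_pcs_get_state()
and the hw_link parameter are made-up names for illustration. The point is
only that state->interface is treated as an input and never written.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel definitions so the sketch compiles on its own. */
typedef enum {
	PHY_INTERFACE_MODE_NA,
	PHY_INTERFACE_MODE_RGMII,
	PHY_INTERFACE_MODE_SGMII,
	PHY_INTERFACE_MODE_1000BASEX,
} phy_interface_t;

struct phylink_link_state {
	phy_interface_t interface;	/* filled in by the caller beforehand */
	bool link;
};

/* Hypothetical helper: the modes this PCS can report state for. */
static bool sketch_pcs_mode_ok(phy_interface_t interface)
{
	return interface == PHY_INTERFACE_MODE_SGMII ||
	       interface == PHY_INTERFACE_MODE_1000BASEX;
}

/* hw_link stands in for the link bit read from the PCS status register. */
static void sketch_pcs_get_state(struct phylink_link_state *state,
				 bool hw_link)
{
	/* state->interface is only read here, never modified. */
	if (!sketch_pcs_mode_ok(state->interface)) {
		state->link = false;
		return;
	}

	state->link = hw_link;
}

/* Usage: an unsupported mode forces link down, a supported one passes. */
static void sketch_pcs_get_state_demo(void)
{
	struct phylink_link_state state = {
		.interface = PHY_INTERFACE_MODE_RGMII,
		.link = true,
	};

	sketch_pcs_get_state(&state, true);
	assert(!state.link);
	assert(state.interface == PHY_INTERFACE_MODE_RGMII);

	state.interface = PHY_INTERFACE_MODE_1000BASEX;
	sketch_pcs_get_state(&state, true);
	assert(state.link);
}
```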

> There has been the suggestion that we should allow in-band AN to be
> disabled in 1000base-X if we're in in-band mode according to the
> ethtool state.

This logic is taken from phylink_mii_c22_pcs_config. Maybe I should add
another _encode variant? I hadn't done this here because the logic was
only one if statement.

> I have a patch that adds that.

Have you posted it?

I haven't - it is a patch from Robert Hancock, "net: phylink: Support
disabling autonegotiation for PCS". I've had it in my tree for a while,
but I do want to make some changes to it before re-posting.

(for those following along this is [1])

OK. I'll add an _encode variant for this function in the next revision then.

[1] https://lore.kernel.org/netdev/20210630174927.1077249-1-robert.hancock@xxxxxxxxxx/

> You can't actually abort at this point - phylink will print the error
> and carry on regardless. The checking is all done via the validate()
> callback and if that indicates the interface mode is acceptable, then
> it should be accepted.

Ok, so where can the PCS NAK an interface? This is the only callback
which has a return code, so I assumed this was the correct place to say
"no, we don't support this." This is what lynx_pcs_config does as well.

At the moment, the PCS doesn't get to NAK an inappropriate interface.
That's currently the job of the MAC's validate callback with the
assumption that the MAC knows what interfaces are supportable.

Which is a rather silly assumption because then you have to update the
MAC's validate function every time you add a new PCS. And this gets
messy rather fast. For example, you might want to connect your SFP
module to a MAC which only has an RGMII interface. So you put a DP83869
on your board connected like

MAC <--RGMII--> DP83869 <--SGMII--> SFP

For the moment, I think you have to just pretend that the DP83869 is the
only PHY in the system and hope that you don't need to talk to the SFP's
PHY. But if you want to use the DP83869 as a PCS, then you need to
update the MAC's validate() to allow SGMII, even though the MAC doesn't
support that without an external converter.

In an ideal world, the MAC would select its interface based on the PCS
(or lack of one), and the PCS would validate the interface mode. But of
course, there may be multiple PCSs available, so it is not so easy.

(Selecting between multiple PCSs (or no PCS at all) seems to be similar
in spirit to the PORT_XXX settings)

Trying to do it later, once the configuration has been worked out, can
_only_ lead to a failure of some kind - in many paths, there is no way
to report the problem except by printing a message into the kernel log.

For example, by the time we reach pcs_config(), we've already prepared
the MAC for a change to the interface, we've told the MAC to configure
for that interface. Now the PCS rejects it - we have no record of the
old configuration to restore. Even if we had a way to restore it, then
we could return an error to the user - but the user doesn't get to
control the interface themselves. If it was the result of a PHY changing
its interface, then what - we can only log an error to the kernel log.
If it's the result of a SFP being plugged in, we have no way to
renegotiate.

pcs_config() is too late to be making decisions about whether the
requested configuration is acceptable or not. It needs to be done as
part of the validation step.

Well, if these are the constraints, then IMO the PCS must have its own
validate() callback. Otherwise there is no way to tell a MAC that (for
example) supports both SGMII and 1000BASE-X that the PCS only supports
1000BASE-X. As another example, the MAC could support half duplex, but
the PCS might only support full duplex.
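
As a strawman, chaining a PCS validate step after the MAC's could look
something like the following. None of this is existing phylink API: the
sketch_* names, the two-entry interface enum and the link-mode bits are all
invented for the example. It only shows the MAC and PCS each masking out
what they cannot do, with either one able to reject the interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Invented stand-ins; not the kernel's interface or link-mode definitions. */
enum sketch_iface { SKETCH_SGMII, SKETCH_1000BASEX };

#define SKETCH_1000_FULL	(1u << 0)
#define SKETCH_1000_HALF	(1u << 1)

struct sketch_caps {
	unsigned int ifaces;	/* bit n set: enum sketch_iface n supported */
	unsigned int modes;	/* supported SKETCH_1000_* link modes */
};

/*
 * Hypothetical per-component validate step: clear every link mode this
 * component cannot do. Returns false if the interface is unsupported or
 * no link mode survives.
 */
static bool sketch_validate(const struct sketch_caps *caps,
			    enum sketch_iface iface, unsigned int *modes)
{
	if (!(caps->ifaces & (1u << iface))) {
		*modes = 0;
		return false;
	}

	*modes &= caps->modes;
	return *modes != 0;
}

/* Run the MAC's validate first, then the PCS's; both must accept. */
static bool sketch_validate_link(const struct sketch_caps *mac,
				 const struct sketch_caps *pcs,
				 enum sketch_iface iface, unsigned int *modes)
{
	return sketch_validate(mac, iface, modes) &&
	       sketch_validate(pcs, iface, modes);
}

/* Usage: the two examples from the paragraph above. */
static void sketch_validate_demo(void)
{
	const struct sketch_caps mac = {
		.ifaces = (1u << SKETCH_SGMII) | (1u << SKETCH_1000BASEX),
		.modes = SKETCH_1000_FULL | SKETCH_1000_HALF,
	};
	const struct sketch_caps pcs = {
		.ifaces = 1u << SKETCH_1000BASEX,
		.modes = SKETCH_1000_FULL,
	};
	unsigned int modes;

	/* The MAC accepts SGMII, but the PCS does not. */
	modes = SKETCH_1000_FULL | SKETCH_1000_HALF;
	assert(!sketch_validate_link(&mac, &pcs, SKETCH_SGMII, &modes));

	/* 1000BASE-X is fine, but half duplex is masked out by the PCS. */
	modes = SKETCH_1000_FULL | SKETCH_1000_HALF;
	assert(sketch_validate_link(&mac, &pcs, SKETCH_1000BASEX, &modes));
	assert(modes == SKETCH_1000_FULL);
}
```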

However, the validation step is not purely just validation, but it's
negotiation too for SFPs to be able to work out what interface mode
they should use in combination with the support that the MAC/PCS
offers.

I do feel that the implementation around the validation/selection of
interface for SFP etc is starting to creak, and I've some patches that
introduce a bitmap of interface types that are supported by the various
components. I haven't had the motivation to finish that off as my last
attempt at making a phylink API change was not pleasant in terms of
either help updating network drivers or getting patches tested. So I
now try to avoid phylink API changes at all cost.
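
For those following along, the bitmap idea might look roughly like the
sketch below. This is a guess at the shape, not the patches mentioned
above: the sk_* names are invented, and the preference order (1000BASE-X,
then SGMII, then RGMII) is an arbitrary assumption. Each component
advertises a bitmap of interface modes, and the selection is whatever is
left after intersecting them.

```c
#include <assert.h>
#include <stddef.h>

/* Invented mode numbering for the sketch; not the kernel's enum. */
enum sk_mode {
	SK_MODE_RGMII,
	SK_MODE_SGMII,
	SK_MODE_1000BASEX,
};

#define SK_BIT(mode)	(1ul << (mode))

/*
 * Intersect the MAC, PCS and SFP bitmaps and return the most preferred
 * common mode, or -1 if nothing is common. The preference order is an
 * assumption made for this example.
 */
static int sk_select_interface(unsigned long mac, unsigned long pcs,
			       unsigned long sfp)
{
	static const enum sk_mode preference[] = {
		SK_MODE_1000BASEX, SK_MODE_SGMII, SK_MODE_RGMII,
	};
	unsigned long common = mac & pcs & sfp;
	size_t i;

	for (i = 0; i < sizeof(preference) / sizeof(preference[0]); i++)
		if (common & SK_BIT(preference[i]))
			return preference[i];

	return -1;
}

/* Usage: a PCS limited to 1000BASE-X, then an RGMII-only MAC. */
static void sk_select_demo(void)
{
	unsigned long mac = SK_BIT(SK_MODE_SGMII) | SK_BIT(SK_MODE_1000BASEX);
	unsigned long pcs = SK_BIT(SK_MODE_1000BASEX);
	unsigned long sfp = SK_BIT(SK_MODE_SGMII) | SK_BIT(SK_MODE_1000BASEX);

	assert(sk_select_interface(mac, pcs, sfp) == SK_MODE_1000BASEX);

	/* An RGMII-only MAC has no mode in common with the module. */
	assert(sk_select_interface(SK_BIT(SK_MODE_RGMII),
				   SK_BIT(SK_MODE_RGMII), sfp) == -1);
}
```

Rejecting a combination at this stage happens before anything has been
programmed into the hardware, which is the property pcs_config() lacks.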

People have whinged that phylink's API changes too quickly... I'm
guessing we're now going to get other people arguing that it needs
change to fix issues like this...

I think it should change, and I can help test things out on my own
setup, for whatever that's worth.

At the very least, it should be clearer what things are allowed to fail
for what reasons. Several callbacks are void when things can fail under
the hood (e.g. link_up or an_restart). And the API seems to have been
primarily designed around PCSs which are tightly-coupled to their MACs.

--Sean