Re: [PATCH net-next v5 1/8] net: phylink: Document MAC_(A)SYM_PAUSE

From: Sean Anderson
Date: Wed Sep 07 2022 - 18:39:54 EST


On 9/7/22 17:01, Russell King (Oracle) wrote:
On Wed, Sep 07, 2022 at 04:11:14PM -0400, Sean Anderson wrote:
On 9/7/22 2:04 PM, Russell King (Oracle) wrote:
MAC_SYM_PAUSE and MAC_ASYM_PAUSE are used when configuring our autonegotiation
advertisement. They correspond to the PAUSE and ASM_DIR bits defined by 802.3,
respectively.

My intention here is to clarify the relationship between the terminology. Your
proposed modification has "our autonegotiation advertisement" apply to PAUSE/ASM_DIR
instead of MAC_*_PAUSE, which is also confusing, since those bits can apply to either
party's advertisement.

Please amend to make it clearer.

Does what I proposed work?

+ * MAC_SYM_PAUSE MAC_ASYM_PAUSE Valid pause modes
+ * ============= ============== ==============================
+ * 0 0 MLO_PAUSE_NONE
+ * 0 1 MLO_PAUSE_NONE, MLO_PAUSE_TX
+ * 1 0 MLO_PAUSE_NONE, MLO_PAUSE_TXRX
+ * 1 1 MLO_PAUSE_NONE, MLO_PAUSE_TXRX,
+ * MLO_PAUSE_RX

Any of none, tx, txrx and rx can occur with both bits set in the last
case, the tx-only case will be due to user configuration.

What flow did you have in mind? According to the comment on linkmode_set_pause,
if ethtool requests tx-only, we will use MAC_ASYM_PAUSE for the advertising,
which is the second row above.

I think you're missing some points on the ethtool interface. Let me
go through it:

First, let's go through the man page:

autoneg on|off
Specifies whether pause autonegotiation should be enabled.

rx on|off
Specifies whether RX pause should be enabled.

tx on|off
Specifies whether TX pause should be enabled.

This is way too vague and doesn't convey very much inforamtion about
the function of these options. One can rightfully claim that it is
actually wrong and misleading, especially the first option, because
there is no way to control whether "pause autonegotiation should be
enabled." Either autonegotiation with the link partner is enabled
or it isn't.
Thankfully, the documentation of the field in struct
ethtool_pauseparam documents this more fully:

* If @autoneg is non-zero, the MAC is configured to send and/or
* receive pause frames according to the result of autonegotiation.
* Otherwise, it is configured directly based on the @rx_pause and
* @tx_pause flags.

So, autoneg controls whether the result of autonegotiation is used, or
we override the result of autonegotiation with the specified transmit
and receive settings.

The next issue with the man page is that it doesn't specify that tx
and rx control the advertisement of pause modes - and it doesn't
specify how. Again, the documentation of struct ethtool_pauseparam
helps somewhat:

* If the link is autonegotiated, drivers should use
* mii_advertise_flowctrl() or similar code to set the advertised
* pause frame capabilities based on the @rx_pause and @tx_pause flags,
* even if @autoneg is zero. They should also allow the advertised
* pause frame capabilities to be controlled directly through the
* advertising field of &struct ethtool_cmd.

So:

1. in the case of autoneg=0:
1a. local end's enablement of tx and rx pause frames depends solely
on the capabilities of the network adapter and the tx and rx
parameters, ignoring the results of any autonegotiation
resolution.
1b. the behaviour in mii_advertise_flowctrl() or similar code shall
be used to derive the advertisement, which results in the
tx=1 rx=0 case advertising ASYM_DIR only which does not tie up
with what we actually end up configuring on the local end.

2. in the case of autoneg=1, the tx and rx parameters are used to
derive the advertisement as in 1b and the results of
autonegotiation resolution are used.

The full behaviour of mii_advertise_flowctrl() is:

ethtool local advertisement possible autoneg resolutions
rx tx Pause AsymDir
0 0 0 0 !tx !rx
1 0 1 1 !tx !rx, !tx rx, tx rx
0 1 0 1 !tx !rx, tx !rx
1 1 1 0 !tx !rx, tx rx

but as I say, the "possible autoneg resolutions" and table 28B-3
is utterly meaningless when ethtool specifies "autoneg off" for
the pause settings.

So, "ethtool -A autoneg off tx on rx off" will result in an
advertisement with PAUSE=0 ASYM_DIR=1 and we force the local side
to enable transmit pause and disabel receive pause no matter what
the remote side's advertisement is.

I hope this clears the point up.

My intent here is to provide some help for driver authors when they
need to fill in their mac capabilities. The driver author probably
knows things like "My device supports MLO_PAUSE_TX and MLO_PAUSE_TXRX
but not MLO_PAUSE_RX." They have to translate that into the correct
values for MAC_*_PAUSE. When the user starts messing with this process,
it's no longer the driver author's problem whether the result is sane
or not.

Given that going from tx/rx back to pause/asym_dir bits is not trivial
(because the translation depends on the remote advertisement) it is
highly unlikely that the description would frame the support in terms
of whether the hardware can transmit and/or receive pause frames.

I think it is? Usually if both symmetric and asymmetric pause is
possible then there are two PAUSE_TX and PAUSE_RX fields in a register
somewhere. Similarly, if there is only symmetric pause, then there is a
PAUSE_EN bit in a register. And if only one of TX and RX is possible,
then there will only be one field. There are a few drivers where you
program the advertisement and let the hardware do the rest, but even
then there's usually a manual mode (which should be enabled by the
poorly-documented permit_pause_to_mac parameter).

However, it is not obvious (at least it wasn't to me)

- That MAC_SYM_PAUSE/MAC_ASYM_PAUSE control to the PAUSE and ASYM_DIR
bits (when MLO_PAUSE_AN is set).
- How MAC_*_PAUSE related to the resolved pause mode in mac_link_up.

Note from the table above, it is not possible to advertise that you
do not support transmission of pause frames.

Just don't set either of MAC_*_PAUSE :)

Of course, hardware manufacturers are hopefully aware that only half of
the possible combinations are supported and don't produce hardware with
capabilities that can't be advertised.


How about

The following table lists the values of tx_pause and rx_pause which
might be requested in mac_link_up depending on the results of> autonegotiation (when MLO_PAUSE_AN is set):>
MAC_SYM_PAUSE MAC_ASYM_PAUSE tx_pause rx_pause
============= ============== ======== ========
0 0 0 0
0 1 0 0> 1 0
1 0 0 0
1 1> 1 1 0 0
0 1
1 1

When MLO_PAUSE_AN is not set, any combination of tx_pause and> rx_pause may be requested. This depends on user configuration,
without regard to the values of MAC_SYM_PAUSE and MAC_ASYM_PAUSE.

The above is how I'm viewing this, and because of the broken formatting,
it's impossible to make sense of, sorry.

Sorry, my mail client mangled it. Second attempt:

MAC_SYM_PAUSE MAC_ASYM_PAUSE tx_pause rx_pause
============= ============== ======== ========
0 0 0 0
0 1 0 0
1 0
1 0 0 0
1 1
1 1 0 0
0 1
1 1

Perhaps there should be a note either here or in mac_link_up documenting
what to do if e.g. the user requests just MLO_PAUSE_TX but only symmetric
pause is supported. In mvneta_mac_link_up we enable symmetric pause if
either tx_pause or rx_pause is requested.

If the MAC only supports symmetric pause, the logic in phylink ensures
that the MAC will always be called with tx_pause == rx_pause:
- it will fail attempts by ethtool to set autoneg off with different rx
and tx settings.
- we will only advertise support for symmetric pause, for which there
are only two autonegotiation outcomes, both of which satisfy the
requirement that tx_pause == rx_pause.

So, if a MAC only supports symmetric pause, it can key off either of
these two flags as it is guaranteed that they will be identical in
for a MAC that only supports symmetric pause.

OK, so taking that into account then perhaps the post-table explanation
should be reworded to

When MLO_PAUSE_AN is not set and MAC_ASYM_PAUSE is set, any
combination of tx_pause and rx_pause may be requested. This depends on
user configuration, without regard to the value of MAC_SYM_PAUSE. When
When MLO_PAUSE_AN is not set and MAC_ASYM_PAUSE is also unset, then tx_pause and rx_pause will still depend on user configuration, but
will always equal each other.

Or maybe the above table should be extended to be

MLO_PAUSE_AN MAC_SYM_PAUSE MAC_ASYM_PAUSE tx_pause rx_pause
============ ============= ============== ======== ========
0 0 0 0 0
0 0 1 0 0
1 0
0 1 0 0 0
1 1
0 1 1 0 0
0 1
1 1
1 0 0 0 0
1 X 1 X X
1 1 0 0 0
1 1

With a note like

When MLO_PAUSE_AN is not set, the values of tx_pause and rx_pause
depend on user configuration. When MAC_ASYM_PAUSE is not set, tx_pause
and rx_pause will be restricted to be either both enabled or both
disabled. Otherwise, no restrictions are placed on their values,
allowing configurations which would not be attainable as a result of
autonegotiation.

IMO we should really switch to something like MAX_RX_PAUSE,
MAC_TX_PAUSE, MAC_RXTX_PAUSE and let phylink handle all the details of
turning that into sane advertisement. This would also let us return
-EINVAL in phylink_ethtool_set_pauseparam when the user requests e.g.
TX-only pause when the MAC only supports RX and RXTX.

Adding in the issue of rate adaption (sorry, I use "adaption" not
"adaptation" which I find rather irksome as in my - and apparently
a subsection of English speakers, the two have slightly different
meanings)

802.3 calls it "rate adaptation" in clause 49 (10GBASE-R) and "rate
matching" in clause 61 (10PASS-TL and 2BASE-TS). If you are opposed to
the former, then I think the latter could also work. It's also shorter,
which is definitely a plus.

Interestingly, wiktionary (with which I attempted to determine what that
slightly-different meaning was) labels "adaption" as "rare" :)

brings with it the problem that when using pause frames,
we need RX pause enabled, but on a MAC which only supports symmetric
pause, we can't enable RX pause without also transmitting pause frames.
So I would say such a setup was fundamentally mis-designed, and there's
little we can do to correct such a stupidity. Should we detect such
stupidities? Maybe, but what then? Refuse to function?

Previous discussion [1]. Right now we refuse to turn on rate adaptation
if the MAC only supports symmetric pause. The maximally-robust solution
would be to first try and autonegotiate with rate adaptation enabled and
using symmetric pause, and then renegotiate without either enabled. I
think that's significantly more complex, so I propose deferring such an
implementation to whoever first complains about their link not being
rate-adapted.

--Sean

[1] https://lore.kernel.org/netdev/4172fd87-8e51-e67d-bf86-fdc6829fa9b3@xxxxxxxx/