Re: [RFC net-next PATCH 16/16] net: sfp: Add quirk to ignore PHYs

From: Sean Anderson
Date: Tue Oct 05 2021 - 16:38:34 EST




On 10/5/21 3:12 PM, Russell King (Oracle) wrote:
On Tue, Oct 05, 2021 at 12:45:28PM -0400, Sean Anderson wrote:


On 10/5/21 6:33 AM, Russell King (Oracle) wrote:
> On Mon, Oct 04, 2021 at 03:15:27PM -0400, Sean Anderson wrote:
> > Some modules have something at SFP_PHY_ADDR which isn't a PHY. If we try to
> > probe it, we might attach genphy anyway if addresses 2 and 3 return
> > something other than all 1s. To avoid this, add a quirk for these modules
> > so that we do not probe their PHY.
> >
> > The particular module in this case is a Finisar SFP-GB-GE-T. This module is
> > also worked around in xgbe_phy_finisar_phy_quirks() by setting the support
> > manually. However, I do not believe that it has a PHY in the first place:
> >
> > $ i2cdump -y -r 0-31 $BUS 0x56 w
> > 0,8 1,9 2,a 3,b 4,c 5,d 6,e 7,f
> > 00: ff01 ff01 ff01 c20c 010c 01c0 0f00 0120
> > 08: fc48 000e ff78 0000 0000 0000 0000 00f0
> > 10: 7800 00bc 0000 401c 680c 0300 0000 0000
> > 18: ff41 0000 0a00 8890 0000 0000 0000 0000
>
> Actually, I think that is a PHY. It's byteswapped (which is normal using
> i2cdump in this way). The real contents of the registers are:
>
> 00: 01ff 01ff 01ff 0cc2 0c01 c001 000f 2001
> 08: 48fc 0e00 78ff 0000 0000 0000 0000 f000
> 10: 0078 bc00 0000 1c40 0c68 0003 0000 0000
> 18: 41ff 0000 000a 9088 0000 0000 0000 0000

Ah, thanks for catching this.
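
For anyone following along, the swap can be reproduced with a few lines of
Python. This is only a sketch to check the first row of the dump above; the
helper name swap16 is made up for illustration.

```python
def swap16(word):
    """Swap the two bytes of a 16-bit word."""
    return ((word & 0xff) << 8) | (word >> 8)


# First row of the i2cdump output quoted above (registers 0x00-0x07).
raw = [0xff01, 0xff01, 0xff01, 0xc20c, 0x010c, 0x01c0, 0x0f00, 0x0120]
fixed = [swap16(w) for w in raw]
print(" ".join(f"{w:04x}" for w in fixed))
# prints: 01ff 01ff 01ff 0cc2 0c01 c001 000f 2001
```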

> It's advertising pause + asym pause, 1000BASE-T FD, link partner is also
> advertising 1000BASE-T FD but no pause abilities.
>
> When comparing this with a Marvell 88e1111:
>
> 00: 1140 7949 0141 0cc2 05e1 0000 0004 2001
> 08: 0000 0e00 4000 0000 0000 0000 0000 f000
> 10: 0078 8100 0000 0040 0568 0000 0000 0000
> 18: 4100 0000 0002 8084 0000 0000 0000 0000
>
> It looks remarkably similar. However, the first few reads seem to be
> corrupted with 0x01ff. It may be that the module is slow to allow the
> PHY to start responding - we've had similar with Champion One SFPs.

Do you have an example of how to work around this? Even reading one
register at a time I still get the bogus 0x01ff. Reading bytewise, a
reasonable-looking upper byte is returned every other read, but the
lower byte is 0xff every time.

I think the Champion One modules just don't respond to the I2C
transactions, so we keep retrying for a while. We try every
50ms for 12 retries, which seems to be long enough for their
modules.
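
The retry approach described above could be sketched roughly as follows.
This is not the kernel code, just an illustration in Python: the function
name, the read_once callback, and the use of OSError to stand for a failed
I2C transaction are all assumptions. Only the 50ms interval and the count
of 12 come from the description.

```python
import time


def read_with_retry(read_once, retries=12, delay=0.05, sleep=time.sleep):
    """Call read_once() until it succeeds or the attempts are used up.

    read_once is a hypothetical callable that raises OSError when the
    module does not respond to the I2C transaction.
    """
    last_err = OSError("no attempts made")
    for _ in range(retries):
        try:
            return read_once()
        except OSError as err:
            last_err = err
            sleep(delay)  # wait before the next attempt
    raise last_err


# Demo with a stub that fails three times and then answers.
attempts = {"n": 0}

def flaky_read():
    attempts["n"] += 1
    if attempts["n"] < 4:
        raise OSError("no ack")
    return 0x0141

waits = []
print(hex(read_with_retry(flaky_read, sleep=waits.append)), len(waits))
```

Note that this only helps when the module does not respond at all; it would
not help with a module that answers with plausible but wrong data.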

> It looks like it's a Marvell 88e1111. The register at 0x11 is the
> Marvell status register, and 0xbc00 indicates 1000Mbit, FD, AN
> resolved, link up which agrees with what's in the various other
> registers.

That matches some supplemental info on the manufacturer's website
(which was frustratingly not associated with the model number of
this particular module).
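
As a sanity check of the 0xbc00 reading, here is a small decoder sketch. The
bit masks are my recollection of the layout the Linux Marvell driver uses
for register 0x11 (speed in bits 15:14, duplex in bit 13, resolved in bit
11, link in bit 10); treat them as an assumption and check the datasheet
before relying on them. The function name is made up.

```python
def decode_marvell_status(value):
    """Decode the fields of interest from a 16-bit register 0x11 value."""
    # Speed encoding assumed: 0b10 = 1000, 0b01 = 100, 0b00 = 10 Mbit/s.
    speed = {0x8000: 1000, 0x4000: 100, 0x0000: 10}.get(value & 0xc000)
    return {
        "speed_mbps": speed,                  # None for the reserved encoding
        "full_duplex": bool(value & 0x2000),
        "resolved": bool(value & 0x0800),     # speed and duplex resolved
        "link_up": bool(value & 0x0400),
    }


print(decode_marvell_status(0xbc00))  # value read from the Finisar module
print(decode_marvell_status(0x8100))  # value from the 88e1111 comparison dump
```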

The interesting thing is, many modules use 88e1111, which is about
the only PHY that I'm aware of that supports I2C access mode natively.
So, it's really surprising that you're getting corrupted data,
unless...

There's been a history of using too strong pull-ups on the SFP I2C
lines. The SFP MSA gives a minimum value of the resistors (4.7k).
SFP+ lowers the minimum value and raises the maximum clock frequency.
Some SFP modules are unable to drive the I2C bus low against the
lower resistances resulting in corrupted data (or worse, it can
corrupt the EEPROMs.)

There is a level shifter. Between the shifter and the SoC there were
1.8k (!) pull-ups, and between the shifter and the SFP there were 10k
pull-ups. I tried replacing the pull-ups between the SoC and the shifter
with 10k pull-ups, but noticed no difference. I have also noticed no
issues accessing the EEPROM, and I have not noticed any difference
accessing other registers (see below). Additionally, this same error is
"present" already in xgbe_phy_finisar_phy_quirks(), as noted in the
commit message.

Other problems on some platforms have been with I2C level shifters
locking up, but that doesn't look like what's happening here - they
lock up at logic low, not logic high. Even so-called "impossible to
lockup" level shifters have locked up despite their manufacturer
stating that it is impossible.

Is it always the same addresses?

Yes.

What if you read from a different offset?

Same thing.

What if you re-read after it seems to have cleared?

Here are some various transfers which hopefully will clarify the
behavior:

First, reading two bytes at a time
$ i2ctransfer -y 2 w1@0x56 2 r2
0x01 0xff
This behavior is repeatable
$ i2ctransfer -y 2 w1@0x56 2 r2
0x01 0xff
Now, reading one byte at a time
$ i2ctransfer -y 2 w1@0x56 2 r1
0x01
A second write/single read gets us the second byte.
$ i2ctransfer -y 2 w1@0x56 2 r1
0x41
And doing it for a third time gets us the first byte again.
$ i2ctransfer -y 2 w1@0x56 2 r1
0x01
If we start another one-byte read without writing the address, we get
the second byte
$ i2ctransfer -y 2 r1@0x56
0x41
And continuing this pattern, we get the next byte.
$ i2ctransfer -y 2 r1@0x56
0x0c
This can be repeated indefinitely
$ i2ctransfer -y 2 r1@0x56
0xc2
$ i2ctransfer -y 2 r1@0x56
0x0c
But stopping in the "middle" of a register fails
$ i2ctransfer -y 2 w1@0x56 2 r1
Error: Sending messages failed: Input/output error
We don't have to immediately read a byte:
$ i2ctransfer -y 2 w1@0x56 2
$ i2ctransfer -y 2 r1@0x56
0x01
$ i2ctransfer -y 2 r1@0x56
0x41
We can read two bytes indefinitely after "priming the pump"
$ i2ctransfer -y 2 w1@0x56 2 r1
0x01
$ i2ctransfer -y 2 r1@0x56
0x41
$ i2ctransfer -y 2 r2@0x56
0x0c 0xc2
$ i2ctransfer -y 2 r2@0x56
0x0c 0x01
$ i2ctransfer -y 2 r2@0x56
0x00 0x00
$ i2ctransfer -y 2 r2@0x56
0x00 0x04
$ i2ctransfer -y 2 r2@0x56
0x20 0x01
$ i2ctransfer -y 2 r2@0x56
0x00 0x00
But more than that "runs out"
$ i2ctransfer -y 2 w1@0x56 2 r1
0x01
$ i2ctransfer -y 2 r1@0x56
0x41
$ i2ctransfer -y 2 r4@0x56
0x0c 0xc2 0x0c 0x01
$ i2ctransfer -y 2 r4@0x56
0x00 0x00 0x00 0x04
$ i2ctransfer -y 2 r4@0x56
0x20 0x01 0xff 0xff
$ i2ctransfer -y 2 r4@0x56
0x01 0xff 0xff 0xff
However, the above multi-byte reads only work when starting at register
2 or greater.
$ i2ctransfer -y 2 w1@0x56 0 r1
0x01
$ i2ctransfer -y 2 r1@0x56
0x40
$ i2ctransfer -y 2 r2@0x56
0x01 0xff

Based on the above session, I believe that it may be best to treat this
phy as having an autoincrementing register address which must be read
one byte at a time, in multiples of two bytes. I think that existing SFP
phys may be compatible with this, but unfortunately I do not have any on
hand to test with.
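
To make the proposal concrete, here is a rough sketch of the access pattern
in Python. It is not driver code. FakeModule is a toy model of the
well-behaved path only (write the register number, then read single bytes,
high byte first, with the address autoincrementing), and all names are
hypothetical. The register values are the ones seen in the session above
for registers 2 through 7.

```python
class FakeModule:
    """Toy stand-in for the device at 0x56; models only the good path."""

    def __init__(self, regs):
        self.regs = regs   # register number -> 16-bit value
        self.offset = 0    # byte offset into the register space

    def write(self, payload):
        # A one-byte write selects the register to read next.
        self.offset = payload[0] * 2

    def read_byte(self):
        reg, is_low = divmod(self.offset, 2)
        value = self.regs.get(reg, 0xffff)
        self.offset += 1   # autoincrement after every byte
        return value & 0xff if is_low else value >> 8


def read_regs(dev, first_reg, count):
    """Read count registers, one byte per transfer, two bytes per register."""
    dev.write(bytes([first_reg]))
    words = []
    for _ in range(count):
        high = dev.read_byte()
        low = dev.read_byte()
        words.append((high << 8) | low)
    return words


session_regs = {2: 0x0141, 3: 0x0cc2, 4: 0x0c01, 5: 0x0000, 6: 0x0004, 7: 0x2001}
print([f"{w:04x}" for w in read_regs(FakeModule(session_regs), 2, 6)])
```

Always reading in whole registers keeps the sketch from stopping in the
"middle" of a register, which is the case that failed above.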

--Sean