[RFC,net-next,16/16] net: sfp: Add quirk to ignore PHYs

Message ID	20211004191527.1610759-17-sean.anderson@seco.com (mailing list archive)
State	RFC
Delegated to:	Netdev Maintainers
Headers	show Return-Path: <netdev-owner@kernel.org> From: Sean Anderson <sean.anderson@seco.com> To: netdev@vger.kernel.org, "David S . Miller" <davem@davemloft.net>, Jakub Kicinski <kuba@kernel.org>, linux-kernel@vger.kernel.org Cc: Andrew Lunn <andrew@lunn.ch>, Heiner Kallweit <hkallweit1@gmail.com>, Russell King <linux@armlinux.org.uk>, Sean Anderson <sean.anderson@seco.com> Subject: [RFC net-next PATCH 16/16] net: sfp: Add quirk to ignore PHYs Date: Mon, 4 Oct 2021 15:15:27 -0400 Message-Id: <20211004191527.1610759-17-sean.anderson@seco.com> In-Reply-To: <20211004191527.1610759-1-sean.anderson@seco.com> References: <20211004191527.1610759-1-sean.anderson@seco.com> Content-Transfer-Encoding: 8bit Content-Type: text/plain MIME-Version: 1.0 Precedence: bulk
Series	Add support for Xilinx PCS \| expand [RFC,net-next,00/16] Add support for Xilinx PCS [RFC,net-next,01/16] dt-bindings: net: Add pcs property [RFC,net-next,02/16] dt-bindings: net: Add binding for Xilinx PCS [RFC,net-next,03/16] net: sfp: Fix typo in state machine debug string [RFC,net-next,04/16] net: phylink: Move phylink_set_pcs before phylink_create [RFC,net-next,05/16] net: phylink: Automatically attach PCS devices [RFC,net-next,06/16] net: phylink: Add function for optionally adding a PCS [RFC,net-next,07/16] net: phylink: Add helpers for c22 registers without MDIO [RFC,net-next,08/16] net: macb: Clean up macb_validate [RFC,net-next,09/16] net: macb: Move most of mac_prepare to mac_config [RFC,net-next,10/16] net: macb: Move PCS settings to PCS callbacks [RFC,net-next,11/16] net: macb: Support restarting PCS autonegotiation [RFC,net-next,12/16] net: macb: Support external PCSs [RFC,net-next,13/16] net: phy: Export get_phy_c22_id [RFC,net-next,14/16] net: mdio: Add helper functions for accessing MDIO devices [RFC,net-next,15/16] net: pcs: Add Xilinx PCS driver [RFC,net-next,16/16] net: sfp: Add quirk to ignore PHYs

Sean Anderson Oct. 4, 2021, 7:15 p.m. UTC

Some modules have something at SFP_PHY_ADDR which isn't a PHY. If we try to
probe it, we might attach genphy anyway if addresses 2 and 3 return
something other than all 1s. To avoid this, add a quirk for these modules
so that we do not probe their PHY.

The particular module in this case is a Finisar SFP-GB-GE-T. This module is
also worked around in xgbe_phy_finisar_phy_quirks() by setting the support
manually. However, I do not believe that it has a PHY in the first place:

$ i2cdump -y -r 0-31 $BUS 0x56 w
     0,8  1,9  2,a  3,b  4,c  5,d  6,e  7,f
00: ff01 ff01 ff01 c20c 010c 01c0 0f00 0120
08: fc48 000e ff78 0000 0000 0000 0000 00f0
10: 7800 00bc 0000 401c 680c 0300 0000 0000
18: ff41 0000 0a00 8890 0000 0000 0000 0000

The first several addresses contain the same value, which should almost
never be the case for a proper phy. In addition, the "OUI" 00-7F-C3 does
not match Finisar's OUI of 00-90-65 (or any other OUI for that matter).

Signed-off-by: Sean Anderson <sean.anderson@seco.com>
---

 drivers/net/phy/sfp-bus.c | 12 +++++++++++-
 drivers/net/phy/sfp.c     |  3 +++
 2 files changed, 14 insertions(+), 1 deletion(-)

Andrew Lunn Oct. 4, 2021, 10:01 p.m. UTC | #1

On Mon, Oct 04, 2021 at 03:15:27PM -0400, Sean Anderson wrote:
> Some modules have something at SFP_PHY_ADDR which isn't a PHY. If we try to
> probe it, we might attach genphy anyway if addresses 2 and 3 return
> something other than all 1s. To avoid this, add a quirk for these modules
> so that we do not probe their PHY.
> 
> The particular module in this case is a Finisar SFP-GB-GE-T. This module is
> also worked around in xgbe_phy_finisar_phy_quirks() by setting the support
> manually. However, I do not believe that it has a PHY in the first place:
> 
> $ i2cdump -y -r 0-31 $BUS 0x56 w
>      0,8  1,9  2,a  3,b  4,c  5,d  6,e  7,f
> 00: ff01 ff01 ff01 c20c 010c 01c0 0f00 0120
> 08: fc48 000e ff78 0000 0000 0000 0000 00f0
> 10: 7800 00bc 0000 401c 680c 0300 0000 0000
> 18: ff41 0000 0a00 8890 0000 0000 0000 0000
> 
> The first several addresses contain the same value, which should almost
> never be the case for a proper phy. In addition, the "OUI" 00-7F-C3 does
> not match Finisar's OUI of 00-90-65 (or any other OUI for that matter).
> 
> Signed-off-by: Sean Anderson <sean.anderson@seco.com>

Hi Sean

This does not really have anything to do with PCS. I would send it on
its own.

    Andrew

Russell King (Oracle) Oct. 5, 2021, 10:33 a.m. UTC | #2

On Mon, Oct 04, 2021 at 03:15:27PM -0400, Sean Anderson wrote:
> Some modules have something at SFP_PHY_ADDR which isn't a PHY. If we try to
> probe it, we might attach genphy anyway if addresses 2 and 3 return
> something other than all 1s. To avoid this, add a quirk for these modules
> so that we do not probe their PHY.
> 
> The particular module in this case is a Finisar SFP-GB-GE-T. This module is
> also worked around in xgbe_phy_finisar_phy_quirks() by setting the support
> manually. However, I do not believe that it has a PHY in the first place:
> 
> $ i2cdump -y -r 0-31 $BUS 0x56 w
>      0,8  1,9  2,a  3,b  4,c  5,d  6,e  7,f
> 00: ff01 ff01 ff01 c20c 010c 01c0 0f00 0120
> 08: fc48 000e ff78 0000 0000 0000 0000 00f0
> 10: 7800 00bc 0000 401c 680c 0300 0000 0000
> 18: ff41 0000 0a00 8890 0000 0000 0000 0000

Actually, I think that is a PHY. It's byteswapped (which is normal using
i2cdump in this way). The real contents of the registers are:

00: 01ff 01ff 01ff 0cc2 0c01 c001 000f 2001
08: 48fc 0e00 78ff 0000 0000 0000 0000 f000
10: 0078 bc00 0000 1c40 0c68 0003 0000 0000
18: 41ff 0000 000a 9088 0000 0000 0000 0000

It's advertising pause + asym pause, 1000BASE-T FD, link partner is also
advertising 1000BASE-T FD but no pause abilities.

When comparing this with a Marvell 88e1111:

00: 1140 7949 0141 0cc2 05e1 0000 0004 2001
08: 0000 0e00 4000 0000 0000 0000 0000 f000
10: 0078 8100 0000 0040 0568 0000 0000 0000
18: 4100 0000 0002 8084 0000 0000 0000 0000

It looks remarkably similar. However, The first few reads seem to be
corrupted with 0x01ff. It may be that the module is slow to allow the
PHY to start responding - we've had similar with Champion One SFPs.

It looks like it's a Marvell 88e1111. The register at 0x11 is the
Marvell status register, and 0xbc00 indicates 1000Mbit, FD, AN
resolved, link up which agrees with what's in the various other
registers.

Sean Anderson Oct. 5, 2021, 4:45 p.m. UTC | #3

On 10/5/21 6:33 AM, Russell King (Oracle) wrote:
> On Mon, Oct 04, 2021 at 03:15:27PM -0400, Sean Anderson wrote:
>> Some modules have something at SFP_PHY_ADDR which isn't a PHY. If we try to
>> probe it, we might attach genphy anyway if addresses 2 and 3 return
>> something other than all 1s. To avoid this, add a quirk for these modules
>> so that we do not probe their PHY.
>>
>> The particular module in this case is a Finisar SFP-GB-GE-T. This module is
>> also worked around in xgbe_phy_finisar_phy_quirks() by setting the support
>> manually. However, I do not believe that it has a PHY in the first place:
>>
>> $ i2cdump -y -r 0-31 $BUS 0x56 w
>>      0,8  1,9  2,a  3,b  4,c  5,d  6,e  7,f
>> 00: ff01 ff01 ff01 c20c 010c 01c0 0f00 0120
>> 08: fc48 000e ff78 0000 0000 0000 0000 00f0
>> 10: 7800 00bc 0000 401c 680c 0300 0000 0000
>> 18: ff41 0000 0a00 8890 0000 0000 0000 0000
>
> Actually, I think that is a PHY. It's byteswapped (which is normal using
> i2cdump in this way).The real contents of the registers are:
>
> 00: 01ff 01ff 01ff 0cc2 0c01 c001 000f 2001
> 08: 48fc 0e00 78ff 0000 0000 0000 0000 f000
> 10: 0078 bc00 0000 1c40 0c68 0003 0000 0000
> 18: 41ff 0000 000a 9088 0000 0000 0000 0000

Ah, thanks for catching this.

> It's advertising pause + asym pause, 1000BASE-T FD, link partner is also
> advertising 1000BASE-T FD but no pause abilities.
>
> When comparing this with a Marvell 88e1111:
>
> 00: 1140 7949 0141 0cc2 05e1 0000 0004 2001
> 08: 0000 0e00 4000 0000 0000 0000 0000 f000
> 10: 0078 8100 0000 0040 0568 0000 0000 0000
> 18: 4100 0000 0002 8084 0000 0000 0000 0000
>
> It looks remarkably similar. However, The first few reads seem to be
> corrupted with 0x01ff. It may be that the module is slow to allow the
> PHY to start responding - we've had similar with Champion One SFPs.

Do you have an an example of how to work around this? Even reading one
register at a time I still get the bogus 0x01ff. Reading bytewise, a
reasonable-looking upper byte is returned every other read, but the
lower byte is 0xff every time.

> It looks like it's a Marvell 88e1111. The register at 0x11 is the
> Marvell status register, and 0xbc00 indicates 1000Mbit, FD, AN
> resolved, link up which agrees with what's in the various other
> registers.

That matches some supplemental info on the manufacturer's website (which
was frustratingly not associated with the model number of this
particular module).

--Sean

Sean Anderson Oct. 5, 2021, 6:10 p.m. UTC | #4

On 10/5/21 12:45 PM, Sean Anderson wrote:
>
>
> On 10/5/21 6:33 AM, Russell King (Oracle) wrote:
>> On Mon, Oct 04, 2021 at 03:15:27PM -0400, Sean Anderson wrote:
>>> Some modules have something at SFP_PHY_ADDR which isn't a PHY. If we try to
>>> probe it, we might attach genphy anyway if addresses 2 and 3 return
>>> something other than all 1s. To avoid this, add a quirk for these modules
>>> so that we do not probe their PHY.
>>>
>>> The particular module in this case is a Finisar SFP-GB-GE-T. This module is
>>> also worked around in xgbe_phy_finisar_phy_quirks() by setting the support
>>> manually. However, I do not believe that it has a PHY in the first place:
>>>
>>> $ i2cdump -y -r 0-31 $BUS 0x56 w
>>>      0,8  1,9  2,a  3,b  4,c  5,d  6,e  7,f
>>> 00: ff01 ff01 ff01 c20c 010c 01c0 0f00 0120
>>> 08: fc48 000e ff78 0000 0000 0000 0000 00f0
>>> 10: 7800 00bc 0000 401c 680c 0300 0000 0000
>>> 18: ff41 0000 0a00 8890 0000 0000 0000 0000
>>
>> Actually, I think that is a PHY. It's byteswapped (which is normal using
>> i2cdump in this way).The real contents of the registers are:
>>
>> 00: 01ff 01ff 01ff 0cc2 0c01 c001 000f 2001
>> 08: 48fc 0e00 78ff 0000 0000 0000 0000 f000
>> 10: 0078 bc00 0000 1c40 0c68 0003 0000 0000
>> 18: 41ff 0000 000a 9088 0000 0000 0000 0000
>
> Ah, thanks for catching this.
>
>> It's advertising pause + asym pause, 1000BASE-T FD, link partner is also
>> advertising 1000BASE-T FD but no pause abilities.
>>
>> When comparing this with a Marvell 88e1111:
>>
>> 00: 1140 7949 0141 0cc2 05e1 0000 0004 2001
>> 08: 0000 0e00 4000 0000 0000 0000 0000 f000
>> 10: 0078 8100 0000 0040 0568 0000 0000 0000
>> 18: 4100 0000 0002 8084 0000 0000 0000 0000
>>
>> It looks remarkably similar. However, The first few reads seem to be
>> corrupted with 0x01ff. It may be that the module is slow to allow the
>> PHY to start responding - we've had similar with Champion One SFPs.
>
> Do you have an an example of how to work around this? Even reading one
> register at a time I still get the bogus 0x01ff. Reading bytewise, a
> reasonable-looking upper byte is returned every other read, but the
> lower byte is 0xff every time.

Ok, upon further experimentation, I can read out something reasonable by using the "consecutive byte" mode

# i2cdump -y -r 0-0x3f 2 0x56 c
      0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f    0123456789abcdef
00: 01 40 01 6d 01 41 0c c2 0c 01 c5 e1 00 0d 20 01    ?@?m?A??????.? ?
10: 59 f9 0e 00 78 00 00 00 00 00 00 00 00 00 f0 00    Y??.x.........?.
20: 00 78 ac 40 00 00 00 00 0c 68 00 00 00 00 00 00    .x?@....?h......
30: 41 00 00 00 00 0a 90 88 00 00 00 00 00 00 00 00    A....???........

I believe this is just doing i2c_smbus_write_byte+i2c_smbus_read_byte

S Addr Wr [A] Phyaddr [A] P
S Addr Rd [A] [DataHigh] NA P
S Addr Rd [A] [DataLow] NA P

as opposed to i2c_smbus_read_word_data

S Addr Wr [A] Phyaddr [A]
S Addr Rd [A] [DataHigh] A [DataLow] NA P

or i2c_smbus_read_word_data

S Addr Wr [A] Phyaddr [A]
S Addr Rd [A] [DataHigh] NA P
S Addr Wr [A] Phyaddr [A]
S Addr Rd [A] [DataLow] NA P

So now I suppose the question is whether replacing the existing i2c
logic with something like

	i2c_smbus_write_byte(i2c, i2c_mii_phy_addr(phy_id));
	i2c_smbus_read_byte(i2c);
	i2c_smbus_read_byte(i2c);

will break everyone else's SFP phys. If it does, this could be difficult
to do as a quirk because the MII-I2C bus is created before we read the
SFP EEPROM.

--Sean

Russell King (Oracle) Oct. 5, 2021, 7:12 p.m. UTC | #5

On Tue, Oct 05, 2021 at 12:45:28PM -0400, Sean Anderson wrote:
> 
> 
> On 10/5/21 6:33 AM, Russell King (Oracle) wrote:
> > On Mon, Oct 04, 2021 at 03:15:27PM -0400, Sean Anderson wrote:
> > > Some modules have something at SFP_PHY_ADDR which isn't a PHY. If we try to
> > > probe it, we might attach genphy anyway if addresses 2 and 3 return
> > > something other than all 1s. To avoid this, add a quirk for these modules
> > > so that we do not probe their PHY.
> > > 
> > > The particular module in this case is a Finisar SFP-GB-GE-T. This module is
> > > also worked around in xgbe_phy_finisar_phy_quirks() by setting the support
> > > manually. However, I do not believe that it has a PHY in the first place:
> > > 
> > > $ i2cdump -y -r 0-31 $BUS 0x56 w
> > >      0,8  1,9  2,a  3,b  4,c  5,d  6,e  7,f
> > > 00: ff01 ff01 ff01 c20c 010c 01c0 0f00 0120
> > > 08: fc48 000e ff78 0000 0000 0000 0000 00f0
> > > 10: 7800 00bc 0000 401c 680c 0300 0000 0000
> > > 18: ff41 0000 0a00 8890 0000 0000 0000 0000
> > 
> > Actually, I think that is a PHY. It's byteswapped (which is normal using
> > i2cdump in this way).The real contents of the registers are:
> > 
> > 00: 01ff 01ff 01ff 0cc2 0c01 c001 000f 2001
> > 08: 48fc 0e00 78ff 0000 0000 0000 0000 f000
> > 10: 0078 bc00 0000 1c40 0c68 0003 0000 0000
> > 18: 41ff 0000 000a 9088 0000 0000 0000 0000
> 
> Ah, thanks for catching this.
> 
> > It's advertising pause + asym pause, 1000BASE-T FD, link partner is also
> > advertising 1000BASE-T FD but no pause abilities.
> > 
> > When comparing this with a Marvell 88e1111:
> > 
> > 00: 1140 7949 0141 0cc2 05e1 0000 0004 2001
> > 08: 0000 0e00 4000 0000 0000 0000 0000 f000
> > 10: 0078 8100 0000 0040 0568 0000 0000 0000
> > 18: 4100 0000 0002 8084 0000 0000 0000 0000
> > 
> > It looks remarkably similar. However, The first few reads seem to be
> > corrupted with 0x01ff. It may be that the module is slow to allow the
> > PHY to start responding - we've had similar with Champion One SFPs.
> 
> Do you have an an example of how to work around this? Even reading one
> register at a time I still get the bogus 0x01ff. Reading bytewise, a
> reasonable-looking upper byte is returned every other read, but the
> lower byte is 0xff every time.

I think the Champion One modules just don't respond to the I2C
transactions, so we keep retrying for a while. We try every
50ms for 12 retries, which seems to be long enough for their
modules.

> > It looks like it's a Marvell 88e1111. The register at 0x11 is the
> > Marvell status register, and 0xbc00 indicates 1000Mbit, FD, AN
> > resolved, link up which agrees with what's in the various other
> > registers.
> 
> That matches some supplemental info on the manufacturer's website
> (which was frustratingly not associated with the model number of
> this particular module).

The interesting thing is, many modules use 88e1111, which is about
the only PHY that I'm aware that supports I2C access mode natively.
So, it's really surprising that you're getting corrupted data,
unless...

There's been a history of using too strong pull-ups on the SFP I2C
lines. The SFP MSA gives a minimum value of the resistors (4.7k).
SFP+ lowers the minimum value and raises the maximum clock frequency.
Some SFP modules are unable to drive the I2C bus low against the
lower resistances resulting in corrupted data (or worse, it can
corrupt the EEPROMs.)

Other problems on some platforms have been with I2C level shifters
locking up, but that doesn't look like what's happening here - they
lockup at logic low not logic high. Even so-called "impossible to
lockup" level shifters have locked up despite their manufacturer
stating that it is impossible.

Is it always the same addresses? What if you read from a different
offset? What if you re-read after it seems to have cleared?

Sean Anderson Oct. 5, 2021, 8:38 p.m. UTC | #6

On 10/5/21 3:12 PM, Russell King (Oracle) wrote:
> On Tue, Oct 05, 2021 at 12:45:28PM -0400, Sean Anderson wrote:
>>
>>
>> On 10/5/21 6:33 AM, Russell King (Oracle) wrote:
>> > On Mon, Oct 04, 2021 at 03:15:27PM -0400, Sean Anderson wrote:
>> > > Some modules have something at SFP_PHY_ADDR which isn't a PHY. If we try to
>> > > probe it, we might attach genphy anyway if addresses 2 and 3 return
>> > > something other than all 1s. To avoid this, add a quirk for these modules
>> > > so that we do not probe their PHY.
>> > >
>> > > The particular module in this case is a Finisar SFP-GB-GE-T. This module is
>> > > also worked around in xgbe_phy_finisar_phy_quirks() by setting the support
>> > > manually. However, I do not believe that it has a PHY in the first place:
>> > >
>> > > $ i2cdump -y -r 0-31 $BUS 0x56 w
>> > >      0,8  1,9  2,a  3,b  4,c  5,d  6,e  7,f
>> > > 00: ff01 ff01 ff01 c20c 010c 01c0 0f00 0120
>> > > 08: fc48 000e ff78 0000 0000 0000 0000 00f0
>> > > 10: 7800 00bc 0000 401c 680c 0300 0000 0000
>> > > 18: ff41 0000 0a00 8890 0000 0000 0000 0000
>> >
>> > Actually, I think that is a PHY. It's byteswapped (which is normal using
>> > i2cdump in this way).The real contents of the registers are:
>> >
>> > 00: 01ff 01ff 01ff 0cc2 0c01 c001 000f 2001
>> > 08: 48fc 0e00 78ff 0000 0000 0000 0000 f000
>> > 10: 0078 bc00 0000 1c40 0c68 0003 0000 0000
>> > 18: 41ff 0000 000a 9088 0000 0000 0000 0000
>>
>> Ah, thanks for catching this.
>>
>> > It's advertising pause + asym pause, 1000BASE-T FD, link partner is also
>> > advertising 1000BASE-T FD but no pause abilities.
>> >
>> > When comparing this with a Marvell 88e1111:
>> >
>> > 00: 1140 7949 0141 0cc2 05e1 0000 0004 2001
>> > 08: 0000 0e00 4000 0000 0000 0000 0000 f000
>> > 10: 0078 8100 0000 0040 0568 0000 0000 0000
>> > 18: 4100 0000 0002 8084 0000 0000 0000 0000
>> >
>> > It looks remarkably similar. However, The first few reads seem to be
>> > corrupted with 0x01ff. It may be that the module is slow to allow the
>> > PHY to start responding - we've had similar with Champion One SFPs.
>>
>> Do you have an an example of how to work around this? Even reading one
>> register at a time I still get the bogus 0x01ff. Reading bytewise, a
>> reasonable-looking upper byte is returned every other read, but the
>> lower byte is 0xff every time.
>
> I think the Champion One modules just don't respond to the I2C
> transactions, so we keep retrying for a while. We try every
> 50ms for 12 retries, which seems to be long enough for their
> modules.
>
>> > It looks like it's a Marvell 88e1111. The register at 0x11 is the
>> > Marvell status register, and 0xbc00 indicates 1000Mbit, FD, AN
>> > resolved, link up which agrees with what's in the various other
>> > registers.
>>
>> That matches some supplemental info on the manufacturer's website
>> (which was frustratingly not associated with the model number of
>> this particular module).
>
> The interesting thing is, many modules use 88e1111, which is about
> the only PHY that I'm aware that supports I2C access mode natively.
> So, it's really surprising that you're getting corrupted data,
> unless...
>
> There's been a history of using too strong pull-ups on the SFP I2C
> lines. The SFP MSA gives a minimum value of the resistors (4.7k).
> SFP+ lowers the minimum value and raises the maximum clock frequency.
> Some SFP modules are unable to drive the I2C bus low against the
> lower resistances resulting in corrupted data (or worse, it can
> corrupt the EEPROMs.)

There is a level shifter. Between the shifter and the SoC there were
1.8k (!) pull-ups, and between the shifter and the SFP there were 10k
pull-ups. I tried replacing the pull-ups between the SoC and the shifter
with 10k pull-ups, but noticed no difference. I have also noticed no
issues accessing the EEPROM, and I have not noticed any difference
accessing other registers (see below). Additionally, this same error is
"present" already in xgbe_phy_finisar_phy_quirks(), as noted in the
commit message.

> Other problems on some platforms have been with I2C level shifters
> locking up, but that doesn't look like what's happening here - they
> lockup at logic low not logic high. Even so-called "impossible to
> lockup" level shifters have locked up despite their manufacturer
> stating that it is impossible.
>
> Is it always the same addresses?

Yes.

> What if you read from a different offset?

Same thing.

> What if you re-read after it seems to have cleared?

Here are some various transfers which hopefully will clarify the
behavior:

First, reading two bytes at a time
	$ i2ctransfer -y 2 w1@0x56 2 r2
	0x01 0xff
This behavior is repeatable
	$ i2ctransfer -y 2 w1@0x56 2 r2
	0x01 0xff
Now, reading one byte at a time
	$ i2ctransfer -y 2 w1@0x56 2 r1
	0x01
A second write/single read gets us the first byte again.
	$ i2ctransfer -y 2 w1@0x56 2 r1
	0x41
And doing it for a third time gets us the first byte again.
	$ i2ctransfer -y 2 w1@0x56 2 r1
	0x01
If we start another one-byte read without writing the address, we get
the second byte
	$ i2ctransfer -y 2 r1@0x56
	0x41
And continuing this pattern, we get the next byte.
	$ i2ctransfer -y 2 r1@0x56
	0x0c
This can be repeated indefinitely
	$ i2ctransfer -y 2 r1@0x56
	0xc2
	$ i2ctransfer -y 2 r1@0x56
	0x0c
But stopping in the "middle" of a register fails
	$ i2ctransfer -y 2 w1@0x56 2 r1
	Error: Sending messages failed: Input/output error
We don't have to immediately read a byte:
	$ i2ctransfer -y 2 w1@0x56 2
	$ i2ctransfer -y 2 r1@0x56
	0x01
	$ i2ctransfer -y 2 r1@0x56
	0x41
We can read two bytes indefinitely after "priming the pump"
	$ i2ctransfer -y 2 w1@0x56 2 r1
	0x01
	$ i2ctransfer -y 2 r1@0x56
	0x41
	$ i2ctransfer -y 2 r2@0x56
	0x0c 0xc2
	$ i2ctransfer -y 2 r2@0x56
	0x0c 0x01
	$ i2ctransfer -y 2 r2@0x56
	0x00 0x00
	$ i2ctransfer -y 2 r2@0x56
	0x00 0x04
	$ i2ctransfer -y 2 r2@0x56
	0x20 0x01
	$ i2ctransfer -y 2 r2@0x56
	0x00 0x00
But more than that "runs out"
	$ i2ctransfer -y 2 w1@0x56 2 r1
	0x01
	$ i2ctransfer -y 2 r1@0x56
	0x41
	$ i2ctransfer -y 2 r4@0x56
	0x0c 0xc2 0x0c 0x01
	$ i2ctransfer -y 2 r4@0x56
	0x00 0x00 0x00 0x04
	$ i2ctransfer -y 2 r4@0x56
	0x20 0x01 0xff 0xff
	$ i2ctransfer -y 2 r4@0x56
	0x01 0xff 0xff 0xff
However, the above multi-byte reads only works when starting at register
2 or greater.
	$ i2ctransfer -y 2 w1@0x56 0 r1
	0x01
	$ i2ctransfer -y 2 r1@0x56
	0x40
	$ i2ctransfer -y 2 r2@0x56
	0x01 0xff

Based on the above session, I believe that it may be best to treat this
phy as having an autoincrementing register address which must be read
one byte at a time, in multiples of two bytes. I think that existing SFP
phys may compatible with this, but unfortunately I do not have any on
hand to test with.

--Sean

Russell King (Oracle) Oct. 5, 2021, 10:17 p.m. UTC | #7

On Tue, Oct 05, 2021 at 04:38:23PM -0400, Sean Anderson wrote:
> There is a level shifter. Between the shifter and the SoC there were
> 1.8k (!) pull-ups, and between the shifter and the SFP there were 10k
> pull-ups. I tried replacing the pull-ups between the SoC and the shifter
> with 10k pull-ups, but noticed no difference. I have also noticed no
> issues accessing the EEPROM, and I have not noticed any difference
> accessing other registers (see below). Additionally, this same error is
> "present" already in xgbe_phy_finisar_phy_quirks(), as noted in the
> commit message.

Hmm, thanks for checking. So it's something "weird" that this module
is doing.

As I say, the 88E1111 has a native I2C mode, it sounds like they're not
using it but have their own, seemingly broken, protocol conversion from
the I2C bus to MDIO. I've opened and traced the I2C connections on this
module - they only go to an EEPROM and the 88E1111, so we know this is
a "genuine" 88E1111 in I2C mode we are talking to.

> First, reading two bytes at a time
> 	$ i2ctransfer -y 2 w1@0x56 2 r2
> 	0x01 0xff
> This behavior is repeatable
> 	$ i2ctransfer -y 2 w1@0x56 2 r2
> 	0x01 0xff
> Now, reading one byte at a time
> 	$ i2ctransfer -y 2 w1@0x56 2 r1
> 	0x01
> A second write/single read gets us the first byte again.
> 	$ i2ctransfer -y 2 w1@0x56 2 r1
> 	0x41

I think you mean you get the other half of the first word. When I try
this with a 88E1111 directly connected to the I2C bus, I get:

root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r2
0x01 0x41
root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r2
0x01 0x41
root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r1
0x01
root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r1
0x01

So a completely different behaviour. Continuing...

> And doing it for a third time gets us the first byte again.
> 	$ i2ctransfer -y 2 w1@0x56 2 r1
> 	0x01
> If we start another one-byte read without writing the address, we get
> the second byte
> 	$ i2ctransfer -y 2 r1@0x56
> 	0x41

root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r1
0x01
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01

Again, different behaviour.

> And continuing this pattern, we get the next byte.
> 	$ i2ctransfer -y 2 r1@0x56
> 	0x0c
> This can be repeated indefinitely
> 	$ i2ctransfer -y 2 r1@0x56
> 	0xc2
> 	$ i2ctransfer -y 2 r1@0x56
> 	0x0c

root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x41
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01

Here we eventually start toggling between the high and low bytes of
the word.

> But stopping in the "middle" of a register fails
> 	$ i2ctransfer -y 2 w1@0x56 2 r1
> 	Error: Sending messages failed: Input/output error

root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r1
0x01

No error for me.

> We don't have to immediately read a byte:
> 	$ i2ctransfer -y 2 w1@0x56 2
> 	$ i2ctransfer -y 2 r1@0x56
> 	0x01
> 	$ i2ctransfer -y 2 r1@0x56
> 	0x41

root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01

Again, no toggling between high/low bytes of the word.

> We can read two bytes indefinitely after "priming the pump"
> 	$ i2ctransfer -y 2 w1@0x56 2 r1
> 	0x01
> 	$ i2ctransfer -y 2 r1@0x56
> 	0x41
> 	$ i2ctransfer -y 2 r2@0x56
> 	0x0c 0xc2
> 	$ i2ctransfer -y 2 r2@0x56
> 	0x0c 0x01
> 	$ i2ctransfer -y 2 r2@0x56
> 	0x00 0x00
> 	$ i2ctransfer -y 2 r2@0x56
> 	0x00 0x04
> 	$ i2ctransfer -y 2 r2@0x56
> 	0x20 0x01
> 	$ i2ctransfer -y 2 r2@0x56
> 	0x00 0x00

root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01
root@clearfog21:~# i2ctransfer -y 1 r2@0x56
0x01 0x41
root@clearfog21:~# i2ctransfer -y 1 r2@0x56
0x01 0x41
root@clearfog21:~# i2ctransfer -y 1 r2@0x56
0x01 0x41
root@clearfog21:~# i2ctransfer -y 1 r2@0x56
0x01 0x41
root@clearfog21:~# i2ctransfer -y 1 r2@0x56
0x01 0x41
root@clearfog21:~# i2ctransfer -y 1 r2@0x56
0x01 0x41

No auto-increment of the register.

> But more than that "runs out"
> 	$ i2ctransfer -y 2 w1@0x56 2 r1
> 	0x01
> 	$ i2ctransfer -y 2 r1@0x56
> 	0x41
> 	$ i2ctransfer -y 2 r4@0x56
> 	0x0c 0xc2 0x0c 0x01
> 	$ i2ctransfer -y 2 r4@0x56
> 	0x00 0x00 0x00 0x04
> 	$ i2ctransfer -y 2 r4@0x56
> 	0x20 0x01 0xff 0xff
> 	$ i2ctransfer -y 2 r4@0x56
> 	0x01 0xff 0xff 0xff

root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01
root@clearfog21:~# i2ctransfer -y 1 r1@0x56
0x01
root@clearfog21:~# i2ctransfer -y 1 r4@0x56
0x01 0x41 0x0c 0xc2
root@clearfog21:~# i2ctransfer -y 1 r4@0x56
0x01 0x41 0x0c 0xc2
root@clearfog21:~# i2ctransfer -y 1 r4@0x56
0x01 0x41 0x0c 0xc2
root@clearfog21:~# i2ctransfer -y 1 r4@0x56
0x01 0x41 0x0c 0xc2

> However, the above multi-byte reads only works when starting at register
> 2 or greater.
> 	$ i2ctransfer -y 2 w1@0x56 0 r1
> 	0x01
> 	$ i2ctransfer -y 2 r1@0x56
> 	0x40
> 	$ i2ctransfer -y 2 r2@0x56
> 	0x01 0xff
> 
> Based on the above session, I believe that it may be best to treat this
> phy as having an autoincrementing register address which must be read
> one byte at a time, in multiples of two bytes. I think that existing SFP
> phys may compatible with this, but unfortunately I do not have any on
> hand to test with.

Sadly, according to my results above, I think your module is doing
something strange with the 88E1111.

You say that it's Finisar, but I can only find this module in
Fiberstore's website: https://www.fs.com/uk/products/20057.html
Fiberstore commonly use "FS" in the vendor field.

You have me wondering what they've done to this PHY to make it respond
in the way you are seeing.

Sean Anderson Oct. 5, 2021, 11:16 p.m. UTC | #8

On 10/5/21 6:17 PM, Russell King (Oracle) wrote:
> On Tue, Oct 05, 2021 at 04:38:23PM -0400, Sean Anderson wrote:
>> There is a level shifter. Between the shifter and the SoC there were
>> 1.8k (!) pull-ups, and between the shifter and the SFP there were 10k
>> pull-ups. I tried replacing the pull-ups between the SoC and the shifter
>> with 10k pull-ups, but noticed no difference. I have also noticed no
>> issues accessing the EEPROM, and I have not noticed any difference
>> accessing other registers (see below). Additionally, this same error is
>> "present" already in xgbe_phy_finisar_phy_quirks(), as noted in the
>> commit message.
>
> Hmm, thanks for checking. So it's something "weird" that this module
> is doing.
>
> As I say, the 88E1111 has a native I2C mode, it sounds like they're not
> using it but have their own, seemingly broken, protocol conversion from
> the I2C bus to MDIO. I've opened and traced the I2C connections on this
> module - they only go to an EEPROM and the 88E1111, so we know this is
> a "genuine" 88E1111 in I2C mode we are talking to.

Well, I had a look inside mine and it had a "Custom Code/Die Revision"
of B2. Nothing else unusual. Just the PHY, magnetics, EEPROM, crystal,
and a regulator.

>> First, reading two bytes at a time
>> 	$ i2ctransfer -y 2 w1@0x56 2 r2
>> 	0x01 0xff
>> This behavior is repeatable
>> 	$ i2ctransfer -y 2 w1@0x56 2 r2
>> 	0x01 0xff
>> Now, reading one byte at a time
>> 	$ i2ctransfer -y 2 w1@0x56 2 r1
>> 	0x01
>> A second write/single read gets us the first byte again.
>> 	$ i2ctransfer -y 2 w1@0x56 2 r1
>> 	0x41
>
> I think you mean you get the other half of the first word.

Yes.

> When I try this with a 88E1111 directly connected to the I2C bus, I
> get:
>
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r2
> 0x01 0x41
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r2
> 0x01 0x41
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r1
> 0x01
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r1
> 0x01
>
> So a completely different behaviour. Continuing...
>
>> And doing it for a third time gets us the first byte again.
>> 	$ i2ctransfer -y 2 w1@0x56 2 r1
>> 	0x01
>> If we start another one-byte read without writing the address, we get
>> the second byte
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0x41
>
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r1
> 0x01
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
>
> Again, different behaviour.
>
>> And continuing this pattern, we get the next byte.
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0x0c
>> This can be repeated indefinitely
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0xc2
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0x0c
>
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x41
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
>
> Here we eventually start toggling between the high and low bytes of
> the word.
>
>> But stopping in the "middle" of a register fails
>> 	$ i2ctransfer -y 2 w1@0x56 2 r1
>> 	Error: Sending messages failed: Input/output error
>
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2 r1
> 0x01
>
> No error for me.
>
>> We don't have to immediately read a byte:
>> 	$ i2ctransfer -y 2 w1@0x56 2
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0x01
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0x41
>
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
>
> Again, no toggling between high/low bytes of the word.
>
>> We can read two bytes indefinitely after "priming the pump"
>> 	$ i2ctransfer -y 2 w1@0x56 2 r1
>> 	0x01
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0x41
>> 	$ i2ctransfer -y 2 r2@0x56
>> 	0x0c 0xc2
>> 	$ i2ctransfer -y 2 r2@0x56
>> 	0x0c 0x01
>> 	$ i2ctransfer -y 2 r2@0x56
>> 	0x00 0x00
>> 	$ i2ctransfer -y 2 r2@0x56
>> 	0x00 0x04
>> 	$ i2ctransfer -y 2 r2@0x56
>> 	0x20 0x01
>> 	$ i2ctransfer -y 2 r2@0x56
>> 	0x00 0x00
>
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
> root@clearfog21:~# i2ctransfer -y 1 r2@0x56
> 0x01 0x41
> root@clearfog21:~# i2ctransfer -y 1 r2@0x56
> 0x01 0x41
> root@clearfog21:~# i2ctransfer -y 1 r2@0x56
> 0x01 0x41
> root@clearfog21:~# i2ctransfer -y 1 r2@0x56
> 0x01 0x41
> root@clearfog21:~# i2ctransfer -y 1 r2@0x56
> 0x01 0x41
> root@clearfog21:~# i2ctransfer -y 1 r2@0x56
> 0x01 0x41
>
> No auto-increment of the register.
>
>> But more than that "runs out"
>> 	$ i2ctransfer -y 2 w1@0x56 2 r1
>> 	0x01
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0x41
>> 	$ i2ctransfer -y 2 r4@0x56
>> 	0x0c 0xc2 0x0c 0x01
>> 	$ i2ctransfer -y 2 r4@0x56
>> 	0x00 0x00 0x00 0x04
>> 	$ i2ctransfer -y 2 r4@0x56
>> 	0x20 0x01 0xff 0xff
>> 	$ i2ctransfer -y 2 r4@0x56
>> 	0x01 0xff 0xff 0xff
>
> root@clearfog21:~# i2ctransfer -y 1 w1@0x56 2
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
> root@clearfog21:~# i2ctransfer -y 1 r1@0x56
> 0x01
> root@clearfog21:~# i2ctransfer -y 1 r4@0x56
> 0x01 0x41 0x0c 0xc2
> root@clearfog21:~# i2ctransfer -y 1 r4@0x56
> 0x01 0x41 0x0c 0xc2
> root@clearfog21:~# i2ctransfer -y 1 r4@0x56
> 0x01 0x41 0x0c 0xc2
> root@clearfog21:~# i2ctransfer -y 1 r4@0x56
> 0x01 0x41 0x0c 0xc2
>
>> However, the above multi-byte reads only works when starting at register
>> 2 or greater.
>> 	$ i2ctransfer -y 2 w1@0x56 0 r1
>> 	0x01
>> 	$ i2ctransfer -y 2 r1@0x56
>> 	0x40
>> 	$ i2ctransfer -y 2 r2@0x56
>> 	0x01 0xff
>>
>> Based on the above session, I believe that it may be best to treat this
>> phy as having an autoincrementing register address which must be read
>> one byte at a time, in multiples of two bytes. I think that existing SFP
>> phys may compatible with this, but unfortunately I do not have any on
>> hand to test with.
>
> Sadly, according to my results above, I think your module is doing
> something strange with the 88E1111.
>
> You say that it's Finisar, but I can only find this module in
> Fiberstore's website: https://www.fs.com/uk/products/20057.html
> Fiberstore commonly use "FS" in the vendor field.

So you are correct. I had seen the xgbe fixup for the same phy_id and
assumed that FS meant the same thing in both cases. You may also notice
that nowhere on their site do they spell out their name.

> You have me wondering what they've done to this PHY to make it respond
> in the way you are seeing.

You may notice on their site that there are different "compatible"
options for e.g. "FS", "Dell", "Cisco", and around 50 other
manufacturers. I wonder if I've come across a module which supports
autoincrementing byte reads in order to be compatible with some
manufacturer's hardware. And perhaps this change has inadvertently
broken the two-byte read capability, but only for the first 3 registers.

--Sean

[RFC,net-next,16/16] net: sfp: Add quirk to ignore PHYs

Commit Message

Comments

Patch