Message ID | Z3_n_5BXkxQR4zEG@shell.armlinux.org.uk (mailing list archive) |
---|---|
Headers | show |
Series | net: phylink: fix PCS without autoneg | expand |
Hi, Eric Woudstra reported that a PCS attached using 2500base-X does not see link when phylink is using in-band mode, but autoneg is disabled, despite there being a valid 2500base-X signal being received. We have these settings: act_link_an_mode = MLO_AN_INBAND pcs_neg_mode = PHYLINK_PCS_NEG_INBAND_DISABLED Eric diagnosed it to phylink_decode_c37_word() setting state->link false because the full-duplex bit isn't set in the non-existent link partner advertisement word (which doesn't exist because in-band autoneg is disabled!) The test in phylink_mii_c22_pcs_decode_state() is supposed to catch this state, but since we converted PCS to use neg_mode, testing the Autoneg in the local advertisement is no longer sufficient - we need to be looking at the neg_mode, which currently isn't provided. We need to provide this via the .pcs_get_state() method, and this will require modifying all PCS implementations to add the extra argument to this method. Patch 1 uses the PCS neg_mode in phylink_mac_pcs_get_state() to correct the now obsolute usage of the Autoneg bit in the advertisement. Patch 2 passes neg_mode into the .pcs_get_state() method, and updates all users. Patch 3 adds neg_mode as an argument to the various clause 22 state decoder functions in phylink, modifying drivers to pass the neg_mode through. Patch 4 makes use of phylink_mii_c22_pcs_decode_state() rather than using the Autoneg bit in the advertising field. Patch 5 may be required for Eric's case - it ensures that we report the correct state for interface types that we support only one set of modes for when autoneg is disabled. Changes in v2: - Add test for NULL pcs in patch 1 I haven't added Eric's t-b because I used a different fix in patch 1. drivers/net/dsa/b53/b53_serdes.c | 4 +- drivers/net/dsa/mt7530.c | 2 +- drivers/net/dsa/mv88e6xxx/pcs-6185.c | 1 + drivers/net/dsa/mv88e6xxx/pcs-6352.c | 1 + drivers/net/dsa/mv88e6xxx/pcs-639x.c | 5 +- drivers/net/dsa/qca/qca8k-8xxx.c | 2 +- drivers/net/ethernet/cadence/macb_main.c | 3 +- drivers/net/ethernet/freescale/fman/fman_dtsec.c | 4 +- drivers/net/ethernet/marvell/mvneta.c | 2 +- drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c | 2 + .../net/ethernet/marvell/prestera/prestera_main.c | 1 + drivers/net/ethernet/meta/fbnic/fbnic_phylink.c | 2 +- .../net/ethernet/microchip/lan966x/lan966x_main.h | 2 +- .../ethernet/microchip/lan966x/lan966x_phylink.c | 3 +- .../net/ethernet/microchip/lan966x/lan966x_port.c | 4 +- .../net/ethernet/microchip/sparx5/sparx5_phylink.c | 2 +- drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 3 +- drivers/net/pcs/pcs-lynx.c | 4 +- drivers/net/pcs/pcs-mtk-lynxi.c | 4 +- drivers/net/pcs/pcs-xpcs.c | 7 +-- drivers/net/phy/phylink.c | 60 ++++++++++++++++------ include/linux/phylink.h | 11 ++-- 22 files changed, 87 insertions(+), 42 deletions(-)
On Mon, Jan 13, 2025 at 09:21:18AM +0000, Russell King (Oracle) wrote: > Hi, > > Eric Woudstra reported that a PCS attached using 2500base-X does not > see link when phylink is using in-band mode, but autoneg is disabled, > despite there being a valid 2500base-X signal being received. We have > these settings: Please ignore this inappropriately threaded cover message. Thanks.
On 1/13/25 10:21 AM, Russell King (Oracle) wrote: > Hi, > > Eric Woudstra reported that a PCS attached using 2500base-X does not > see link when phylink is using in-band mode, but autoneg is disabled, > despite there being a valid 2500base-X signal being received. We have > these settings: > > act_link_an_mode = MLO_AN_INBAND > pcs_neg_mode = PHYLINK_PCS_NEG_INBAND_DISABLED > > Eric diagnosed it to phylink_decode_c37_word() setting state->link > false because the full-duplex bit isn't set in the non-existent link > partner advertisement word (which doesn't exist because in-band > autoneg is disabled!) > > The test in phylink_mii_c22_pcs_decode_state() is supposed to catch > this state, but since we converted PCS to use neg_mode, testing the > Autoneg in the local advertisement is no longer sufficient - we need > to be looking at the neg_mode, which currently isn't provided. > > We need to provide this via the .pcs_get_state() method, and this > will require modifying all PCS implementations to add the extra > argument to this method. > > Patch 1 uses the PCS neg_mode in phylink_mac_pcs_get_state() to correct > the now obsolute usage of the Autoneg bit in the advertisement. > > Patch 2 passes neg_mode into the .pcs_get_state() method, and updates > all users. > > Patch 3 adds neg_mode as an argument to the various clause 22 state > decoder functions in phylink, modifying drivers to pass the neg_mode > through. > > Patch 4 makes use of phylink_mii_c22_pcs_decode_state() rather than > using the Autoneg bit in the advertising field. > > Patch 5 may be required for Eric's case - it ensures that we report > the correct state for interface types that we support only one set > of modes for when autoneg is disabled. > > Changes in v2: > - Add test for NULL pcs in patch 1 > > I haven't added Eric's t-b because I used a different fix in patch 1. So I tested this V2 patch and with the first link up command, I get the link up which is functional end to end. Tested-by: Eric Woudstra <ericwouds@gmail.com> PS, FYI: I do however still have the difference in phylink_mac_config(). At first the link is up with (before phy attached): phylink_mac_config: mode=inband/2500base-x When the phy is attached, phylink_mac_config() is not called, but we have functional link. When I do: ethtool -s eth1 advertise 0x28, then: phylink_mac_config: mode=inband/sgmii And back again: ethtool -s eth1 advertise 0x800000000028, then: phylink_mac_config: mode=phy/2500base-x I can share more log entries if needed.
On Tue, Jan 14, 2025 at 11:59:06AM +0100, Eric Woudstra wrote: > On 1/13/25 10:21 AM, Russell King (Oracle) wrote: > > Hi, > > > > Eric Woudstra reported that a PCS attached using 2500base-X does not > > see link when phylink is using in-band mode, but autoneg is disabled, > > despite there being a valid 2500base-X signal being received. We have > > these settings: > > > > act_link_an_mode = MLO_AN_INBAND > > pcs_neg_mode = PHYLINK_PCS_NEG_INBAND_DISABLED > > > > Eric diagnosed it to phylink_decode_c37_word() setting state->link > > false because the full-duplex bit isn't set in the non-existent link > > partner advertisement word (which doesn't exist because in-band > > autoneg is disabled!) > > > > The test in phylink_mii_c22_pcs_decode_state() is supposed to catch > > this state, but since we converted PCS to use neg_mode, testing the > > Autoneg in the local advertisement is no longer sufficient - we need > > to be looking at the neg_mode, which currently isn't provided. > > > > We need to provide this via the .pcs_get_state() method, and this > > will require modifying all PCS implementations to add the extra > > argument to this method. > > > > Patch 1 uses the PCS neg_mode in phylink_mac_pcs_get_state() to correct > > the now obsolute usage of the Autoneg bit in the advertisement. > > > > Patch 2 passes neg_mode into the .pcs_get_state() method, and updates > > all users. > > > > Patch 3 adds neg_mode as an argument to the various clause 22 state > > decoder functions in phylink, modifying drivers to pass the neg_mode > > through. > > > > Patch 4 makes use of phylink_mii_c22_pcs_decode_state() rather than > > using the Autoneg bit in the advertising field. > > > > Patch 5 may be required for Eric's case - it ensures that we report > > the correct state for interface types that we support only one set > > of modes for when autoneg is disabled. > > > > Changes in v2: > > - Add test for NULL pcs in patch 1 > > > > I haven't added Eric's t-b because I used a different fix in patch 1. > > So I tested this V2 patch and with the first link up command, I get the > link up which is functional end to end. > > Tested-by: Eric Woudstra <ericwouds@gmail.com> > > PS, FYI: I do however still have the difference in phylink_mac_config(). > > At first the link is up with (before phy attached): > phylink_mac_config: mode=inband/2500base-x > > When the phy is attached, phylink_mac_config() is not called, but we > have functional link. > > When I do: ethtool -s eth1 advertise 0x28, then: > phylink_mac_config: mode=inband/sgmii > > And back again: ethtool -s eth1 advertise 0x800000000028, then: > phylink_mac_config: mode=phy/2500base-x > > I can share more log entries if needed. I'm not worried - there's a sixth patch that I need to send once this set of five are merged that will fix the above weirdness. Thanks for testing.