mbox series

[net-next,v9,0/4] net: dsa: Add Airoha AN8855 support

Message ID 20241205145142.29278-1-ansuelsmth@gmail.com (mailing list archive)
Headers show
Series net: dsa: Add Airoha AN8855 support | expand

Message

Christian Marangi Dec. 5, 2024, 2:51 p.m. UTC
This small series add the initial support for the Airoha AN8855 Switch.

It's a 5 port Gigabit Switch with SGMII/HSGMII upstream port.

This is starting to get in the wild and there are already some router
having this switch chip.

It's conceptually similar to mediatek switch but register and bits
are different. And there is that massive Hell that is the PCS
configuration.
Saddly for that part we have absolutely NO documentation currently.

There is this special thing where PHY needs to be calibrated with values
from the switch efuse. (the thing have a whole cpu timer and MCU)

From v8 Driver is now evaluated with Kernel selftest scripts for DSA:

Output local_termination.sh
TEST: lan2: Unicast IPv4 to primary MAC address                     [ OK ]
TEST: lan2: Unicast IPv4 to macvlan MAC address                     [ OK ]
TEST: lan2: Unicast IPv4 to unknown MAC address                     [ OK ]
TEST: lan2: Unicast IPv4 to unknown MAC address, promisc            [ OK ]
TEST: lan2: Unicast IPv4 to unknown MAC address, allmulti           [ OK ]
TEST: lan2: Multicast IPv4 to joined group                          [ OK ]
TEST: lan2: Multicast IPv4 to unknown group                         [XFAIL]
        reception succeeded, but should have failed
TEST: lan2: Multicast IPv4 to unknown group, promisc                [ OK ]
TEST: lan2: Multicast IPv4 to unknown group, allmulti               [ OK ]
TEST: lan2: Multicast IPv6 to joined group                          [ OK ]
TEST: lan2: Multicast IPv6 to unknown group                         [XFAIL]
        reception succeeded, but should have failed
TEST: lan2: Multicast IPv6 to unknown group, promisc                [ OK ]
TEST: lan2: Multicast IPv6 to unknown group, allmulti               [ OK ]
TEST: lan2: 1588v2 over L2 transport, Sync                          [ OK ]
TEST: lan2: 1588v2 over L2 transport, Follow-Up                     [ OK ]
TEST: lan2: 1588v2 over L2 transport, Peer Delay Request            [ OK ]
TEST: lan2: 1588v2 over IPv4, Sync                                  [FAIL]
        reception failed
TEST: lan2: 1588v2 over IPv4, Follow-Up                             [FAIL]
        reception failed
TEST: lan2: 1588v2 over IPv4, Peer Delay Request                    [FAIL]
        reception failed
TEST: lan2: 1588v2 over IPv6, Sync                                  [FAIL]
        reception failed
TEST: lan2: 1588v2 over IPv6, Follow-Up                             [FAIL]
        reception failed
TEST: lan2: 1588v2 over IPv6, Peer Delay Request                    [FAIL]
        reception failed
TEST: vlan_filtering=0 bridge: Unicast IPv4 to primary MAC address   [ OK ]
TEST: vlan_filtering=0 bridge: Unicast IPv4 to macvlan MAC address   [ OK ]
TEST: vlan_filtering=0 bridge: Unicast IPv4 to unknown MAC address   [ OK ]
TEST: vlan_filtering=0 bridge: Unicast IPv4 to unknown MAC address, promisc   [ OK ]
TEST: vlan_filtering=0 bridge: Unicast IPv4 to unknown MAC address, allmulti   [ OK ]
TEST: vlan_filtering=0 bridge: Multicast IPv4 to joined group       [ OK ]
TEST: vlan_filtering=0 bridge: Multicast IPv4 to unknown group      [XFAIL]
        reception succeeded, but should have failed
TEST: vlan_filtering=0 bridge: Multicast IPv4 to unknown group, promisc   [ OK ]
TEST: vlan_filtering=0 bridge: Multicast IPv4 to unknown group, allmulti   [ OK ]
TEST: vlan_filtering=0 bridge: Multicast IPv6 to joined group       [ OK ]
TEST: vlan_filtering=0 bridge: Multicast IPv6 to unknown group      [XFAIL]
        reception succeeded, but should have failed
TEST: vlan_filtering=0 bridge: Multicast IPv6 to unknown group, promisc   [ OK ]
TEST: vlan_filtering=0 bridge: Multicast IPv6 to unknown group, allmulti   [ OK ]
TEST: vlan_filtering=1 bridge: Unicast IPv4 to primary MAC address   [ OK ]
TEST: vlan_filtering=1 bridge: Unicast IPv4 to macvlan MAC address   [ OK ]
TEST: vlan_filtering=1 bridge: Unicast IPv4 to unknown MAC address   [ OK ]
TEST: vlan_filtering=1 bridge: Unicast IPv4 to unknown MAC address, promisc   [ OK ]
TEST: vlan_filtering=1 bridge: Unicast IPv4 to unknown MAC address, allmulti   [ OK ]
TEST: vlan_filtering=1 bridge: Multicast IPv4 to joined group       [ OK ]
TEST: vlan_filtering=1 bridge: Multicast IPv4 to unknown group      [XFAIL]
        reception succeeded, but should have failed
TEST: vlan_filtering=1 bridge: Multicast IPv4 to unknown group, promisc   [ OK ]
TEST: vlan_filtering=1 bridge: Multicast IPv4 to unknown group, allmulti   [ OK ]
TEST: vlan_filtering=1 bridge: Multicast IPv6 to joined group       [ OK ]
TEST: vlan_filtering=1 bridge: Multicast IPv6 to unknown group      [XFAIL]
        reception succeeded, but should have failed
TEST: vlan_filtering=1 bridge: Multicast IPv6 to unknown group, promisc   [ OK ]
TEST: vlan_filtering=1 bridge: Multicast IPv6 to unknown group, allmulti   [ OK ]
TEST: VLAN upper: Unicast IPv4 to primary MAC address               [ OK ]
TEST: VLAN upper: Unicast IPv4 to macvlan MAC address               [ OK ]
TEST: VLAN upper: Unicast IPv4 to unknown MAC address               [ OK ]
TEST: VLAN upper: Unicast IPv4 to unknown MAC address, promisc      [ OK ]
TEST: VLAN upper: Unicast IPv4 to unknown MAC address, allmulti     [ OK ]
TEST: VLAN upper: Multicast IPv4 to joined group                    [ OK ]
TEST: VLAN upper: Multicast IPv4 to unknown group                   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN upper: Multicast IPv4 to unknown group, promisc          [ OK ]
TEST: VLAN upper: Multicast IPv4 to unknown group, allmulti         [ OK ]
TEST: VLAN upper: Multicast IPv6 to joined group                    [ OK ]
TEST: VLAN upper: Multicast IPv6 to unknown group                   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN upper: Multicast IPv6 to unknown group, promisc          [ OK ]
TEST: VLAN upper: Multicast IPv6 to unknown group, allmulti         [ OK ]
TEST: VLAN upper: 1588v2 over L2 transport, Sync                    [ OK ]
TEST: VLAN upper: 1588v2 over L2 transport, Follow-Up               [FAIL]
        reception failed
TEST: VLAN upper: 1588v2 over L2 transport, Peer Delay Request      [ OK ]
TEST: VLAN upper: 1588v2 over IPv4, Sync                            [FAIL]
        reception failed
;TEST: VLAN upper: 1588v2 over IPv4, Follow-Up                       [FAIL]
        reception failed
TEST: VLAN upper: 1588v2 over IPv4, Peer Delay Request              [FAIL]
        reception failed
TEST: VLAN upper: 1588v2 over IPv6, Sync                            [FAIL]
        reception failed
TEST: VLAN upper: 1588v2 over IPv6, Follow-Up                       [FAIL]
        reception failed
TEST: VLAN upper: 1588v2 over IPv6, Peer Delay Request              [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=0 bridged port: Unicast IPv4 to primary MAC address   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Unicast IPv4 to macvlan MAC address   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Unicast IPv4 to unknown MAC address   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Unicast IPv4 to unknown MAC address, promisc   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Unicast IPv4 to unknown MAC address, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Multicast IPv4 to joined group   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Multicast IPv4 to unknown group   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=0 bridged port: Multicast IPv4 to unknown group, promisc   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Multicast IPv4 to unknown group, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Multicast IPv6 to joined group   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Multicast IPv6 to unknown group   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=0 bridged port: Multicast IPv6 to unknown group, promisc   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: Multicast IPv6 to unknown group, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over L2 transport, Sync   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over L2 transport, Follow-Up   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over L2 transport, Peer Delay Request   [ OK ]
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over IPv4, Sync   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over IPv4, Follow-Up   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over IPv4, Peer Delay Request   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over IPv6, Sync   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over IPv6, Follow-Up   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=0 bridged port: 1588v2 over IPv6, Peer Delay Request   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=1 bridged port: Unicast IPv4 to primary MAC address   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: Unicast IPv4 to macvlan MAC address   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: Unicast IPv4 to unknown MAC address   [FAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=1 bridged port: Unicast IPv4 to unknown MAC address, promisc   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: Unicast IPv4 to unknown MAC address, allmulti   [FAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=1 bridged port: Multicast IPv4 to joined group   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: Multicast IPv4 to unknown group   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=1 bridged port: Multicast IPv4 to unknown group, promisc   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: Multicast IPv4 to unknown group, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: Multicast IPv6 to joined group   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: Multicast IPv6 to unknown group   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=1 bridged port: Multicast IPv6 to unknown group, promisc   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: Multicast IPv6 to unknown group, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over L2 transport, Sync   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over L2 transport, Follow-Up   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over L2 transport, Peer Delay Request   [ OK ]
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over IPv4, Sync   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over IPv4, Follow-Up   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over IPv4, Peer Delay Request   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over IPv6, Sync   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over IPv6, Follow-Up   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=1 bridged port: 1588v2 over IPv6, Peer Delay Request   [FAIL]
        reception failed
TEST: VLAN over vlan_filtering=0 bridge: Unicast IPv4 to primary MAC address   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Unicast IPv4 to macvlan MAC address   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Unicast IPv4 to unknown MAC address   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Unicast IPv4 to unknown MAC address, promisc   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Unicast IPv4 to unknown MAC address, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Multicast IPv4 to joined group   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Multicast IPv4 to unknown group   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=0 bridge: Multicast IPv4 to unknown group, promisc   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Multicast IPv4 to unknown group, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Multicast IPv6 to joined group   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Multicast IPv6 to unknown group   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=0 bridge: Multicast IPv6 to unknown group, promisc   [ OK ]
TEST: VLAN over vlan_filtering=0 bridge: Multicast IPv6 to unknown group, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Unicast IPv4 to primary MAC address   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Unicast IPv4 to macvlan MAC address   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Unicast IPv4 to unknown MAC address   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Unicast IPv4 to unknown MAC address, promisc   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Unicast IPv4 to unknown MAC address, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Multicast IPv4 to joined group   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Multicast IPv4 to unknown group   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=1 bridge: Multicast IPv4 to unknown group, promisc   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Multicast IPv4 to unknown group, allmulti   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Multicast IPv6 to joined group   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Multicast IPv6 to unknown group   [XFAIL]
        reception succeeded, but should have failed
TEST: VLAN over vlan_filtering=1 bridge: Multicast IPv6 to unknown group, promisc   [ OK ]
TEST: VLAN over vlan_filtering=1 bridge: Multicast IPv6 to unknown group, allmulti   [ OK ]

Output bridge_vlan_unaware.sh
TEST: ping                                                          [ OK ]
TEST: ping6                                                         [ OK ]
TEST: FDB learning                                                  [ OK ]
TEST: Unknown unicast flood                                         [ OK ]
TEST: Unregistered multicast flood                                  [ OK ]

Output bridge_vlan_aware.sh
TEST: ping                                                          [ OK ]
TEST: ping6                                                         [ OK ]
TEST: FDB learning                                                  [ OK ]
TEST: Unknown unicast flood                                         [ OK ]
TEST: Unregistered multicast flood                                  [ OK ]
INFO: Add and delete a VLAN on bridge port lan2
TEST: ping                                                          [ OK ]
TEST: ping6                                                         [ OK ]
TEST: Externally learned FDB entry - ageing & roaming               [ OK ]
TEST: FDB entry in PVID for VLAN-tagged with other TPID             [FAIL]
        FDB entry was not learned when it should
TEST: Reception of VLAN with other TPID as untagged                 [FAIL]
        Packet was not forwarded when it should
TEST: Reception of VLAN with other TPID as untagged (no PVID)       [FAIL]
        Packet was forwarded when should not

Changes v9:
- Error out on using 5G speed as currently not supported
- Add missing MAC_2500FD in phylink mac_capabilities
- Add comment and improve if condition for an8855_phylink_mac_config
Changes v8:
- Add port Fast Age support
- Add support for Port Isolation
- Use correct register for Learning Disable
- Add support for Ageing Time OP
- Set default PVID to 0 by default
- Add mdb OPs
- Add port change MTU
- Fix support for Upper VLAN
Changes v7:
- Fix devm_dsa_register_switch wrong export symbol
Changes v6:
- Drop standard MIB and handle with ethtool OPs (as requested by Jakub)
- Cosmetic: use bool instead of 0 or 1
Changes v5:
- Add devm_dsa_register_switch() patch
- Add Reviewed-by tag for DT patch
Changes v4:
- Set regmap readable_table static (mute compilation warning)
- Add support for port_bridge flags (LEARNING, FLOOD)
- Reset fdb struct in fdb_dump
- Drop support_asym_pause in port_enable
- Add define for get_phy_flags
- Fix bug for port not inititially part of a bridge
  (in an8855_setup the port matrix was always cleared but
   the CPU port was never initially added)
- Disable learning and flood for user port by default
- Set CPU port to flood and learning by default
- Correctly AND force duplex and flow control in an8855_phylink_mac_link_up
- Drop RGMII from pcs_config
- Check ret in "Disable AN if not in autoneg"
- Use devm_mutex_init
- Fix typo for AN8855_PORT_CHECK_MODE
- Better define AN8855_STP_LISTENING = AN8855_STP_BLOCKING
- Fix typo in AN8855_PHY_EN_DOWN_SHIFT
- Use paged helper for PHY
- Skip calibration in config_init if priv not defined
Changes v3:
- Out of RFC
- Switch PHY code to select_page API
- Better describe masks and bits in PHY driver for ADC register
- Drop raw values and use define for mii read/write
- Switch to absolute PHY address
- Replace raw values with mask and bits for pcs_config
- Fix typo for ext-surge property name
- Drop support for relocating Switch base PHY address on the bus
Changes v2:
- Drop mutex guard patch
- Drop guard usage in DSA driver
- Use __mdiobus_write/read
- Check return condition and return errors for mii read/write
- Fix wrong logic for EEE
- Fix link_down (don't force link down with autoneg)
- Fix forcing speed on sgmii autoneg
- Better document link speed for sgmii reg
- Use standard define for sgmii reg
- Imlement nvmem support to expose switch EFUSE
- Rework PHY calibration with the use of NVMEM producer/consumer
- Update DT with new NVMEM property
- Move aneg validation for 2500-basex in pcs_config
- Move r50Ohm table and function to PHY driver

Christian Marangi (4):
  net: dsa: add devm_dsa_register_switch()
  dt-bindings: net: dsa: Add Airoha AN8855 Gigabit Switch documentation
  net: dsa: Add Airoha AN8855 5-Port Gigabit DSA Switch driver
  net: phy: Add Airoha AN8855 Internal Switch Gigabit PHY

 .../bindings/net/dsa/airoha,an8855.yaml       |  242 ++
 MAINTAINERS                                   |   11 +
 drivers/net/dsa/Kconfig                       |    9 +
 drivers/net/dsa/Makefile                      |    1 +
 drivers/net/dsa/an8855.c                      | 2671 +++++++++++++++++
 drivers/net/dsa/an8855.h                      |  810 +++++
 drivers/net/phy/Kconfig                       |    5 +
 drivers/net/phy/Makefile                      |    1 +
 drivers/net/phy/air_an8855.c                  |  267 ++
 include/net/dsa.h                             |    1 +
 net/dsa/dsa.c                                 |   19 +
 11 files changed, 4037 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/dsa/airoha,an8855.yaml
 create mode 100644 drivers/net/dsa/an8855.c
 create mode 100644 drivers/net/dsa/an8855.h
 create mode 100644 drivers/net/phy/air_an8855.c

Comments

Russell King (Oracle) Dec. 5, 2024, 4:12 p.m. UTC | #1
hi Christian,

On Thu, Dec 05, 2024 at 03:51:33PM +0100, Christian Marangi wrote:
> diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
> index 2d10b4d6cfbb..6b6d0b7bae72 100644
> --- a/drivers/net/dsa/Kconfig
> +++ b/drivers/net/dsa/Kconfig
> @@ -24,6 +24,15 @@ config NET_DSA_LOOP
>  	  This enables support for a fake mock-up switch chip which
>  	  exercises the DSA APIs.
>  
> +

Niggle - no need for the extra blank line.

> +static int an8855_set_mac_eee(struct dsa_switch *ds, int port,
> +			      struct ethtool_keee *eee)
> +{
> +	struct an8855_priv *priv = ds->priv;
> +	u32 reg;
> +	int ret;
> +
> +	if (eee->eee_enabled) {
> +		ret = regmap_read(priv->regmap, AN8855_PMCR_P(port), &reg);
> +		if (ret)
> +			return ret;
> +		/* Force enable EEE if force mode and LINK */
> +		if (reg & AN8855_PMCR_FORCE_MODE &&
> +		    reg & AN8855_PMCR_FORCE_LNK) {
> +			switch (reg & AN8855_PMCR_FORCE_SPEED) {
> +			case AN8855_PMCR_FORCE_SPEED_1000:
> +				reg |= AN8855_PMCR_FORCE_EEE1G;
> +				break;
> +			case AN8855_PMCR_FORCE_SPEED_100:
> +				reg |= AN8855_PMCR_FORCE_EEE100;
> +				break;
> +			default:
> +				break;
> +			}

Does this forcing force EEE to be enabled for the port, irrespective
of the negotiation?

> +			ret = regmap_write(priv->regmap, AN8855_PMCR_P(port), reg);
> +			if (ret)
> +				return ret;
> +		}
> +		ret = regmap_update_bits(priv->regmap, AN8855_PMEEECR_P(port),
> +					 AN8855_LPI_MODE_EN,
> +					 eee->tx_lpi_enabled ? AN8855_LPI_MODE_EN : 0);
> +		if (ret)
> +			return ret;
> +	} else {
> +		ret = regmap_clear_bits(priv->regmap, AN8855_PMCR_P(port),
> +					AN8855_PMCR_FORCE_EEE1G |
> +					AN8855_PMCR_FORCE_EEE100);
> +		if (ret)
> +			return ret;
> +
> +		ret = regmap_clear_bits(priv->regmap, AN8855_PMEEECR_P(port),
> +					AN8855_LPI_MODE_EN);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	return 0;
> +}
> +
> +static int an8855_get_mac_eee(struct dsa_switch *ds, int port,
> +			      struct ethtool_keee *eee)
> +{
> +	struct an8855_priv *priv = ds->priv;
> +	u32 reg;
> +	int ret;
> +
> +	ret = regmap_read(priv->regmap, AN8855_PMEEECR_P(port), &reg);
> +	if (ret)
> +		return ret;
> +	eee->tx_lpi_enabled = reg & AN8855_LPI_MODE_EN;
> +
> +	ret = regmap_read(priv->regmap, AN8855_CKGCR, &reg);
> +	if (ret)
> +		return ret;
> +	/* Global LPI TXIDLE Threshold, default 60ms (unit 2us) */
> +	eee->tx_lpi_timer = FIELD_GET(AN8855_LPI_TXIDLE_THD_MASK, reg) / 500;

There is no point setting tx_lpi_enabled and tx_lpi_timer, as phylib
will immediately overwrite then in its phy_ethtool_get_eee() method.
In fact, there's no point to the DSA get_mac_eee() method.

> +
> +	ret = regmap_read(priv->regmap, AN8855_PMSR_P(port), &reg);
> +	if (ret)
> +		return ret;

What's the purpose of this read?

> +
> +	return 0;
> +}

Overall, I would suggest not implementing EEE just yet - I sent out a
22 patch RFC patch set last week which implements EEE support in
phylink, and I have another bunch of patches that clean up DSA that
I didn't send, which includes getting rid of the DSA get_mac_eee()
method.

Now that the bulk of the phylink in-band changes have been merged, my
plan is to work through getting the RFC patch set merged - I've sent
out the first four patches today for final review and merging.

I stated some questions in the RFC cover message that everyone ignored,
maybe you could look at that and give your views on the points I raised
there?

> +static struct phylink_pcs *
> +an8855_phylink_mac_select_pcs(struct phylink_config *config,
> +			      phy_interface_t interface)
> +{
> +	struct dsa_port *dp = dsa_phylink_to_port(config);
> +	struct an8855_priv *priv = dp->ds->priv;
> +
> +	switch (interface) {
> +	case PHY_INTERFACE_MODE_SGMII:
> +	case PHY_INTERFACE_MODE_2500BASEX:
> +		return &priv->pcs;
> +	default:
> +		return NULL;
> +	}

Great, exactly what I like to see in mac_select_pcs()!

> +}
> +
> +static void
> +an8855_phylink_mac_config(struct phylink_config *config, unsigned int mode,
> +			  const struct phylink_link_state *state)
> +{
> +	struct dsa_port *dp = dsa_phylink_to_port(config);
> +	struct dsa_switch *ds = dp->ds;
> +	struct an8855_priv *priv;
> +	int port = dp->index;
> +
> +	priv = ds->priv;
> +
> +	/* Nothing to configure for internal ports */
> +	if (port != 5)
> +		return;
> +
> +	regmap_update_bits(priv->regmap, AN8855_PMCR_P(port),
> +			   AN8855_PMCR_IFG_XMIT | AN8855_PMCR_MAC_MODE |
> +			   AN8855_PMCR_BACKOFF_EN | AN8855_PMCR_BACKPR_EN,
> +			   FIELD_PREP(AN8855_PMCR_IFG_XMIT, 0x1) |
> +			   AN8855_PMCR_MAC_MODE | AN8855_PMCR_BACKOFF_EN |
> +			   AN8855_PMCR_BACKPR_EN);
> +}

This looks much better, thanks.

> +
> +static void an8855_phylink_get_caps(struct dsa_switch *ds, int port,
> +				    struct phylink_config *config)
> +{
> +	switch (port) {
> +	case 0:
> +	case 1:
> +	case 2:
> +	case 3:
> +	case 4:
> +		__set_bit(PHY_INTERFACE_MODE_GMII,
> +			  config->supported_interfaces);
> +		__set_bit(PHY_INTERFACE_MODE_INTERNAL,
> +			  config->supported_interfaces);
> +		break;
> +	case 5:
> +		phy_interface_set_rgmii(config->supported_interfaces);
> +		__set_bit(PHY_INTERFACE_MODE_SGMII,
> +			  config->supported_interfaces);
> +		__set_bit(PHY_INTERFACE_MODE_2500BASEX,
> +			  config->supported_interfaces);
> +		break;
> +	}
> +
> +	config->mac_capabilities = MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
> +				   MAC_10 | MAC_100 | MAC_1000FD | MAC_2500FD;
> +}
> +
> +static void
> +an8855_phylink_mac_link_down(struct phylink_config *config, unsigned int mode,
> +			     phy_interface_t interface)
> +{
> +	struct dsa_port *dp = dsa_phylink_to_port(config);
> +	struct an8855_priv *priv = dp->ds->priv;
> +
> +	/* With autoneg just disable TX/RX else also force link down */
> +	if (phylink_autoneg_inband(mode)) {
> +		regmap_clear_bits(priv->regmap, AN8855_PMCR_P(dp->index),
> +				  AN8855_PMCR_TX_EN | AN8855_PMCR_RX_EN);
> +	} else {
> +		regmap_update_bits(priv->regmap, AN8855_PMCR_P(dp->index),
> +				   AN8855_PMCR_TX_EN | AN8855_PMCR_RX_EN |
> +				   AN8855_PMCR_FORCE_MODE | AN8855_PMCR_FORCE_LNK,
> +				   AN8855_PMCR_FORCE_MODE);
> +	}
> +}
> +
> +static void
> +an8855_phylink_mac_link_up(struct phylink_config *config,
> +			   struct phy_device *phydev, unsigned int mode,
> +			   phy_interface_t interface, int speed, int duplex,
> +			   bool tx_pause, bool rx_pause)
> +{
> +	struct dsa_port *dp = dsa_phylink_to_port(config);
> +	struct an8855_priv *priv = dp->ds->priv;
> +	int port = dp->index;
> +	u32 reg;
> +
> +	reg = regmap_read(priv->regmap, AN8855_PMCR_P(port), &reg);
> +	if (phylink_autoneg_inband(mode)) {
> +		reg &= ~AN8855_PMCR_FORCE_MODE;
> +	} else {
> +		reg |= AN8855_PMCR_FORCE_MODE | AN8855_PMCR_FORCE_LNK;
> +
> +		reg &= ~AN8855_PMCR_FORCE_SPEED;
> +		switch (speed) {
> +		case SPEED_10:
> +			reg |= AN8855_PMCR_FORCE_SPEED_10;
> +			break;
> +		case SPEED_100:
> +			reg |= AN8855_PMCR_FORCE_SPEED_100;
> +			break;
> +		case SPEED_1000:
> +			reg |= AN8855_PMCR_FORCE_SPEED_1000;
> +			break;
> +		case SPEED_2500:
> +			reg |= AN8855_PMCR_FORCE_SPEED_2500;
> +			break;
> +		case SPEED_5000:
> +			dev_err(priv->dev, "Missing support for 5G speed. Aborting...\n");
> +			return;
> +		}
> +
> +		reg &= ~AN8855_PMCR_FORCE_FDX;
> +		if (duplex == DUPLEX_FULL)
> +			reg |= AN8855_PMCR_FORCE_FDX;
> +
> +		reg &= ~AN8855_PMCR_RX_FC_EN;
> +		if (rx_pause || dsa_port_is_cpu(dp))
> +			reg |= AN8855_PMCR_RX_FC_EN;
> +
> +		reg &= ~AN8855_PMCR_TX_FC_EN;
> +		if (rx_pause || dsa_port_is_cpu(dp))
> +			reg |= AN8855_PMCR_TX_FC_EN;
> +
> +		/* Disable any EEE options */
> +		reg &= ~(AN8855_PMCR_FORCE_EEE5G | AN8855_PMCR_FORCE_EEE2P5G |
> +			 AN8855_PMCR_FORCE_EEE1G | AN8855_PMCR_FORCE_EEE100);
> +	}
> +
> +	reg |= AN8855_PMCR_TX_EN | AN8855_PMCR_RX_EN;
> +
> +	regmap_write(priv->regmap, AN8855_PMCR_P(port), reg);
> +}

All the above looks fine to me.

> +
> +static void an8855_pcs_get_state(struct phylink_pcs *pcs,
> +				 struct phylink_link_state *state)
> +{
> +	struct an8855_priv *priv = container_of(pcs, struct an8855_priv, pcs);
> +	u32 val;
> +	int ret;
> +
> +	ret = regmap_read(priv->regmap, AN8855_PMSR_P(AN8855_CPU_PORT), &val);
> +	if (ret < 0) {
> +		state->link = false;
> +		return;
> +	}
> +
> +	state->link = !!(val & AN8855_PMSR_LNK);
> +	state->an_complete = state->link;
> +	state->duplex = (val & AN8855_PMSR_DPX) ? DUPLEX_FULL :
> +						  DUPLEX_HALF;
> +
> +	switch (val & AN8855_PMSR_SPEED) {
> +	case AN8855_PMSR_SPEED_10:
> +		state->speed = SPEED_10;
> +		break;
> +	case AN8855_PMSR_SPEED_100:
> +		state->speed = SPEED_100;
> +		break;
> +	case AN8855_PMSR_SPEED_1000:
> +		state->speed = SPEED_1000;
> +		break;
> +	case AN8855_PMSR_SPEED_2500:
> +		state->speed = SPEED_2500;
> +		break;
> +	case AN8855_PMSR_SPEED_5000:
> +		dev_err(priv->dev, "Missing support for 5G speed. Setting Unknown.\n");
> +		fallthrough;
> +	default:
> +		state->speed = SPEED_UNKNOWN;
> +		break;
> +	}
> +
> +	if (val & AN8855_PMSR_RX_FC)
> +		state->pause |= MLO_PAUSE_RX;
> +	if (val & AN8855_PMSR_TX_FC)
> +		state->pause |= MLO_PAUSE_TX;
> +}
> +
> +static int an8855_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
> +			     phy_interface_t interface,
> +			     const unsigned long *advertising,
> +			     bool permit_pause_to_mac)
> +{
> +	struct an8855_priv *priv = container_of(pcs, struct an8855_priv, pcs);
> +	u32 val;
> +	int ret;
> +
> +	switch (interface) {
> +	case PHY_INTERFACE_MODE_SGMII:
> +		break;
> +	case PHY_INTERFACE_MODE_2500BASEX:
> +		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
> +			dev_err(priv->dev, "in-band negotiation unsupported");
> +			return -EINVAL;
> +		}
> +		break;

Now that we have the phylink in-band changes merged, along with a new
PCS .pcs_inband_caps() method, please implement this method to indicate
that in-band is not supported in 2500BASE-X mode. Phylink will issue a
warning if it attempts to go against that (particularly when e.g. the
PHY demands in-band but the PCS doesn't support it - a case that likely
will never work in practice.)

> +	default:
> +		return -EINVAL;
> +	}
> +
> +	/*                   !!! WELCOME TO HELL !!!                   */
> +
> +	/* TX FIR - improve TX EYE */
> +	ret = regmap_update_bits(priv->regmap, AN8855_INTF_CTRL_10,
> +				 AN8855_RG_DA_QP_TX_FIR_C2_SEL |
> +				 AN8855_RG_DA_QP_TX_FIR_C2_FORCE |
> +				 AN8855_RG_DA_QP_TX_FIR_C1_SEL |
> +				 AN8855_RG_DA_QP_TX_FIR_C1_FORCE,
> +				 AN8855_RG_DA_QP_TX_FIR_C2_SEL |
> +				 FIELD_PREP(AN8855_RG_DA_QP_TX_FIR_C2_FORCE, 0x4) |
> +				 AN8855_RG_DA_QP_TX_FIR_C1_SEL |
> +				 FIELD_PREP(AN8855_RG_DA_QP_TX_FIR_C1_FORCE, 0x0));
> +	if (ret)
> +		return ret;
> +
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0x0;
> +	else
> +		val = 0xd;
> +	ret = regmap_update_bits(priv->regmap, AN8855_INTF_CTRL_11,
> +				 AN8855_RG_DA_QP_TX_FIR_C0B_SEL |
> +				 AN8855_RG_DA_QP_TX_FIR_C0B_FORCE,
> +				 AN8855_RG_DA_QP_TX_FIR_C0B_SEL |
> +				 FIELD_PREP(AN8855_RG_DA_QP_TX_FIR_C0B_FORCE, val));
> +	if (ret)
> +		return ret;
> +
> +	/* RX CDR - improve RX Jitter Tolerance */
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0x5;
> +	else
> +		val = 0x6;
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_LPF_BOT_LIM,
> +				 AN8855_RG_QP_CDR_LPF_KP_GAIN |
> +				 AN8855_RG_QP_CDR_LPF_KI_GAIN,
> +				 FIELD_PREP(AN8855_RG_QP_CDR_LPF_KP_GAIN, val) |
> +				 FIELD_PREP(AN8855_RG_QP_CDR_LPF_KI_GAIN, val));
> +	if (ret)
> +		return ret;
> +
> +	/* PLL */
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0x1;
> +	else
> +		val = 0x0;
> +	ret = regmap_update_bits(priv->regmap, AN8855_QP_DIG_MODE_CTRL_1,
> +				 AN8855_RG_TPHY_SPEED,
> +				 FIELD_PREP(AN8855_RG_TPHY_SPEED, val));
> +	if (ret)
> +		return ret;
> +
> +	/* PLL - LPF */
> +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> +				 AN8855_RG_DA_QP_PLL_RICO_SEL_INTF |
> +				 AN8855_RG_DA_QP_PLL_FBKSEL_INTF |
> +				 AN8855_RG_DA_QP_PLL_BR_INTF |
> +				 AN8855_RG_DA_QP_PLL_BPD_INTF |
> +				 AN8855_RG_DA_QP_PLL_BPA_INTF |
> +				 AN8855_RG_DA_QP_PLL_BC_INTF,
> +				 AN8855_RG_DA_QP_PLL_RICO_SEL_INTF |
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_FBKSEL_INTF, 0x0) |
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_BR_INTF, 0x3) |
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_BPD_INTF, 0x0) |
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_BPA_INTF, 0x5) |
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_BC_INTF, 0x1));
> +	if (ret)
> +		return ret;
> +
> +	/* PLL - ICO */
> +	ret = regmap_set_bits(priv->regmap, AN8855_PLL_CTRL_4,
> +			      AN8855_RG_DA_QP_PLL_ICOLP_EN_INTF);
> +	if (ret)
> +		return ret;
> +	ret = regmap_clear_bits(priv->regmap, AN8855_PLL_CTRL_2,
> +				AN8855_RG_DA_QP_PLL_ICOIQ_EN_INTF);
> +	if (ret)
> +		return ret;
> +
> +	/* PLL - CHP */
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0x6;
> +	else
> +		val = 0x4;
> +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> +				 AN8855_RG_DA_QP_PLL_IR_INTF,
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_IR_INTF, val));
> +	if (ret)
> +		return ret;
> +
> +	/* PLL - PFD */
> +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> +				 AN8855_RG_DA_QP_PLL_PFD_OFFSET_EN_INTRF |
> +				 AN8855_RG_DA_QP_PLL_PFD_OFFSET_INTF |
> +				 AN8855_RG_DA_QP_PLL_KBAND_PREDIV_INTF,
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_PFD_OFFSET_INTF, 0x1) |
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_KBAND_PREDIV_INTF, 0x1));
> +	if (ret)
> +		return ret;
> +
> +	/* PLL - POSTDIV */
> +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> +				 AN8855_RG_DA_QP_PLL_POSTDIV_EN_INTF |
> +				 AN8855_RG_DA_QP_PLL_PHY_CK_EN_INTF |
> +				 AN8855_RG_DA_QP_PLL_PCK_SEL_INTF,
> +				 AN8855_RG_DA_QP_PLL_PCK_SEL_INTF);
> +	if (ret)
> +		return ret;
> +
> +	/* PLL - SDM */
> +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> +				 AN8855_RG_DA_QP_PLL_SDM_HREN_INTF,
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_SDM_HREN_INTF, 0x0));
> +	if (ret)
> +		return ret;
> +	ret = regmap_clear_bits(priv->regmap, AN8855_PLL_CTRL_2,
> +				AN8855_RG_DA_QP_PLL_SDM_IFM_INTF);
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_SS_LCPLL_PWCTL_SETTING_2,
> +				 AN8855_RG_NCPO_ANA_MSB,
> +				 FIELD_PREP(AN8855_RG_NCPO_ANA_MSB, 0x1));
> +	if (ret)
> +		return ret;
> +
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0x7a000000;
> +	else
> +		val = 0x48000000;
> +	ret = regmap_write(priv->regmap, AN8855_SS_LCPLL_TDC_FLT_2,
> +			   FIELD_PREP(AN8855_RG_LCPLL_NCPO_VALUE, val));
> +	if (ret)
> +		return ret;
> +	ret = regmap_write(priv->regmap, AN8855_SS_LCPLL_TDC_PCW_1,
> +			   FIELD_PREP(AN8855_RG_LCPLL_PON_HRDDS_PCW_NCPO_GPON, val));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_clear_bits(priv->regmap, AN8855_SS_LCPLL_TDC_FLT_5,
> +				AN8855_RG_LCPLL_NCPO_CHG);
> +	if (ret)
> +		return ret;
> +	ret = regmap_clear_bits(priv->regmap, AN8855_PLL_CK_CTRL_0,
> +				AN8855_RG_DA_QP_PLL_SDM_DI_EN_INTF);
> +	if (ret)
> +		return ret;
> +
> +	/* PLL - SS */
> +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_3,
> +				 AN8855_RG_DA_QP_PLL_SSC_DELTA_INTF,
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_SSC_DELTA_INTF, 0x0));
> +	if (ret)
> +		return ret;
> +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_4,
> +				 AN8855_RG_DA_QP_PLL_SSC_DIR_DLY_INTF,
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_SSC_DIR_DLY_INTF, 0x0));
> +	if (ret)
> +		return ret;
> +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_3,
> +				 AN8855_RG_DA_QP_PLL_SSC_PERIOD_INTF,
> +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_SSC_PERIOD_INTF, 0x0));
> +	if (ret)
> +		return ret;
> +
> +	/* PLL - TDC */
> +	ret = regmap_clear_bits(priv->regmap, AN8855_PLL_CK_CTRL_0,
> +				AN8855_RG_DA_QP_PLL_TDC_TXCK_SEL_INTF);
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_set_bits(priv->regmap, AN8855_RG_QP_PLL_SDM_ORD,
> +			      AN8855_RG_QP_PLL_SSC_TRI_EN);
> +	if (ret)
> +		return ret;
> +	ret = regmap_set_bits(priv->regmap, AN8855_RG_QP_PLL_SDM_ORD,
> +			      AN8855_RG_QP_PLL_SSC_PHASE_INI);
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_RX_DAC_EN,
> +				 AN8855_RG_QP_SIGDET_HF,
> +				 FIELD_PREP(AN8855_RG_QP_SIGDET_HF, 0x2));
> +	if (ret)
> +		return ret;
> +
> +	/* TCL Disable (only for Co-SIM) */
> +	ret = regmap_clear_bits(priv->regmap, AN8855_PON_RXFEDIG_CTRL_0,
> +				AN8855_RG_QP_EQ_RX500M_CK_SEL);
> +	if (ret)
> +		return ret;
> +
> +	/* TX Init */
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0x4;
> +	else
> +		val = 0x0;
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_TX_MODE,
> +				 AN8855_RG_QP_TX_RESERVE |
> +				 AN8855_RG_QP_TX_MODE_16B_EN,
> +				 FIELD_PREP(AN8855_RG_QP_TX_RESERVE, val));
> +	if (ret)
> +		return ret;
> +
> +	/* RX Control/Init */
> +	ret = regmap_set_bits(priv->regmap, AN8855_RG_QP_RXAFE_RESERVE,
> +			      AN8855_RG_QP_CDR_PD_10B_EN);
> +	if (ret)
> +		return ret;
> +
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0x1;
> +	else
> +		val = 0x2;
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_LPF_MJV_LIM,
> +				 AN8855_RG_QP_CDR_LPF_RATIO,
> +				 FIELD_PREP(AN8855_RG_QP_CDR_LPF_RATIO, val));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_LPF_SETVALUE,
> +				 AN8855_RG_QP_CDR_PR_BUF_IN_SR |
> +				 AN8855_RG_QP_CDR_PR_BETA_SEL,
> +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_BUF_IN_SR, 0x6) |
> +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_BETA_SEL, 0x1));
> +	if (ret)
> +		return ret;
> +
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0xf;
> +	else
> +		val = 0xc;
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_PR_CKREF_DIV1,
> +				 AN8855_RG_QP_CDR_PR_DAC_BAND,
> +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_DAC_BAND, val));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_PR_KBAND_DIV_PCIE,
> +				 AN8855_RG_QP_CDR_PR_KBAND_PCIE_MODE |
> +				 AN8855_RG_QP_CDR_PR_KBAND_DIV_PCIE_MASK,
> +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_KBAND_DIV_PCIE_MASK, 0x19));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_FORCE_IBANDLPF_R_OFF,
> +				 AN8855_RG_QP_CDR_PHYCK_SEL |
> +				 AN8855_RG_QP_CDR_PHYCK_RSTB |
> +				 AN8855_RG_QP_CDR_PHYCK_DIV,
> +				 FIELD_PREP(AN8855_RG_QP_CDR_PHYCK_SEL, 0x2) |
> +				 FIELD_PREP(AN8855_RG_QP_CDR_PHYCK_DIV, 0x21));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_clear_bits(priv->regmap, AN8855_RG_QP_CDR_PR_KBAND_DIV_PCIE,
> +				AN8855_RG_QP_CDR_PR_XFICK_EN);
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_PR_CKREF_DIV1,
> +				 AN8855_RG_QP_CDR_PR_KBAND_DIV,
> +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_KBAND_DIV, 0x4));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_26,
> +				 AN8855_RG_QP_EQ_RETRAIN_ONLY_EN |
> +				 AN8855_RG_LINK_NE_EN |
> +				 AN8855_RG_LINK_ERRO_EN,
> +				 AN8855_RG_QP_EQ_RETRAIN_ONLY_EN |
> +				 AN8855_RG_LINK_ERRO_EN);
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RX_DLY_0,
> +				 AN8855_RG_QP_RX_SAOSC_EN_H_DLY |
> +				 AN8855_RG_QP_RX_PI_CAL_EN_H_DLY,
> +				 FIELD_PREP(AN8855_RG_QP_RX_SAOSC_EN_H_DLY, 0x3f) |
> +				 FIELD_PREP(AN8855_RG_QP_RX_PI_CAL_EN_H_DLY, 0x6f));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_42,
> +				 AN8855_RG_QP_EQ_EN_DLY,
> +				 FIELD_PREP(AN8855_RG_QP_EQ_EN_DLY, 0x150));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_2,
> +				 AN8855_RG_QP_RX_EQ_EN_H_DLY,
> +				 FIELD_PREP(AN8855_RG_QP_RX_EQ_EN_H_DLY, 0x150));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_PON_RXFEDIG_CTRL_9,
> +				 AN8855_RG_QP_EQ_LEQOSC_DLYCNT,
> +				 FIELD_PREP(AN8855_RG_QP_EQ_LEQOSC_DLYCNT, 0x1));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_8,
> +				 AN8855_RG_DA_QP_SAOSC_DONE_TIME |
> +				 AN8855_RG_DA_QP_LEQOS_EN_TIME,
> +				 FIELD_PREP(AN8855_RG_DA_QP_SAOSC_DONE_TIME, 0x200) |
> +				 FIELD_PREP(AN8855_RG_DA_QP_LEQOS_EN_TIME, 0xfff));
> +	if (ret)
> +		return ret;
> +
> +	/* Frequency meter */
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +		val = 0x10;
> +	else
> +		val = 0x28;
> +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_5,
> +				 AN8855_RG_FREDET_CHK_CYCLE,
> +				 FIELD_PREP(AN8855_RG_FREDET_CHK_CYCLE, val));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_6,
> +				 AN8855_RG_FREDET_GOLDEN_CYCLE,
> +				 FIELD_PREP(AN8855_RG_FREDET_GOLDEN_CYCLE, 0x64));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_7,
> +				 AN8855_RG_FREDET_TOLERATE_CYCLE,
> +				 FIELD_PREP(AN8855_RG_FREDET_TOLERATE_CYCLE, 0x2710));
> +	if (ret)
> +		return ret;
> +
> +	ret = regmap_set_bits(priv->regmap, AN8855_PLL_CTRL_0,
> +			      AN8855_RG_PHYA_AUTO_INIT);
> +	if (ret)
> +		return ret;
> +
> +	/* PCS Init */
> +	if (interface == PHY_INTERFACE_MODE_SGMII &&
> +	    neg_mode == PHYLINK_PCS_NEG_INBAND_DISABLED) {
> +		ret = regmap_clear_bits(priv->regmap, AN8855_QP_DIG_MODE_CTRL_0,
> +					AN8855_RG_SGMII_MODE | AN8855_RG_SGMII_AN_EN);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	ret = regmap_clear_bits(priv->regmap, AN8855_RG_HSGMII_PCS_CTROL_1,
> +				AN8855_RG_TBI_10B_MODE);
> +	if (ret)
> +		return ret;
> +
> +	if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
> +		/* Set AN Ability - Interrupt */
> +		ret = regmap_set_bits(priv->regmap, AN8855_SGMII_REG_AN_FORCE_CL37,
> +				      AN8855_RG_FORCE_AN_DONE);
> +		if (ret)
> +			return ret;
> +
> +		ret = regmap_update_bits(priv->regmap, AN8855_SGMII_REG_AN_13,
> +					 AN8855_SGMII_REMOTE_FAULT_DIS |
> +					 AN8855_SGMII_IF_MODE,
> +					 AN8855_SGMII_REMOTE_FAULT_DIS |
> +					 FIELD_PREP(AN8855_SGMII_IF_MODE, 0xb));
> +		if (ret)
> +			return ret;
> +	}
> +
> +	/* Rate Adaption - GMII path config. */
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX) {
> +		ret = regmap_clear_bits(priv->regmap, AN8855_RATE_ADP_P0_CTRL_0,
> +					AN8855_RG_P0_DIS_MII_MODE);
> +		if (ret)
> +			return ret;
> +	} else {
> +		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
> +			ret = regmap_set_bits(priv->regmap, AN8855_MII_RA_AN_ENABLE,
> +					      AN8855_RG_P0_RA_AN_EN);
> +			if (ret)
> +				return ret;
> +		} else {
> +			ret = regmap_update_bits(priv->regmap, AN8855_RG_AN_SGMII_MODE_FORCE,
> +						 AN8855_RG_FORCE_CUR_SGMII_MODE |
> +						 AN8855_RG_FORCE_CUR_SGMII_SEL,
> +						 AN8855_RG_FORCE_CUR_SGMII_SEL);
> +			if (ret)
> +				return ret;
> +
> +			ret = regmap_clear_bits(priv->regmap, AN8855_RATE_ADP_P0_CTRL_0,
> +						AN8855_RG_P0_MII_RA_RX_EN |
> +						AN8855_RG_P0_MII_RA_TX_EN |
> +						AN8855_RG_P0_MII_RA_RX_MODE |
> +						AN8855_RG_P0_MII_RA_TX_MODE);
> +			if (ret)
> +				return ret;
> +		}
> +
> +		ret = regmap_set_bits(priv->regmap, AN8855_RATE_ADP_P0_CTRL_0,
> +				      AN8855_RG_P0_MII_MODE);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	ret = regmap_set_bits(priv->regmap, AN8855_RG_RATE_ADAPT_CTRL_0,
> +			      AN8855_RG_RATE_ADAPT_RX_BYPASS |
> +			      AN8855_RG_RATE_ADAPT_TX_BYPASS |
> +			      AN8855_RG_RATE_ADAPT_RX_EN |
> +			      AN8855_RG_RATE_ADAPT_TX_EN);
> +	if (ret)
> +		return ret;
> +
> +	/* Disable AN if not in autoneg */
> +	ret = regmap_update_bits(priv->regmap, AN8855_SGMII_REG_AN0, BMCR_ANENABLE,
> +				 neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED ? BMCR_ANENABLE :
> +									      0);
> +	if (ret)
> +		return ret;
> +
> +	if (interface == PHY_INTERFACE_MODE_SGMII &&
> +	    neg_mode == PHYLINK_PCS_NEG_INBAND_DISABLED) {
> +		ret = regmap_set_bits(priv->regmap, AN8855_PHY_RX_FORCE_CTRL_0,
> +				      AN8855_RG_FORCE_TXC_SEL);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	/* Force Speed with fixed-link or 2500base-x as doesn't support aneg */
> +	if (interface == PHY_INTERFACE_MODE_2500BASEX ||
> +	    neg_mode != PHYLINK_PCS_NEG_INBAND_ENABLED) {
> +		if (interface == PHY_INTERFACE_MODE_2500BASEX)
> +			val = AN8855_RG_LINK_MODE_P0_SPEED_2500;
> +		else
> +			val = AN8855_RG_LINK_MODE_P0_SPEED_1000;
> +		ret = regmap_update_bits(priv->regmap, AN8855_SGMII_STS_CTRL_0,
> +					 AN8855_RG_LINK_MODE_P0 |
> +					 AN8855_RG_FORCE_SPD_MODE_P0,
> +					 val | AN8855_RG_FORCE_SPD_MODE_P0);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	/* bypass flow control to MAC */
> +	ret = regmap_write(priv->regmap, AN8855_MSG_RX_LIK_STS_0,
> +			   AN8855_RG_DPX_STS_P3 | AN8855_RG_DPX_STS_P2 |
> +			   AN8855_RG_DPX_STS_P1 | AN8855_RG_TXFC_STS_P0 |
> +			   AN8855_RG_RXFC_STS_P0 | AN8855_RG_DPX_STS_P0);
> +	if (ret)
> +		return ret;
> +	ret = regmap_write(priv->regmap, AN8855_MSG_RX_LIK_STS_2,
> +			   AN8855_RG_RXFC_AN_BYPASS_P3 |
> +			   AN8855_RG_RXFC_AN_BYPASS_P2 |
> +			   AN8855_RG_RXFC_AN_BYPASS_P1 |
> +			   AN8855_RG_TXFC_AN_BYPASS_P3 |
> +			   AN8855_RG_TXFC_AN_BYPASS_P2 |
> +			   AN8855_RG_TXFC_AN_BYPASS_P1 |
> +			   AN8855_RG_DPX_AN_BYPASS_P3 |
> +			   AN8855_RG_DPX_AN_BYPASS_P2 |
> +			   AN8855_RG_DPX_AN_BYPASS_P1 |
> +			   AN8855_RG_DPX_AN_BYPASS_P0);
> +	if (ret)
> +		return ret;

Is the above disruptive to the link if it is executed when the link is
already up? Do you need to re-execute it even when "interface" hasn't
changed?

This gets called to possibly change the PCS advertisement in response
to ethtool -s, so ought not have any link disrupting effects if
nothing has changed.

> +
> +	return 0;
> +}
> +
> +static void an8855_pcs_an_restart(struct phylink_pcs *pcs)
> +{
> +	struct an8855_priv *priv = container_of(pcs, struct an8855_priv, pcs);
> +
> +	regmap_set_bits(priv->regmap, AN8855_SGMII_REG_AN0, BMCR_ANRESTART);

This won't get called for SGMII - Cisco SGMII is merely the PHY telling
the PCS/MAC what was negotiated at the media and how it's going to be
operating the SGMII link. AN restart is a media side thing, so that
would happen at the PHY in that case. I suggest this becomes an empty
no-op function.

Thanks!
Vladimir Oltean Dec. 5, 2024, 4:27 p.m. UTC | #2
On Thu, Dec 05, 2024 at 03:51:33PM +0100, Christian Marangi wrote:
> +static int an8855_efuse_read(void *context, unsigned int offset,
> +			     void *val, size_t bytes)
> +{
> +	struct an8855_priv *priv = context;
> +
> +	return regmap_bulk_read(priv->regmap, AN8855_EFUSE_DATA0 + offset,
> +				val, bytes / sizeof(u32));
> +}
> +
> +static struct nvmem_config an8855_nvmem_config = {
> +	.name = "an8855-efuse",
> +	.size = AN8855_EFUSE_CELL * sizeof(u32),
> +	.stride = sizeof(u32),
> +	.word_size = sizeof(u32),
> +	.reg_read = an8855_efuse_read,
> +};
> +
> +static int an8855_sw_register_nvmem(struct an8855_priv *priv)
> +{
> +	struct nvmem_device *nvmem;
> +
> +	an8855_nvmem_config.priv = priv;
> +	an8855_nvmem_config.dev = priv->dev;
> +	nvmem = devm_nvmem_register(priv->dev, &an8855_nvmem_config);
> +	if (IS_ERR(nvmem))
> +		return PTR_ERR(nvmem);
> +
> +	return 0;
> +}

At some point we should enforce the rule that new drivers for switch
SoCs with complex peripherals should use MFD and move all non-networking
peripherals to drivers handled by their respective subsystems.

I don't have the expertise to review a nvmem driver, and the majority of
them are in drivers/nvmem, with a dedicated subsystem and maintainer.
In general I want to make sure it is clear that I don't encourage the
model where DSA owns the entire mdio_device.

What other peripherals are there on this SoC other than an MDIO bus and
an EFUSE? IRQCHIP, GPIOs, LED controller, sensors?

You can take a look at drivers/mfd/ocelot* and
Documentation/devicetree/bindings/mfd/mscc,ocelot.yaml for an example on
how to use mfd for the top-level MDIO device, and DSA as just the driver
for the Ethernet switch component (which will be represented as a
platform_device).
Vladimir Oltean Dec. 5, 2024, 5:06 p.m. UTC | #3
On Thu, Dec 05, 2024 at 03:51:33PM +0100, Christian Marangi wrote:
> +	.port_fdb_add = an8855_port_fdb_add,
> +	.port_fdb_del = an8855_port_fdb_del,
> +	.port_fdb_dump = an8855_port_fdb_dump,
> +	.port_mdb_add = an8855_port_mdb_add,
> +	.port_mdb_del = an8855_port_mdb_del,

Please handle the "struct dsa_db" argument of these functions, so that
you can turn on ds->fdb_isolation. It is likely that instead of a single
AN8855_FID_BRIDGED, there needs to be a unique FID allocated for each
VLAN-unaware bridge in order for their FDBs to be isolated from each
other, and so that the same MAC address could live under both bridges.
Christian Marangi Dec. 5, 2024, 5:17 p.m. UTC | #4
On Thu, Dec 05, 2024 at 06:27:59PM +0200, Vladimir Oltean wrote:
> On Thu, Dec 05, 2024 at 03:51:33PM +0100, Christian Marangi wrote:
> > +static int an8855_efuse_read(void *context, unsigned int offset,
> > +			     void *val, size_t bytes)
> > +{
> > +	struct an8855_priv *priv = context;
> > +
> > +	return regmap_bulk_read(priv->regmap, AN8855_EFUSE_DATA0 + offset,
> > +				val, bytes / sizeof(u32));
> > +}
> > +
> > +static struct nvmem_config an8855_nvmem_config = {
> > +	.name = "an8855-efuse",
> > +	.size = AN8855_EFUSE_CELL * sizeof(u32),
> > +	.stride = sizeof(u32),
> > +	.word_size = sizeof(u32),
> > +	.reg_read = an8855_efuse_read,
> > +};
> > +
> > +static int an8855_sw_register_nvmem(struct an8855_priv *priv)
> > +{
> > +	struct nvmem_device *nvmem;
> > +
> > +	an8855_nvmem_config.priv = priv;
> > +	an8855_nvmem_config.dev = priv->dev;
> > +	nvmem = devm_nvmem_register(priv->dev, &an8855_nvmem_config);
> > +	if (IS_ERR(nvmem))
> > +		return PTR_ERR(nvmem);
> > +
> > +	return 0;
> > +}
> 
> At some point we should enforce the rule that new drivers for switch
> SoCs with complex peripherals should use MFD and move all non-networking
> peripherals to drivers handled by their respective subsystems.
> 
> I don't have the expertise to review a nvmem driver, and the majority of
> them are in drivers/nvmem, with a dedicated subsystem and maintainer.
> In general I want to make sure it is clear that I don't encourage the
> model where DSA owns the entire mdio_device.
> 
> What other peripherals are there on this SoC other than an MDIO bus and
> an EFUSE? IRQCHIP, GPIOs, LED controller, sensors?
> 
> You can take a look at drivers/mfd/ocelot* and
> Documentation/devicetree/bindings/mfd/mscc,ocelot.yaml for an example on
> how to use mfd for the top-level MDIO device, and DSA as just the driver
> for the Ethernet switch component (which will be represented as a
> platform_device).

Hi Vladimir,

I checked the examples and one problems that comes to me is how to model
this if only MDIO is used as a comunication method. Ocelot have PCIE or
SPI but this switch only comunicate with MDIO on his address. So where
should I place the SoC or MFD node? In the switch root node?

Also the big problem is how to model accessing the register with MDIO
with an MFD implementation.

Anyway just to make sure the Switch SoC doesn't expose an actualy MDIO
bus, that is just to solve the problem with the Switch Address shared
with one of the port. (Switch Address can be accessed by every switch
port with a specific page set)

But yes the problem is there... Function is not implemented but the
switch have i2c interface, minimal CPU, GPIO and Timer in it.

Happy to make the required changes, just very confused on how the final
DT node structure.
Christian Marangi Dec. 5, 2024, 5:26 p.m. UTC | #5
On Thu, Dec 05, 2024 at 07:06:29PM +0200, Vladimir Oltean wrote:
> On Thu, Dec 05, 2024 at 03:51:33PM +0100, Christian Marangi wrote:
> > +	.port_fdb_add = an8855_port_fdb_add,
> > +	.port_fdb_del = an8855_port_fdb_del,
> > +	.port_fdb_dump = an8855_port_fdb_dump,
> > +	.port_mdb_add = an8855_port_mdb_add,
> > +	.port_mdb_del = an8855_port_mdb_del,
> 
> Please handle the "struct dsa_db" argument of these functions, so that
> you can turn on ds->fdb_isolation. It is likely that instead of a single
> AN8855_FID_BRIDGED, there needs to be a unique FID allocated for each
> VLAN-unaware bridge in order for their FDBs to be isolated from each
> other, and so that the same MAC address could live under both bridges.

Mh ok, I hoped we could first have the base DSA driver merged before
starting to applying these kind of feature.

Concept looks handy, ideally I can just assign one ID for each port
like:
port 1 -> FIB 1
port 2 -> FIB 1
port 3 -> FIB 2

Question:
Ports of the same bridge should have the same FIB?

What I need to check is how the switch handle this for learning. Does
the switch correctly create FDB entry with the right FIB? If that's not
the case then I think assisted_learning is needed and HW Learn can't be
used?

(I still need to check if I can assign a default FIB for a port...
Currently the STP register are 2 bit for each FIB id, so 16 different
FIB are possible)

Also do we have a script for selft tests? I remember there was one back
in the days for fdb isolation?
Christian Marangi Dec. 5, 2024, 5:44 p.m. UTC | #6
On Thu, Dec 05, 2024 at 04:12:03PM +0000, Russell King (Oracle) wrote:
> hi Christian,
> 
> On Thu, Dec 05, 2024 at 03:51:33PM +0100, Christian Marangi wrote:
> > diff --git a/drivers/net/dsa/Kconfig b/drivers/net/dsa/Kconfig
> > index 2d10b4d6cfbb..6b6d0b7bae72 100644
> > --- a/drivers/net/dsa/Kconfig
> > +++ b/drivers/net/dsa/Kconfig
> > @@ -24,6 +24,15 @@ config NET_DSA_LOOP
> >  	  This enables support for a fake mock-up switch chip which
> >  	  exercises the DSA APIs.
> >  
> > +
> 
> Niggle - no need for the extra blank line.
>

Will drop!

> > +static int an8855_set_mac_eee(struct dsa_switch *ds, int port,
> > +			      struct ethtool_keee *eee)
> > +{
> > +	struct an8855_priv *priv = ds->priv;
> > +	u32 reg;
> > +	int ret;
> > +
> > +	if (eee->eee_enabled) {
> > +		ret = regmap_read(priv->regmap, AN8855_PMCR_P(port), &reg);
> > +		if (ret)
> > +			return ret;
> > +		/* Force enable EEE if force mode and LINK */
> > +		if (reg & AN8855_PMCR_FORCE_MODE &&
> > +		    reg & AN8855_PMCR_FORCE_LNK) {
> > +			switch (reg & AN8855_PMCR_FORCE_SPEED) {
> > +			case AN8855_PMCR_FORCE_SPEED_1000:
> > +				reg |= AN8855_PMCR_FORCE_EEE1G;
> > +				break;
> > +			case AN8855_PMCR_FORCE_SPEED_100:
> > +				reg |= AN8855_PMCR_FORCE_EEE100;
> > +				break;
> > +			default:
> > +				break;
> > +			}
> 
> Does this forcing force EEE to be enabled for the port, irrespective
> of the negotiation?
> 

This apply only if autoneg is off. In that case EEE is forced. I think
you already had this comment at times for this exact question. If
autoneg is on, EEE is not forced if EEE is turned on.

> > +			ret = regmap_write(priv->regmap, AN8855_PMCR_P(port), reg);
> > +			if (ret)
> > +				return ret;
> > +		}
> > +		ret = regmap_update_bits(priv->regmap, AN8855_PMEEECR_P(port),
> > +					 AN8855_LPI_MODE_EN,
> > +					 eee->tx_lpi_enabled ? AN8855_LPI_MODE_EN : 0);
> > +		if (ret)
> > +			return ret;
> > +	} else {
> > +		ret = regmap_clear_bits(priv->regmap, AN8855_PMCR_P(port),
> > +					AN8855_PMCR_FORCE_EEE1G |
> > +					AN8855_PMCR_FORCE_EEE100);
> > +		if (ret)
> > +			return ret;
> > +
> > +		ret = regmap_clear_bits(priv->regmap, AN8855_PMEEECR_P(port),
> > +					AN8855_LPI_MODE_EN);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	return 0;
> > +}
> > +
> > +static int an8855_get_mac_eee(struct dsa_switch *ds, int port,
> > +			      struct ethtool_keee *eee)
> > +{
> > +	struct an8855_priv *priv = ds->priv;
> > +	u32 reg;
> > +	int ret;
> > +
> > +	ret = regmap_read(priv->regmap, AN8855_PMEEECR_P(port), &reg);
> > +	if (ret)
> > +		return ret;
> > +	eee->tx_lpi_enabled = reg & AN8855_LPI_MODE_EN;
> > +
> > +	ret = regmap_read(priv->regmap, AN8855_CKGCR, &reg);
> > +	if (ret)
> > +		return ret;
> > +	/* Global LPI TXIDLE Threshold, default 60ms (unit 2us) */
> > +	eee->tx_lpi_timer = FIELD_GET(AN8855_LPI_TXIDLE_THD_MASK, reg) / 500;
> 
> There is no point setting tx_lpi_enabled and tx_lpi_timer, as phylib
> will immediately overwrite then in its phy_ethtool_get_eee() method.
> In fact, there's no point to the DSA get_mac_eee() method.
> 
> > +
> > +	ret = regmap_read(priv->regmap, AN8855_PMSR_P(port), &reg);
> > +	if (ret)
> > +		return ret;
> 
> What's the purpose of this read?
> 
> > +
> > +	return 0;
> > +}
> 
> Overall, I would suggest not implementing EEE just yet - I sent out a
> 22 patch RFC patch set last week which implements EEE support in
> phylink, and I have another bunch of patches that clean up DSA that
> I didn't send, which includes getting rid of the DSA get_mac_eee()
> method.
> 
> Now that the bulk of the phylink in-band changes have been merged, my
> plan is to work through getting the RFC patch set merged - I've sent
> out the first four patches today for final review and merging.
> 
> I stated some questions in the RFC cover message that everyone ignored,
> maybe you could look at that and give your views on the points I raised
> there?
> 

Ok to drop these OPs and reimplement later once we have a better
implementation? (really wants the base driver merged so I don't have to
send big patch every time, hope you understand the problem)

> > +static struct phylink_pcs *
> > +an8855_phylink_mac_select_pcs(struct phylink_config *config,
> > +			      phy_interface_t interface)
> > +{
> > +	struct dsa_port *dp = dsa_phylink_to_port(config);
> > +	struct an8855_priv *priv = dp->ds->priv;
> > +
> > +	switch (interface) {
> > +	case PHY_INTERFACE_MODE_SGMII:
> > +	case PHY_INTERFACE_MODE_2500BASEX:
> > +		return &priv->pcs;
> > +	default:
> > +		return NULL;
> > +	}
> 
> Great, exactly what I like to see in mac_select_pcs()!
> 
> > +}
> > +
> > +static void
> > +an8855_phylink_mac_config(struct phylink_config *config, unsigned int mode,
> > +			  const struct phylink_link_state *state)
> > +{
> > +	struct dsa_port *dp = dsa_phylink_to_port(config);
> > +	struct dsa_switch *ds = dp->ds;
> > +	struct an8855_priv *priv;
> > +	int port = dp->index;
> > +
> > +	priv = ds->priv;
> > +
> > +	/* Nothing to configure for internal ports */
> > +	if (port != 5)
> > +		return;
> > +
> > +	regmap_update_bits(priv->regmap, AN8855_PMCR_P(port),
> > +			   AN8855_PMCR_IFG_XMIT | AN8855_PMCR_MAC_MODE |
> > +			   AN8855_PMCR_BACKOFF_EN | AN8855_PMCR_BACKPR_EN,
> > +			   FIELD_PREP(AN8855_PMCR_IFG_XMIT, 0x1) |
> > +			   AN8855_PMCR_MAC_MODE | AN8855_PMCR_BACKOFF_EN |
> > +			   AN8855_PMCR_BACKPR_EN);
> > +}
> 
> This looks much better, thanks.
> 
> > +
> > +static void an8855_phylink_get_caps(struct dsa_switch *ds, int port,
> > +				    struct phylink_config *config)
> > +{
> > +	switch (port) {
> > +	case 0:
> > +	case 1:
> > +	case 2:
> > +	case 3:
> > +	case 4:
> > +		__set_bit(PHY_INTERFACE_MODE_GMII,
> > +			  config->supported_interfaces);
> > +		__set_bit(PHY_INTERFACE_MODE_INTERNAL,
> > +			  config->supported_interfaces);
> > +		break;
> > +	case 5:
> > +		phy_interface_set_rgmii(config->supported_interfaces);
> > +		__set_bit(PHY_INTERFACE_MODE_SGMII,
> > +			  config->supported_interfaces);
> > +		__set_bit(PHY_INTERFACE_MODE_2500BASEX,
> > +			  config->supported_interfaces);
> > +		break;
> > +	}
> > +
> > +	config->mac_capabilities = MAC_ASYM_PAUSE | MAC_SYM_PAUSE |
> > +				   MAC_10 | MAC_100 | MAC_1000FD | MAC_2500FD;
> > +}
> > +
> > +static void
> > +an8855_phylink_mac_link_down(struct phylink_config *config, unsigned int mode,
> > +			     phy_interface_t interface)
> > +{
> > +	struct dsa_port *dp = dsa_phylink_to_port(config);
> > +	struct an8855_priv *priv = dp->ds->priv;
> > +
> > +	/* With autoneg just disable TX/RX else also force link down */
> > +	if (phylink_autoneg_inband(mode)) {
> > +		regmap_clear_bits(priv->regmap, AN8855_PMCR_P(dp->index),
> > +				  AN8855_PMCR_TX_EN | AN8855_PMCR_RX_EN);
> > +	} else {
> > +		regmap_update_bits(priv->regmap, AN8855_PMCR_P(dp->index),
> > +				   AN8855_PMCR_TX_EN | AN8855_PMCR_RX_EN |
> > +				   AN8855_PMCR_FORCE_MODE | AN8855_PMCR_FORCE_LNK,
> > +				   AN8855_PMCR_FORCE_MODE);
> > +	}
> > +}
> > +
> > +static void
> > +an8855_phylink_mac_link_up(struct phylink_config *config,
> > +			   struct phy_device *phydev, unsigned int mode,
> > +			   phy_interface_t interface, int speed, int duplex,
> > +			   bool tx_pause, bool rx_pause)
> > +{
> > +	struct dsa_port *dp = dsa_phylink_to_port(config);
> > +	struct an8855_priv *priv = dp->ds->priv;
> > +	int port = dp->index;
> > +	u32 reg;
> > +
> > +	reg = regmap_read(priv->regmap, AN8855_PMCR_P(port), &reg);
> > +	if (phylink_autoneg_inband(mode)) {
> > +		reg &= ~AN8855_PMCR_FORCE_MODE;
> > +	} else {
> > +		reg |= AN8855_PMCR_FORCE_MODE | AN8855_PMCR_FORCE_LNK;
> > +
> > +		reg &= ~AN8855_PMCR_FORCE_SPEED;
> > +		switch (speed) {
> > +		case SPEED_10:
> > +			reg |= AN8855_PMCR_FORCE_SPEED_10;
> > +			break;
> > +		case SPEED_100:
> > +			reg |= AN8855_PMCR_FORCE_SPEED_100;
> > +			break;
> > +		case SPEED_1000:
> > +			reg |= AN8855_PMCR_FORCE_SPEED_1000;
> > +			break;
> > +		case SPEED_2500:
> > +			reg |= AN8855_PMCR_FORCE_SPEED_2500;
> > +			break;
> > +		case SPEED_5000:
> > +			dev_err(priv->dev, "Missing support for 5G speed. Aborting...\n");
> > +			return;
> > +		}
> > +
> > +		reg &= ~AN8855_PMCR_FORCE_FDX;
> > +		if (duplex == DUPLEX_FULL)
> > +			reg |= AN8855_PMCR_FORCE_FDX;
> > +
> > +		reg &= ~AN8855_PMCR_RX_FC_EN;
> > +		if (rx_pause || dsa_port_is_cpu(dp))
> > +			reg |= AN8855_PMCR_RX_FC_EN;
> > +
> > +		reg &= ~AN8855_PMCR_TX_FC_EN;
> > +		if (rx_pause || dsa_port_is_cpu(dp))
> > +			reg |= AN8855_PMCR_TX_FC_EN;
> > +
> > +		/* Disable any EEE options */
> > +		reg &= ~(AN8855_PMCR_FORCE_EEE5G | AN8855_PMCR_FORCE_EEE2P5G |
> > +			 AN8855_PMCR_FORCE_EEE1G | AN8855_PMCR_FORCE_EEE100);
> > +	}
> > +
> > +	reg |= AN8855_PMCR_TX_EN | AN8855_PMCR_RX_EN;
> > +
> > +	regmap_write(priv->regmap, AN8855_PMCR_P(port), reg);
> > +}
> 
> All the above looks fine to me.
> 
> > +
> > +static void an8855_pcs_get_state(struct phylink_pcs *pcs,
> > +				 struct phylink_link_state *state)
> > +{
> > +	struct an8855_priv *priv = container_of(pcs, struct an8855_priv, pcs);
> > +	u32 val;
> > +	int ret;
> > +
> > +	ret = regmap_read(priv->regmap, AN8855_PMSR_P(AN8855_CPU_PORT), &val);
> > +	if (ret < 0) {
> > +		state->link = false;
> > +		return;
> > +	}
> > +
> > +	state->link = !!(val & AN8855_PMSR_LNK);
> > +	state->an_complete = state->link;
> > +	state->duplex = (val & AN8855_PMSR_DPX) ? DUPLEX_FULL :
> > +						  DUPLEX_HALF;
> > +
> > +	switch (val & AN8855_PMSR_SPEED) {
> > +	case AN8855_PMSR_SPEED_10:
> > +		state->speed = SPEED_10;
> > +		break;
> > +	case AN8855_PMSR_SPEED_100:
> > +		state->speed = SPEED_100;
> > +		break;
> > +	case AN8855_PMSR_SPEED_1000:
> > +		state->speed = SPEED_1000;
> > +		break;
> > +	case AN8855_PMSR_SPEED_2500:
> > +		state->speed = SPEED_2500;
> > +		break;
> > +	case AN8855_PMSR_SPEED_5000:
> > +		dev_err(priv->dev, "Missing support for 5G speed. Setting Unknown.\n");
> > +		fallthrough;
> > +	default:
> > +		state->speed = SPEED_UNKNOWN;
> > +		break;
> > +	}
> > +
> > +	if (val & AN8855_PMSR_RX_FC)
> > +		state->pause |= MLO_PAUSE_RX;
> > +	if (val & AN8855_PMSR_TX_FC)
> > +		state->pause |= MLO_PAUSE_TX;
> > +}
> > +
> > +static int an8855_pcs_config(struct phylink_pcs *pcs, unsigned int neg_mode,
> > +			     phy_interface_t interface,
> > +			     const unsigned long *advertising,
> > +			     bool permit_pause_to_mac)
> > +{
> > +	struct an8855_priv *priv = container_of(pcs, struct an8855_priv, pcs);
> > +	u32 val;
> > +	int ret;
> > +
> > +	switch (interface) {
> > +	case PHY_INTERFACE_MODE_SGMII:
> > +		break;
> > +	case PHY_INTERFACE_MODE_2500BASEX:
> > +		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
> > +			dev_err(priv->dev, "in-band negotiation unsupported");
> > +			return -EINVAL;
> > +		}
> > +		break;
> 
> Now that we have the phylink in-band changes merged, along with a new
> PCS .pcs_inband_caps() method, please implement this method to indicate
> that in-band is not supported in 2500BASE-X mode. Phylink will issue a
> warning if it attempts to go against that (particularly when e.g. the
> PHY demands in-band but the PCS doesn't support it - a case that likely
> will never work in practice.)
> 

Thanks for the hint of the new OPs, yes will reimplement this.

> > +	default:
> > +		return -EINVAL;
> > +	}
> > +
> > +	/*                   !!! WELCOME TO HELL !!!                   */
> > +
> > +	/* TX FIR - improve TX EYE */
> > +	ret = regmap_update_bits(priv->regmap, AN8855_INTF_CTRL_10,
> > +				 AN8855_RG_DA_QP_TX_FIR_C2_SEL |
> > +				 AN8855_RG_DA_QP_TX_FIR_C2_FORCE |
> > +				 AN8855_RG_DA_QP_TX_FIR_C1_SEL |
> > +				 AN8855_RG_DA_QP_TX_FIR_C1_FORCE,
> > +				 AN8855_RG_DA_QP_TX_FIR_C2_SEL |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_TX_FIR_C2_FORCE, 0x4) |
> > +				 AN8855_RG_DA_QP_TX_FIR_C1_SEL |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_TX_FIR_C1_FORCE, 0x0));
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0x0;
> > +	else
> > +		val = 0xd;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_INTF_CTRL_11,
> > +				 AN8855_RG_DA_QP_TX_FIR_C0B_SEL |
> > +				 AN8855_RG_DA_QP_TX_FIR_C0B_FORCE,
> > +				 AN8855_RG_DA_QP_TX_FIR_C0B_SEL |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_TX_FIR_C0B_FORCE, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* RX CDR - improve RX Jitter Tolerance */
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0x5;
> > +	else
> > +		val = 0x6;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_LPF_BOT_LIM,
> > +				 AN8855_RG_QP_CDR_LPF_KP_GAIN |
> > +				 AN8855_RG_QP_CDR_LPF_KI_GAIN,
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_LPF_KP_GAIN, val) |
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_LPF_KI_GAIN, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL */
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0x1;
> > +	else
> > +		val = 0x0;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_QP_DIG_MODE_CTRL_1,
> > +				 AN8855_RG_TPHY_SPEED,
> > +				 FIELD_PREP(AN8855_RG_TPHY_SPEED, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL - LPF */
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> > +				 AN8855_RG_DA_QP_PLL_RICO_SEL_INTF |
> > +				 AN8855_RG_DA_QP_PLL_FBKSEL_INTF |
> > +				 AN8855_RG_DA_QP_PLL_BR_INTF |
> > +				 AN8855_RG_DA_QP_PLL_BPD_INTF |
> > +				 AN8855_RG_DA_QP_PLL_BPA_INTF |
> > +				 AN8855_RG_DA_QP_PLL_BC_INTF,
> > +				 AN8855_RG_DA_QP_PLL_RICO_SEL_INTF |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_FBKSEL_INTF, 0x0) |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_BR_INTF, 0x3) |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_BPD_INTF, 0x0) |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_BPA_INTF, 0x5) |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_BC_INTF, 0x1));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL - ICO */
> > +	ret = regmap_set_bits(priv->regmap, AN8855_PLL_CTRL_4,
> > +			      AN8855_RG_DA_QP_PLL_ICOLP_EN_INTF);
> > +	if (ret)
> > +		return ret;
> > +	ret = regmap_clear_bits(priv->regmap, AN8855_PLL_CTRL_2,
> > +				AN8855_RG_DA_QP_PLL_ICOIQ_EN_INTF);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL - CHP */
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0x6;
> > +	else
> > +		val = 0x4;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> > +				 AN8855_RG_DA_QP_PLL_IR_INTF,
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_IR_INTF, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL - PFD */
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> > +				 AN8855_RG_DA_QP_PLL_PFD_OFFSET_EN_INTRF |
> > +				 AN8855_RG_DA_QP_PLL_PFD_OFFSET_INTF |
> > +				 AN8855_RG_DA_QP_PLL_KBAND_PREDIV_INTF,
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_PFD_OFFSET_INTF, 0x1) |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_KBAND_PREDIV_INTF, 0x1));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL - POSTDIV */
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> > +				 AN8855_RG_DA_QP_PLL_POSTDIV_EN_INTF |
> > +				 AN8855_RG_DA_QP_PLL_PHY_CK_EN_INTF |
> > +				 AN8855_RG_DA_QP_PLL_PCK_SEL_INTF,
> > +				 AN8855_RG_DA_QP_PLL_PCK_SEL_INTF);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL - SDM */
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_2,
> > +				 AN8855_RG_DA_QP_PLL_SDM_HREN_INTF,
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_SDM_HREN_INTF, 0x0));
> > +	if (ret)
> > +		return ret;
> > +	ret = regmap_clear_bits(priv->regmap, AN8855_PLL_CTRL_2,
> > +				AN8855_RG_DA_QP_PLL_SDM_IFM_INTF);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_SS_LCPLL_PWCTL_SETTING_2,
> > +				 AN8855_RG_NCPO_ANA_MSB,
> > +				 FIELD_PREP(AN8855_RG_NCPO_ANA_MSB, 0x1));
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0x7a000000;
> > +	else
> > +		val = 0x48000000;
> > +	ret = regmap_write(priv->regmap, AN8855_SS_LCPLL_TDC_FLT_2,
> > +			   FIELD_PREP(AN8855_RG_LCPLL_NCPO_VALUE, val));
> > +	if (ret)
> > +		return ret;
> > +	ret = regmap_write(priv->regmap, AN8855_SS_LCPLL_TDC_PCW_1,
> > +			   FIELD_PREP(AN8855_RG_LCPLL_PON_HRDDS_PCW_NCPO_GPON, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_clear_bits(priv->regmap, AN8855_SS_LCPLL_TDC_FLT_5,
> > +				AN8855_RG_LCPLL_NCPO_CHG);
> > +	if (ret)
> > +		return ret;
> > +	ret = regmap_clear_bits(priv->regmap, AN8855_PLL_CK_CTRL_0,
> > +				AN8855_RG_DA_QP_PLL_SDM_DI_EN_INTF);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL - SS */
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_3,
> > +				 AN8855_RG_DA_QP_PLL_SSC_DELTA_INTF,
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_SSC_DELTA_INTF, 0x0));
> > +	if (ret)
> > +		return ret;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_4,
> > +				 AN8855_RG_DA_QP_PLL_SSC_DIR_DLY_INTF,
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_SSC_DIR_DLY_INTF, 0x0));
> > +	if (ret)
> > +		return ret;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PLL_CTRL_3,
> > +				 AN8855_RG_DA_QP_PLL_SSC_PERIOD_INTF,
> > +				 FIELD_PREP(AN8855_RG_DA_QP_PLL_SSC_PERIOD_INTF, 0x0));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PLL - TDC */
> > +	ret = regmap_clear_bits(priv->regmap, AN8855_PLL_CK_CTRL_0,
> > +				AN8855_RG_DA_QP_PLL_TDC_TXCK_SEL_INTF);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_set_bits(priv->regmap, AN8855_RG_QP_PLL_SDM_ORD,
> > +			      AN8855_RG_QP_PLL_SSC_TRI_EN);
> > +	if (ret)
> > +		return ret;
> > +	ret = regmap_set_bits(priv->regmap, AN8855_RG_QP_PLL_SDM_ORD,
> > +			      AN8855_RG_QP_PLL_SSC_PHASE_INI);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_RX_DAC_EN,
> > +				 AN8855_RG_QP_SIGDET_HF,
> > +				 FIELD_PREP(AN8855_RG_QP_SIGDET_HF, 0x2));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* TCL Disable (only for Co-SIM) */
> > +	ret = regmap_clear_bits(priv->regmap, AN8855_PON_RXFEDIG_CTRL_0,
> > +				AN8855_RG_QP_EQ_RX500M_CK_SEL);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* TX Init */
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0x4;
> > +	else
> > +		val = 0x0;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_TX_MODE,
> > +				 AN8855_RG_QP_TX_RESERVE |
> > +				 AN8855_RG_QP_TX_MODE_16B_EN,
> > +				 FIELD_PREP(AN8855_RG_QP_TX_RESERVE, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* RX Control/Init */
> > +	ret = regmap_set_bits(priv->regmap, AN8855_RG_QP_RXAFE_RESERVE,
> > +			      AN8855_RG_QP_CDR_PD_10B_EN);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0x1;
> > +	else
> > +		val = 0x2;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_LPF_MJV_LIM,
> > +				 AN8855_RG_QP_CDR_LPF_RATIO,
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_LPF_RATIO, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_LPF_SETVALUE,
> > +				 AN8855_RG_QP_CDR_PR_BUF_IN_SR |
> > +				 AN8855_RG_QP_CDR_PR_BETA_SEL,
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_BUF_IN_SR, 0x6) |
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_BETA_SEL, 0x1));
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0xf;
> > +	else
> > +		val = 0xc;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_PR_CKREF_DIV1,
> > +				 AN8855_RG_QP_CDR_PR_DAC_BAND,
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_DAC_BAND, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_PR_KBAND_DIV_PCIE,
> > +				 AN8855_RG_QP_CDR_PR_KBAND_PCIE_MODE |
> > +				 AN8855_RG_QP_CDR_PR_KBAND_DIV_PCIE_MASK,
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_KBAND_DIV_PCIE_MASK, 0x19));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_FORCE_IBANDLPF_R_OFF,
> > +				 AN8855_RG_QP_CDR_PHYCK_SEL |
> > +				 AN8855_RG_QP_CDR_PHYCK_RSTB |
> > +				 AN8855_RG_QP_CDR_PHYCK_DIV,
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_PHYCK_SEL, 0x2) |
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_PHYCK_DIV, 0x21));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_clear_bits(priv->regmap, AN8855_RG_QP_CDR_PR_KBAND_DIV_PCIE,
> > +				AN8855_RG_QP_CDR_PR_XFICK_EN);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RG_QP_CDR_PR_CKREF_DIV1,
> > +				 AN8855_RG_QP_CDR_PR_KBAND_DIV,
> > +				 FIELD_PREP(AN8855_RG_QP_CDR_PR_KBAND_DIV, 0x4));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_26,
> > +				 AN8855_RG_QP_EQ_RETRAIN_ONLY_EN |
> > +				 AN8855_RG_LINK_NE_EN |
> > +				 AN8855_RG_LINK_ERRO_EN,
> > +				 AN8855_RG_QP_EQ_RETRAIN_ONLY_EN |
> > +				 AN8855_RG_LINK_ERRO_EN);
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RX_DLY_0,
> > +				 AN8855_RG_QP_RX_SAOSC_EN_H_DLY |
> > +				 AN8855_RG_QP_RX_PI_CAL_EN_H_DLY,
> > +				 FIELD_PREP(AN8855_RG_QP_RX_SAOSC_EN_H_DLY, 0x3f) |
> > +				 FIELD_PREP(AN8855_RG_QP_RX_PI_CAL_EN_H_DLY, 0x6f));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_42,
> > +				 AN8855_RG_QP_EQ_EN_DLY,
> > +				 FIELD_PREP(AN8855_RG_QP_EQ_EN_DLY, 0x150));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_2,
> > +				 AN8855_RG_QP_RX_EQ_EN_H_DLY,
> > +				 FIELD_PREP(AN8855_RG_QP_RX_EQ_EN_H_DLY, 0x150));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_PON_RXFEDIG_CTRL_9,
> > +				 AN8855_RG_QP_EQ_LEQOSC_DLYCNT,
> > +				 FIELD_PREP(AN8855_RG_QP_EQ_LEQOSC_DLYCNT, 0x1));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_8,
> > +				 AN8855_RG_DA_QP_SAOSC_DONE_TIME |
> > +				 AN8855_RG_DA_QP_LEQOS_EN_TIME,
> > +				 FIELD_PREP(AN8855_RG_DA_QP_SAOSC_DONE_TIME, 0x200) |
> > +				 FIELD_PREP(AN8855_RG_DA_QP_LEQOS_EN_TIME, 0xfff));
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Frequency meter */
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +		val = 0x10;
> > +	else
> > +		val = 0x28;
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_5,
> > +				 AN8855_RG_FREDET_CHK_CYCLE,
> > +				 FIELD_PREP(AN8855_RG_FREDET_CHK_CYCLE, val));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_6,
> > +				 AN8855_RG_FREDET_GOLDEN_CYCLE,
> > +				 FIELD_PREP(AN8855_RG_FREDET_GOLDEN_CYCLE, 0x64));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_update_bits(priv->regmap, AN8855_RX_CTRL_7,
> > +				 AN8855_RG_FREDET_TOLERATE_CYCLE,
> > +				 FIELD_PREP(AN8855_RG_FREDET_TOLERATE_CYCLE, 0x2710));
> > +	if (ret)
> > +		return ret;
> > +
> > +	ret = regmap_set_bits(priv->regmap, AN8855_PLL_CTRL_0,
> > +			      AN8855_RG_PHYA_AUTO_INIT);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* PCS Init */
> > +	if (interface == PHY_INTERFACE_MODE_SGMII &&
> > +	    neg_mode == PHYLINK_PCS_NEG_INBAND_DISABLED) {
> > +		ret = regmap_clear_bits(priv->regmap, AN8855_QP_DIG_MODE_CTRL_0,
> > +					AN8855_RG_SGMII_MODE | AN8855_RG_SGMII_AN_EN);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	ret = regmap_clear_bits(priv->regmap, AN8855_RG_HSGMII_PCS_CTROL_1,
> > +				AN8855_RG_TBI_10B_MODE);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
> > +		/* Set AN Ability - Interrupt */
> > +		ret = regmap_set_bits(priv->regmap, AN8855_SGMII_REG_AN_FORCE_CL37,
> > +				      AN8855_RG_FORCE_AN_DONE);
> > +		if (ret)
> > +			return ret;
> > +
> > +		ret = regmap_update_bits(priv->regmap, AN8855_SGMII_REG_AN_13,
> > +					 AN8855_SGMII_REMOTE_FAULT_DIS |
> > +					 AN8855_SGMII_IF_MODE,
> > +					 AN8855_SGMII_REMOTE_FAULT_DIS |
> > +					 FIELD_PREP(AN8855_SGMII_IF_MODE, 0xb));
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	/* Rate Adaption - GMII path config. */
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX) {
> > +		ret = regmap_clear_bits(priv->regmap, AN8855_RATE_ADP_P0_CTRL_0,
> > +					AN8855_RG_P0_DIS_MII_MODE);
> > +		if (ret)
> > +			return ret;
> > +	} else {
> > +		if (neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED) {
> > +			ret = regmap_set_bits(priv->regmap, AN8855_MII_RA_AN_ENABLE,
> > +					      AN8855_RG_P0_RA_AN_EN);
> > +			if (ret)
> > +				return ret;
> > +		} else {
> > +			ret = regmap_update_bits(priv->regmap, AN8855_RG_AN_SGMII_MODE_FORCE,
> > +						 AN8855_RG_FORCE_CUR_SGMII_MODE |
> > +						 AN8855_RG_FORCE_CUR_SGMII_SEL,
> > +						 AN8855_RG_FORCE_CUR_SGMII_SEL);
> > +			if (ret)
> > +				return ret;
> > +
> > +			ret = regmap_clear_bits(priv->regmap, AN8855_RATE_ADP_P0_CTRL_0,
> > +						AN8855_RG_P0_MII_RA_RX_EN |
> > +						AN8855_RG_P0_MII_RA_TX_EN |
> > +						AN8855_RG_P0_MII_RA_RX_MODE |
> > +						AN8855_RG_P0_MII_RA_TX_MODE);
> > +			if (ret)
> > +				return ret;
> > +		}
> > +
> > +		ret = regmap_set_bits(priv->regmap, AN8855_RATE_ADP_P0_CTRL_0,
> > +				      AN8855_RG_P0_MII_MODE);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	ret = regmap_set_bits(priv->regmap, AN8855_RG_RATE_ADAPT_CTRL_0,
> > +			      AN8855_RG_RATE_ADAPT_RX_BYPASS |
> > +			      AN8855_RG_RATE_ADAPT_TX_BYPASS |
> > +			      AN8855_RG_RATE_ADAPT_RX_EN |
> > +			      AN8855_RG_RATE_ADAPT_TX_EN);
> > +	if (ret)
> > +		return ret;
> > +
> > +	/* Disable AN if not in autoneg */
> > +	ret = regmap_update_bits(priv->regmap, AN8855_SGMII_REG_AN0, BMCR_ANENABLE,
> > +				 neg_mode == PHYLINK_PCS_NEG_INBAND_ENABLED ? BMCR_ANENABLE :
> > +									      0);
> > +	if (ret)
> > +		return ret;
> > +
> > +	if (interface == PHY_INTERFACE_MODE_SGMII &&
> > +	    neg_mode == PHYLINK_PCS_NEG_INBAND_DISABLED) {
> > +		ret = regmap_set_bits(priv->regmap, AN8855_PHY_RX_FORCE_CTRL_0,
> > +				      AN8855_RG_FORCE_TXC_SEL);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	/* Force Speed with fixed-link or 2500base-x as doesn't support aneg */
> > +	if (interface == PHY_INTERFACE_MODE_2500BASEX ||
> > +	    neg_mode != PHYLINK_PCS_NEG_INBAND_ENABLED) {
> > +		if (interface == PHY_INTERFACE_MODE_2500BASEX)
> > +			val = AN8855_RG_LINK_MODE_P0_SPEED_2500;
> > +		else
> > +			val = AN8855_RG_LINK_MODE_P0_SPEED_1000;
> > +		ret = regmap_update_bits(priv->regmap, AN8855_SGMII_STS_CTRL_0,
> > +					 AN8855_RG_LINK_MODE_P0 |
> > +					 AN8855_RG_FORCE_SPD_MODE_P0,
> > +					 val | AN8855_RG_FORCE_SPD_MODE_P0);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	/* bypass flow control to MAC */
> > +	ret = regmap_write(priv->regmap, AN8855_MSG_RX_LIK_STS_0,
> > +			   AN8855_RG_DPX_STS_P3 | AN8855_RG_DPX_STS_P2 |
> > +			   AN8855_RG_DPX_STS_P1 | AN8855_RG_TXFC_STS_P0 |
> > +			   AN8855_RG_RXFC_STS_P0 | AN8855_RG_DPX_STS_P0);
> > +	if (ret)
> > +		return ret;
> > +	ret = regmap_write(priv->regmap, AN8855_MSG_RX_LIK_STS_2,
> > +			   AN8855_RG_RXFC_AN_BYPASS_P3 |
> > +			   AN8855_RG_RXFC_AN_BYPASS_P2 |
> > +			   AN8855_RG_RXFC_AN_BYPASS_P1 |
> > +			   AN8855_RG_TXFC_AN_BYPASS_P3 |
> > +			   AN8855_RG_TXFC_AN_BYPASS_P2 |
> > +			   AN8855_RG_TXFC_AN_BYPASS_P1 |
> > +			   AN8855_RG_DPX_AN_BYPASS_P3 |
> > +			   AN8855_RG_DPX_AN_BYPASS_P2 |
> > +			   AN8855_RG_DPX_AN_BYPASS_P1 |
> > +			   AN8855_RG_DPX_AN_BYPASS_P0);
> > +	if (ret)
> > +		return ret;
> 
> Is the above disruptive to the link if it is executed when the link is
> already up? Do you need to re-execute it even when "interface" hasn't
> changed?
> 
> This gets called to possibly change the PCS advertisement in response
> to ethtool -s, so ought not have any link disrupting effects if
> nothing has changed.
> 

No these change only after a switch reset, so they don't need to be
re-executed and doesn't interrupt the link.

> > +
> > +	return 0;
> > +}
> > +
> > +static void an8855_pcs_an_restart(struct phylink_pcs *pcs)
> > +{
> > +	struct an8855_priv *priv = container_of(pcs, struct an8855_priv, pcs);
> > +
> > +	regmap_set_bits(priv->regmap, AN8855_SGMII_REG_AN0, BMCR_ANRESTART);
> 
> This won't get called for SGMII - Cisco SGMII is merely the PHY telling
> the PCS/MAC what was negotiated at the media and how it's going to be
> operating the SGMII link. AN restart is a media side thing, so that
> would happen at the PHY in that case. I suggest this becomes an empty
> no-op function.
> 

Soo this function won't ever be called in the current implementation?

I would like to keep the same PCS config of the original driver (just to
be sure). This is called for SGMII configuration in autoneg right after
BMCR_ANENABLE, ok for you to move this call in pcs_config or do you
think I should drop it entirely?
Vladimir Oltean Dec. 5, 2024, 6:05 p.m. UTC | #7
On Thu, Dec 05, 2024 at 06:17:18PM +0100, Christian Marangi wrote:
> I checked the examples and one problems that comes to me is how to model
> this if only MDIO is used as a comunication method. Ocelot have PCIE or
> SPI but this switch only comunicate with MDIO on his address.

I don't see why this matters. There will be a top-level device driver,
which in this case will be an mdio_driver and will use mdiobus_{read,write}
to physically access registers. This driver will create regmaps and add
them to devres using devm_regmap_init(). From devres, DSA and other child
drivers can use dev_get_regmap(dev->parent) and perform their I/O through
regmap.

This driver is already written for regmap, so part of the work can
already be reused.

> So where should I place the SoC or MFD node? In the switch root node?

The SoC should be placed on the host MDIO bus. And the Ethernet switch
component should be a child of the SoC. Ideally, so should be all other
switch peripherals: on the same level as the Ethernet switch.

> Also the big problem is how to model accessing the register with MDIO
> with an MFD implementation.
> 
> Anyway just to make sure the Switch SoC doesn't expose an actualy MDIO
> bus, that is just to solve the problem with the Switch Address shared
> with one of the port. (Switch Address can be accessed by every switch
> port with a specific page set)

Sorry, I don't understand this, can you explain more? "Switch Address
can be accessed by every switch port with a specific page set"

In the code, I see that the priv->bus and priv->phy_base are used to
perform MDIO accesses for anything related to the switch. That's perfect,
it means that all switch registers are concentrated on a single MDIO
address, behind a single mdio_device. If that weren't the case, things
would get messy, because the Linux device model associates an MDIO device
with a single address on its bus.

And then we have an8855_phy_read() and an8855_phy_write(), which in my
understanding are the ops of a fake MDIO controller, one which has no
registers or MDIO address space of its own, but is just a passthrough
towards the host MDIO bus's address space. I have no idea why you don't
just put a phy-handle from the switch user ports to PHYs located on the
host MDIO bus directly, and why you go through this middle entity, but I
expect you will clarify. Creating an MDIO bus from DSA for internal PHYs
is completely optional if no special handling is required.

To explain again: In the MFD proposal, there is only one driver who has
access to the mdio_device from the host bus: the MFD driver. Depending
on how it implements the regmaps it presents to the children, it can
control page switching, etc etc. The child devices only operate with
regmaps, and have no idea of the underlying hardware access method.

> But yes the problem is there... Function is not implemented but the
> switch have i2c interface, minimal CPU, GPIO and Timer in it.
> 
> Happy to make the required changes, just very confused on how the final
> DT node structure.
> 
> -- 
> 	Ansuel
Christian Marangi Dec. 5, 2024, 6:29 p.m. UTC | #8
On Thu, Dec 05, 2024 at 08:05:39PM +0200, Vladimir Oltean wrote:
> On Thu, Dec 05, 2024 at 06:17:18PM +0100, Christian Marangi wrote:
> > I checked the examples and one problems that comes to me is how to model
> > this if only MDIO is used as a comunication method. Ocelot have PCIE or
> > SPI but this switch only comunicate with MDIO on his address.
> 
> I don't see why this matters. There will be a top-level device driver,
> which in this case will be an mdio_driver and will use mdiobus_{read,write}
> to physically access registers. This driver will create regmaps and add
> them to devres using devm_regmap_init(). From devres, DSA and other child
> drivers can use dev_get_regmap(dev->parent) and perform their I/O through
> regmap.
> 
> This driver is already written for regmap, so part of the work can
> already be reused.
> 
> > So where should I place the SoC or MFD node? In the switch root node?
> 
> The SoC should be placed on the host MDIO bus. And the Ethernet switch
> component should be a child of the SoC. Ideally, so should be all other
> switch peripherals: on the same level as the Ethernet switch.
>

Ohhhh ok, wasn't clear to me the MFD driver had to be placed in the mdio
node.

To make it clear this would be an implementation.

mdio_bus: mdio-bus {
	#address-cells = <1>;
	#size-cells = <0>;

	...

	mfd@1 {
		compatible = "airoha,an8855-mfd";
		reg = <1>;

		nvmem_node {
			...
		};

		switch_node {
			...
		};
	};
};

Consider tho that recently I faced some problem with such structure with
DT mainatiners asking to keep everything in the MFD node. But lets see
how it goes.

Well aware of the MFD API, (had to have some fun recently) so the only
confusing part was the node placement.

> > Also the big problem is how to model accessing the register with MDIO
> > with an MFD implementation.
> > 
> > Anyway just to make sure the Switch SoC doesn't expose an actualy MDIO
> > bus, that is just to solve the problem with the Switch Address shared
> > with one of the port. (Switch Address can be accessed by every switch
> > port with a specific page set)
> 
> Sorry, I don't understand this, can you explain more? "Switch Address
> can be accessed by every switch port with a specific page set"
> 
> In the code, I see that the priv->bus and priv->phy_base are used to
> perform MDIO accesses for anything related to the switch. That's perfect,
> it means that all switch registers are concentrated on a single MDIO
> address, behind a single mdio_device. If that weren't the case, things
> would get messy, because the Linux device model associates an MDIO device
> with a single address on its bus.
> 
> And then we have an8855_phy_read() and an8855_phy_write(), which in my
> understanding are the ops of a fake MDIO controller, one which has no
> registers or MDIO address space of its own, but is just a passthrough
> towards the host MDIO bus's address space. I have no idea why you don't
> just put a phy-handle from the switch user ports to PHYs located on the
> host MDIO bus directly, and why you go through this middle entity, but I
> expect you will clarify. Creating an MDIO bus from DSA for internal PHYs
> is completely optional if no special handling is required.

The difficulties I found (and maybe is very easy to solve and I'm
missing something here) is that switch and internal PHY port have the
same address and conflicts.

Switch will be at address 1 (or 2 3 4 5... every port can access switch
register with page 0x4)

DSA port 0 will be at address 1, that is already occupied by the switch.

Defining the DSA port node on the host MDIO bus works correctly for
every port but for port 0 (the one at address 1), the kernel complains
and is not init. (as it does conflict with the switch that is at the
same address) (can't remember the exact warning)

> 
> To explain again: In the MFD proposal, there is only one driver who has
> access to the mdio_device from the host bus: the MFD driver. Depending
> on how it implements the regmaps it presents to the children, it can
> control page switching, etc etc. The child devices only operate with
> regmaps, and have no idea of the underlying hardware access method.
> 
> > But yes the problem is there... Function is not implemented but the
> > switch have i2c interface, minimal CPU, GPIO and Timer in it.
> > 
> > Happy to make the required changes, just very confused on how the final
> > DT node structure.
> > 
> > -- 
> > 	Ansuel
>
Vladimir Oltean Dec. 5, 2024, 6:34 p.m. UTC | #9
On Thu, Dec 05, 2024 at 06:26:01PM +0100, Christian Marangi wrote:
> Concept looks handy, ideally I can just assign one ID for each port
> like:
> port 1 -> FIB 1
> port 2 -> FIB 1
> port 3 -> FIB 2
> 
> Question:
> Ports of the same bridge should have the same FIB?

The answer, as well as many other explanations, is under the "Address
databases" section of Documentation/networking/dsa/dsa.rst. Please read
it through before starting to implement anything.

> What I need to check is how the switch handle this for learning. Does
> the switch correctly create FDB entry with the right FIB?

You're asking me how an8855 behaves? I have no idea, I never interacted
with it :-|

The idea as far as the DSA API is concerned would be to learn addresses
in the bridge database (DSA_DB_BRIDGE) that the user port is configured
for, and not learn any addresses in the port-private database (DSA_DB_PORT).

> If that's not the case then I think assisted_learning is needed and HW
> Learn can't be used?

ds->assisted_learning_on_cpu_port applies, as suggested by its name,
only on the CPU port. On user ports, address learning should work normally.

As you will find in the documentation, the CPU port is not like a user
port, in the sense that it is not configured to service a single address
database, but all of them. So, source learning on the CPU port will not
work unless the switch knows which address database should each packet
be associated with.

In principle, one way could be to pass, during tagger xmit, the database ID,
so that the switch knows that it must learn the MAC SA of this packet in
this FID. I don't have the full image of the Mediatek DSA tag format,
but if an8855 is anything like mt7530, this option isn't available.
Thus, like mt7530, it needs to set ds->assisted_learning_on_cpu_port, so
that software will call port_fdb_add() on the CPU port with the correct
dsa_db (for the right bridge) as argument. But I don't think that is
going to pose any sort of issue.

> (I still need to check if I can assign a default FIB for a port...
> Currently the STP register are 2 bit for each FIB id, so 16 different
> FIB are possible)
> 
> Also do we have a script for selft tests? I remember there was one back
> in the days for fdb isolation?

I don't remember right now, I don't think we do. I'll try to come up
with something in the following days.
Vladimir Oltean Dec. 5, 2024, 6:50 p.m. UTC | #10
On Thu, Dec 05, 2024 at 07:29:53PM +0100, Christian Marangi wrote:
> Ohhhh ok, wasn't clear to me the MFD driver had to be placed in the mdio
> node.
> 
> To make it clear this would be an implementation.
> 
> mdio_bus: mdio-bus {
> 	#address-cells = <1>;
> 	#size-cells = <0>;
> 
> 	...
> 
> 	mfd@1 {
> 		compatible = "airoha,an8855-mfd";
> 		reg = <1>;
> 
> 		nvmem_node {
> 			...
> 		};
> 
> 		switch_node {
> 			...
> 		};
> 	};
> };

I mean, I did mention Documentation/devicetree/bindings/mfd/mscc,ocelot.yaml
in my initial reply, which has an example with exactly this layout...

> The difficulties I found (and maybe is very easy to solve and I'm
> missing something here) is that switch and internal PHY port have the
> same address and conflicts.
> 
> Switch will be at address 1 (or 2 3 4 5... every port can access switch
> register with page 0x4)
> 
> DSA port 0 will be at address 1, that is already occupied by the switch.
> 
> Defining the DSA port node on the host MDIO bus works correctly for
> every port but for port 0 (the one at address 1), the kernel complains
> and is not init. (as it does conflict with the switch that is at the
> same address) (can't remember the exact warning)

Can any of these MDIO addresses (switch or ports) be changed through registers?

I guess the non-hack solution would be to permit MDIO buses to have
#size-cells = 1, and MDIO devices to acquire a range of the address
space, rather than just one address. Though take this with a grain of
salt, I have a lot more to learn.

If neither of those are options, in principle the hack with just
selecting, randomly, one of the N internal PHY addresses as the central
MDIO address should work equally fine regardless of whether we are
talking about the DSA switch's MDIO address here, or the MFD device's
MDIO address.

With MFD you still have the option of creating a fake MDIO controller
child device, which has mdio-parent-bus = <&host_bus>, and redirecting
all user port phy-handles to children of this bus. Since all regmap I/O
of this fake MDIO bus goes to the MFD driver, you can implement there
your hacks with page switching etc etc, and it should be equally safe.
Christian Marangi Dec. 5, 2024, 7:16 p.m. UTC | #11
On Thu, Dec 05, 2024 at 08:34:03PM +0200, Vladimir Oltean wrote:
> On Thu, Dec 05, 2024 at 06:26:01PM +0100, Christian Marangi wrote:
> > Concept looks handy, ideally I can just assign one ID for each port
> > like:
> > port 1 -> FIB 1
> > port 2 -> FIB 1
> > port 3 -> FIB 2
> > 
> > Question:
> > Ports of the same bridge should have the same FIB?
> 
> The answer, as well as many other explanations, is under the "Address
> databases" section of Documentation/networking/dsa/dsa.rst. Please read
> it through before starting to implement anything.
>

Ok sorry, will read.

> > What I need to check is how the switch handle this for learning. Does
> > the switch correctly create FDB entry with the right FIB?
> 
> You're asking me how an8855 behaves? I have no idea, I never interacted
> with it :-|
> 

Noo it wasn't a question for you, it was really to describe a problem
that might be present if the switch doesn't account for that and that I
need to check.

> The idea as far as the DSA API is concerned would be to learn addresses
> in the bridge database (DSA_DB_BRIDGE) that the user port is configured
> for, and not learn any addresses in the port-private database (DSA_DB_PORT).
> 
> > If that's not the case then I think assisted_learning is needed and HW
> > Learn can't be used?
> 
> ds->assisted_learning_on_cpu_port applies, as suggested by its name,
> only on the CPU port. On user ports, address learning should work normally.
> 
> As you will find in the documentation, the CPU port is not like a user
> port, in the sense that it is not configured to service a single address
> database, but all of them. So, source learning on the CPU port will not
> work unless the switch knows which address database should each packet
> be associated with.

Ok so in such case, learning on CPU needs to be disabled and assisted
learning enabled.

> 
> In principle, one way could be to pass, during tagger xmit, the database ID,
> so that the switch knows that it must learn the MAC SA of this packet in
> this FID. I don't have the full image of the Mediatek DSA tag format,
> but if an8855 is anything like mt7530, this option isn't available.
> Thus, like mt7530, it needs to set ds->assisted_learning_on_cpu_port, so
> that software will call port_fdb_add() on the CPU port with the correct
> dsa_db (for the right bridge) as argument. But I don't think that is
> going to pose any sort of issue.
> 

In theory I might have found just this option. Tagger documentation is
totally missing but there are some c and header API that define some
interesting option of the tagger.

It seems the tagger can work in 3 way:
- portmap (what we currently use)
- portid 
- lookup result

Now the last 2 mode seems very interesting.
The naming is very confusing but maybe with portid they refer to the
FIB. I need to check this. If that's the case then it's exactly what we
need. They set an int so it's definetly an ID.

I assume lookup result is to only use FDB to decide where the packet
should go. In this mode no ID or port are defined.

So in short lots of tests to do, maybe this can be handled in the
tagger.

> > (I still need to check if I can assign a default FIB for a port...
> > Currently the STP register are 2 bit for each FIB id, so 16 different
> > FIB are possible)
> > 
> > Also do we have a script for selft tests? I remember there was one back
> > in the days for fdb isolation?
> 
> I don't remember right now, I don't think we do. I'll try to come up
> with something in the following days.

Yes that would be handy.
Christian Marangi Dec. 5, 2024, 7:36 p.m. UTC | #12
On Thu, Dec 05, 2024 at 08:50:37PM +0200, Vladimir Oltean wrote:
> On Thu, Dec 05, 2024 at 07:29:53PM +0100, Christian Marangi wrote:
> > Ohhhh ok, wasn't clear to me the MFD driver had to be placed in the mdio
> > node.
> > 
> > To make it clear this would be an implementation.
> > 
> > mdio_bus: mdio-bus {
> > 	#address-cells = <1>;
> > 	#size-cells = <0>;
> > 
> > 	...
> > 
> > 	mfd@1 {
> > 		compatible = "airoha,an8855-mfd";
> > 		reg = <1>;
> > 
> > 		nvmem_node {
> > 			...
> > 		};
> > 
> > 		switch_node {
> > 			...
> > 		};
> > 	};
> > };
> 
> I mean, I did mention Documentation/devicetree/bindings/mfd/mscc,ocelot.yaml
> in my initial reply, which has an example with exactly this layout...
> 
> > The difficulties I found (and maybe is very easy to solve and I'm
> > missing something here) is that switch and internal PHY port have the
> > same address and conflicts.
> > 
> > Switch will be at address 1 (or 2 3 4 5... every port can access switch
> > register with page 0x4)
> > 
> > DSA port 0 will be at address 1, that is already occupied by the switch.
> > 
> > Defining the DSA port node on the host MDIO bus works correctly for
> > every port but for port 0 (the one at address 1), the kernel complains
> > and is not init. (as it does conflict with the switch that is at the
> > same address) (can't remember the exact warning)
> 
> Can any of these MDIO addresses (switch or ports) be changed through registers?

No, it can only be changed the BASE address that change the address of
each port.

port 0 is base address
port 1 is base address + 1
port 2 is base address + 2...

> 
> I guess the non-hack solution would be to permit MDIO buses to have
> #size-cells = 1, and MDIO devices to acquire a range of the address
> space, rather than just one address. Though take this with a grain of
> salt, I have a lot more to learn.

I remember this was an idea when PHY Package API were proposed and was
rejected as we wanted PHY to be single reg.

> 
> If neither of those are options, in principle the hack with just
> selecting, randomly, one of the N internal PHY addresses as the central
> MDIO address should work equally fine regardless of whether we are
> talking about the DSA switch's MDIO address here, or the MFD device's
> MDIO address.
> 
> With MFD you still have the option of creating a fake MDIO controller
> child device, which has mdio-parent-bus = <&host_bus>, and redirecting
> all user port phy-handles to children of this bus. Since all regmap I/O
> of this fake MDIO bus goes to the MFD driver, you can implement there
> your hacks with page switching etc etc, and it should be equally
> safe.

I wonder if a node like this would be more consistent and descriptive?

mdio_bus: mdio-bus {
    #address-cells = <1>;
    #size-cells = <0>;

    ...

    mfd@1 {
            compatible = "airoha,an8855-mfd";
            reg = <1>;

            nvmem_node {
                    ...
            };

            switch_node {
                ports {
                        port@0 {
                                phy-handle = <&phy>;
                        };

                        port@1 {
                                phy-handle = <&phy_2>;
                        }
                };
            };

            phy: phy_node {

            };
    };

    phy_2: phy@2 {
        reg = <2>;
    }

    phy@3 {
        reg = <3>;
    }

    ..
};

No idea how to register that single phy in mfd... I guess a fake mdio is
needed anyway... What do you think of this node example? Or not worth it
and better have the fake MDIO with all the switch PHY in it?
Vladimir Oltean Dec. 5, 2024, 11:57 p.m. UTC | #13
On Thu, Dec 05, 2024 at 08:36:30PM +0100, Christian Marangi wrote:
> > I guess the non-hack solution would be to permit MDIO buses to have
> > #size-cells = 1, and MDIO devices to acquire a range of the address
> > space, rather than just one address. Though take this with a grain of
> > salt, I have a lot more to learn.
> 
> I remember this was an idea when PHY Package API were proposed and was
> rejected as we wanted PHY to be single reg.

Would that effort have helped with MDIO devices, in the way it was proposed?
Why did it die out?

> > If neither of those are options, in principle the hack with just
> > selecting, randomly, one of the N internal PHY addresses as the central
> > MDIO address should work equally fine regardless of whether we are
> > talking about the DSA switch's MDIO address here, or the MFD device's
> > MDIO address.
> > 
> > With MFD you still have the option of creating a fake MDIO controller
> > child device, which has mdio-parent-bus = <&host_bus>, and redirecting
> > all user port phy-handles to children of this bus. Since all regmap I/O
> > of this fake MDIO bus goes to the MFD driver, you can implement there
> > your hacks with page switching etc etc, and it should be equally
> > safe.
> 
> I wonder if a node like this would be more consistent and descriptive?
> 
> mdio_bus: mdio-bus {
>     #address-cells = <1>;
>     #size-cells = <0>;
> 
>     ...
> 
>     mfd@1 {
>             compatible = "airoha,an8855-mfd";
>             reg = <1>;
> 
>             nvmem_node {
>                     ...
>             };
> 
>             switch_node {
>                 ports {
>                         port@0 {
>                                 phy-handle = <&phy>;
>                         };
> 
>                         port@1 {
>                                 phy-handle = <&phy_2>;
>                         }
>                 };
>             };
> 
>             phy: phy_node {
> 
>             };
>     };
> 
>     phy_2: phy@2 {
>         reg = <2>;
>     }
> 
>     phy@3 {
>         reg = <3>;
>     }
> 
>     ..
> };
> 
> No idea how to register that single phy in mfd... I guess a fake mdio is
> needed anyway... What do you think of this node example? Or not worth it
> and better have the fake MDIO with all the switch PHY in it?

Could you work with something like this? dtc seems to swallow it without
any warnings...

mdio_bus: mdio {
        #address-cells = <1>;
        #size-cells = <0>;

        soc@1 {
                compatible = "airoha,an8855";
                reg = <1>, <2>, <3>, <4>;
                reg-names = "phy0", "phy1", "phy2", "phy3";

                nvmem {
                        compatible = "airoha,an8855-nvmem";
                };

                ethernet-switch {
                        compatible = "airoha,an8855-switch";

                        ethernet-ports {
                                #address-cells = <1>;
                                #size-cells = <0>;

                                ethernet-port@0 {
                                        reg = <0>;
                                        phy-handle = <&phy0>;
                                        phy-mode = "internal";
                                };

                                ethernet-port@1 {
                                        reg = <1>;
                                        phy-handle = <&phy1>;
                                        phy-mode = "internal";
                                };

                                ethernet-port@2 {
                                        reg = <2>;
                                        phy-handle = <&phy2>;
                                        phy-mode = "internal";
                                };

                                ethernet-port@3 {
                                        reg = <3>;
                                        phy-handle = <&phy3>;
                                        phy-mode = "internal";
                                };
                        };
                };

                mdio {
                        compatible = "airoha,an8855-mdio";
                        mdio-parent-bus = <&host_mdio>;
                        #address-cells = <1>;
                        #size-cells = <0>;

                        phy0: ethernet-phy@1 {
                                reg = <1>;
                        };

                        phy1: ethernet-phy@2 {
                                reg = <2>;
                        };

                        phy2: ethernet-phy@3 {
                                reg = <3>;
                        };

                        phy3: ethernet-phy@4 {
                                reg = <4>;
                        };
                };
        };
};
Christian Marangi Dec. 7, 2024, 12:11 p.m. UTC | #14
On Fri, Dec 06, 2024 at 01:57:09AM +0200, Vladimir Oltean wrote:
> On Thu, Dec 05, 2024 at 08:36:30PM +0100, Christian Marangi wrote:
> > > I guess the non-hack solution would be to permit MDIO buses to have
> > > #size-cells = 1, and MDIO devices to acquire a range of the address
> > > space, rather than just one address. Though take this with a grain of
> > > salt, I have a lot more to learn.
> > 
> > I remember this was an idea when PHY Package API were proposed and was
> > rejected as we wanted PHY to be single reg.
> 
> Would that effort have helped with MDIO devices, in the way it was proposed?
> Why did it die out?
> 
> > > If neither of those are options, in principle the hack with just
> > > selecting, randomly, one of the N internal PHY addresses as the central
> > > MDIO address should work equally fine regardless of whether we are
> > > talking about the DSA switch's MDIO address here, or the MFD device's
> > > MDIO address.
> > > 
> > > With MFD you still have the option of creating a fake MDIO controller
> > > child device, which has mdio-parent-bus = <&host_bus>, and redirecting
> > > all user port phy-handles to children of this bus. Since all regmap I/O
> > > of this fake MDIO bus goes to the MFD driver, you can implement there
> > > your hacks with page switching etc etc, and it should be equally
> > > safe.
> > 
> > I wonder if a node like this would be more consistent and descriptive?
> > 
> > mdio_bus: mdio-bus {
> >     #address-cells = <1>;
> >     #size-cells = <0>;
> > 
> >     ...
> > 
> >     mfd@1 {
> >             compatible = "airoha,an8855-mfd";
> >             reg = <1>;
> > 
> >             nvmem_node {
> >                     ...
> >             };
> > 
> >             switch_node {
> >                 ports {
> >                         port@0 {
> >                                 phy-handle = <&phy>;
> >                         };
> > 
> >                         port@1 {
> >                                 phy-handle = <&phy_2>;
> >                         }
> >                 };
> >             };
> > 
> >             phy: phy_node {
> > 
> >             };
> >     };
> > 
> >     phy_2: phy@2 {
> >         reg = <2>;
> >     }
> > 
> >     phy@3 {
> >         reg = <3>;
> >     }
> > 
> >     ..
> > };
> > 
> > No idea how to register that single phy in mfd... I guess a fake mdio is
> > needed anyway... What do you think of this node example? Or not worth it
> > and better have the fake MDIO with all the switch PHY in it?
> 
> Could you work with something like this? dtc seems to swallow it without
> any warnings...
> 
> mdio_bus: mdio {
>         #address-cells = <1>;
>         #size-cells = <0>;
> 
>         soc@1 {
>                 compatible = "airoha,an8855";
>                 reg = <1>, <2>, <3>, <4>;
>                 reg-names = "phy0", "phy1", "phy2", "phy3";
> 
>                 nvmem {
>                         compatible = "airoha,an8855-nvmem";
>                 };
> 
>                 ethernet-switch {
>                         compatible = "airoha,an8855-switch";
> 
>                         ethernet-ports {
>                                 #address-cells = <1>;
>                                 #size-cells = <0>;
> 
>                                 ethernet-port@0 {
>                                         reg = <0>;
>                                         phy-handle = <&phy0>;
>                                         phy-mode = "internal";
>                                 };
> 
>                                 ethernet-port@1 {
>                                         reg = <1>;
>                                         phy-handle = <&phy1>;
>                                         phy-mode = "internal";
>                                 };
> 
>                                 ethernet-port@2 {
>                                         reg = <2>;
>                                         phy-handle = <&phy2>;
>                                         phy-mode = "internal";
>                                 };
> 
>                                 ethernet-port@3 {
>                                         reg = <3>;
>                                         phy-handle = <&phy3>;
>                                         phy-mode = "internal";
>                                 };
>                         };
>                 };
> 
>                 mdio {
>                         compatible = "airoha,an8855-mdio";
>                         mdio-parent-bus = <&host_mdio>;
>                         #address-cells = <1>;
>                         #size-cells = <0>;
> 
>                         phy0: ethernet-phy@1 {
>                                 reg = <1>;
>                         };
> 
>                         phy1: ethernet-phy@2 {
>                                 reg = <2>;
>                         };
> 
>                         phy2: ethernet-phy@3 {
>                                 reg = <3>;
>                         };
> 
>                         phy3: ethernet-phy@4 {
>                                 reg = <4>;
>                         };
>                 };
>         };
> };

I finished testing and this works, I'm not using mdio-parent-bus tho as
the mdio-mux driver seems overkill for the task and problematic for PAGE
handling. (mdio-mux doesn't provide a way to give the current addr that
is being accessed)

My big concern is dt_binding_check and how Rob might take this
implementation. We recently had another case with a MFD node and Rob
found some problems in having subnode with compatible but maybe for this
particular complex case it will be O.K.

Still have to check if it's ok to have multiple reg in the mfd root node
(for mdio schema)
Vladimir Oltean Dec. 10, 2024, 8:31 p.m. UTC | #15
On Sat, Dec 07, 2024 at 01:11:23PM +0100, Christian Marangi wrote:
> I finished testing and this works, I'm not using mdio-parent-bus tho as
> the mdio-mux driver seems overkill for the task and problematic for PAGE
> handling. (mdio-mux doesn't provide a way to give the current addr that
> is being accessed)

The use of mdio-parent-bus doesn't necessarily imply an mdio-mux. For
example, you'll also see it used in net/dsa/microchip,ksz.yaml.

You say this switch is also accessible over I2C. How are the internal
PHYs accessed in that case? Still over MDIO? If so, it would be nice to
have a unified scheme for both I2C-controlled switch and MDIO-controlled
switch, which is something that mdio-parent-bus would permit.
Christian Marangi Dec. 10, 2024, 8:56 p.m. UTC | #16
On Tue, Dec 10, 2024 at 10:31:48PM +0200, Vladimir Oltean wrote:
> On Sat, Dec 07, 2024 at 01:11:23PM +0100, Christian Marangi wrote:
> > I finished testing and this works, I'm not using mdio-parent-bus tho as
> > the mdio-mux driver seems overkill for the task and problematic for PAGE
> > handling. (mdio-mux doesn't provide a way to give the current addr that
> > is being accessed)
> 
> The use of mdio-parent-bus doesn't necessarily imply an mdio-mux. For
> example, you'll also see it used in net/dsa/microchip,ksz.yaml.
> 
> You say this switch is also accessible over I2C. How are the internal
> PHYs accessed in that case? Still over MDIO? If so, it would be nice to
> have a unified scheme for both I2C-controlled switch and MDIO-controlled
> switch, which is something that mdio-parent-bus would permit.

What is not clear to me is where that property should be used in the
code or it's just for reference in DT to describe what is the parent?

Given MFD implementation, I pass the bus by accessing the MFD priv
struct so having the parent property is redundant.