Message ID | E1qTKdM-003Cpx-Eh@rmk-PC.armlinux.org.uk (mailing list archive) |
---|---|
State | Accepted |
Commit | 145622771d22a7ad5061278eb6fb16396c29d308 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net-next] net: dsa: mark parsed interface mode for legacy switch drivers | expand |
On Tue, Aug 08, 2023 at 03:06:52PM +0300, Vladimir Oltean wrote: > Hi Russell, > > On Tue, Aug 08, 2023 at 12:12:16PM +0100, Russell King (Oracle) wrote: > > If we successfully parsed an interface mode with a legacy switch > > driver, populate that mode into phylink's supported interfaces rather > > than defaulting to the internal and gmii interfaces. > > > > This hasn't caused an issue so far, because when the interface doesn't > > match a supported one, phylink_validate() doesn't clear the supported > > mask, but instead returns -EINVAL. phylink_parse_fixedlink() doesn't > > check this return value, and merely relies on the supported ethtool > > link modes mask being cleared. Therefore, the fixed link settings end > > up being allowed despite validation failing. > > > > Before this causes a problem, arrange for DSA to more accurately > > populate phylink's supported interfaces mask so validation can > > correctly succeed. > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > --- > > How did you notice this? Is there any unconverted DSA switch which has a > phy-mode which isn't PHY_INTERFACE_MODE_INTERNAL or PHY_INTERFACE_MODE_NA? By looking at some of the legacy drivers, finding their DT compatibles and then grepping the dts files. For example, vitesse,vsc73* compatibles show up here: arch/arm/boot/dts/gemini/gemini-sq201.dts and generally, the ports are listed as: port@0 { reg = <0>; label = "lan1"; }; except for the CPU port which has: vsc: port@6 { reg = <6>; label = "cpu"; ethernet = <&gmac1>; phy-mode = "rgmii"; fixed-link { speed = <1000>; full-duplex; pause; }; }; Since the vitesse DSA driver doesn't populate .phylink_get_caps, it would have been failing as you discovered with dsa_loop before the previous patch. Fixing this by setting GMII and INTERNAL worked around the additional check that was using that failure and will work fine for the LAN ports as listed above. However, that CPU port uses "rgmii" which doesn't match the GMII and INTERNAL bits in the supported mask. Since phylink_validate() does this: const unsigned long *interfaces = pl->config->supported_interfaces; if (state->interface == PHY_INTERFACE_MODE_NA) ... it isn't, so we move on... if (!test_bit(state->interface, interfaces)) return -EINVAL; This will trigger and phylink_validate() in phylink_parse_fixedlink() will return -EINVAL without touching the passed supported mask. phylink_parse_fixedlink() does: bitmap_fill(pl->supported, __ETHTOOL_LINK_MODE_MASK_NBITS); linkmode_copy(pl->link_config.advertising, pl->supported); phylink_validate(pl, MLO_AN_FIXED, pl->supported, &pl->link_config); and then we have: s = phy_lookup_setting(pl->link_config.speed, pl->link_config.duplex, pl->supported, true); ... if (s) { ... success ... } else { phylink_warn(pl, "fixed link %s duplex %dMbps not recognised\n", pl->link_config.duplex == DUPLEX_FULL ? "full" : "half", pl->link_config.speed); } So, since phylink_validate() with an apparently unsupported interface exits early with -EINVAL, pl->supported ends up with all bits set, and phy_lookup_setting() allows any speed. If someone decides to fix that phylink_validate() error checking, then this will then lead to a warning/failure. I want to avoid that happening - fixing that latent bug before it becomes a problem.
Hello: This patch was applied to netdev/net-next.git (main) by Jakub Kicinski <kuba@kernel.org>: On Tue, 08 Aug 2023 12:12:16 +0100 you wrote: > If we successfully parsed an interface mode with a legacy switch > driver, populate that mode into phylink's supported interfaces rather > than defaulting to the internal and gmii interfaces. > > This hasn't caused an issue so far, because when the interface doesn't > match a supported one, phylink_validate() doesn't clear the supported > mask, but instead returns -EINVAL. phylink_parse_fixedlink() doesn't > check this return value, and merely relies on the supported ethtool > link modes mask being cleared. Therefore, the fixed link settings end > up being allowed despite validation failing. > > [...] Here is the summary with links: - [net-next] net: dsa: mark parsed interface mode for legacy switch drivers https://git.kernel.org/netdev/net-next/c/145622771d22 You are awesome, thank you!
Hi Russell, On Tue, Aug 08, 2023 at 03:19:20PM +0100, Russell King (Oracle) wrote: > On Tue, Aug 08, 2023 at 04:52:15PM +0300, Vladimir Oltean wrote: > > On Tue, Aug 08, 2023 at 01:57:43PM +0100, Russell King (Oracle) wrote: > > > Thanks for the r-b. > > > > > > At risk of delaying this patch through further discussion... so I'll > > > say now that we're going off into discussions about future changes. > > > > > > I believe all DSA drivers that provide .phylink_get_caps fill in the > > > .mac_capabilities member, which leaves just a few drivers that do not, > > > which are: > > > > > > $ git grep -l dsa_switch_ops.*= drivers/net/dsa/ | xargs grep -L '\.phylink_get_caps' > > > drivers/net/dsa/dsa_loop.c > > > drivers/net/dsa/mv88e6060.c > > > drivers/net/dsa/realtek/rtl8366rb.c > > > drivers/net/dsa/vitesse-vsc73xx-core.c > > > > > > I've floated the idea to Linus W and Arinc about setting > > > .mac_capabilities in the non-phylink_get_caps path as well, suggesting: > > > > Not sure what you mean by "in the non-phylink_get_caps path" (what is > > that other path). Don't you mean that we should implement phylink_get_caps() > > for these drivers, to have a unified code flow for everyone? > > I meant this: > > /* For legacy drivers */ > if (mode != PHY_INTERFACE_MODE_NA) { > __set_bit(mode, dp->pl_config.supported_interfaces); > } else { > __set_bit(PHY_INTERFACE_MODE_INTERNAL, > dp->pl_config.supported_interfaces); > __set_bit(PHY_INTERFACE_MODE_GMII, > dp->pl_config.supported_interfaces); > } Ah, ok, you'd like a built-in assumption of the mac_capabilities in dsa_port_phylink_create(). > but ultimately yes, having the DSA phylink_get_caps method mandatory > would be excellent, but I don't think we have sufficient information > to do that. > > For example, what interface modes does the Vitesse DSA switch support? > What speeds? Does it support pause? Does it vary depending on port? I only have a VSC7395 datasheet which was shared with me by Linus (and that link is no longer functional). This switch supports MII/REV-MII/GMII/RGMII on MAC 6, and MACs 0-4 are connected to internal PHYs (yes, there is no port 5, also see the comment in vsc73xx_probe()). Based on vsc73xx_init_port() and vsc73xx_adjust_enable_port(), I guess all ports support flow control, and thus, PHYs should advertise it. I don't have a datasheet for the other switches supported by the driver: * Vitesse VSC7385 SparX-G5 5+1-port Integrated Gigabit Ethernet Switch * Vitesse VSC7388 SparX-G8 8-port Integrated Gigabit Ethernet Switch * Vitesse VSC7395 SparX-G5e 5+1-port Integrated Gigabit Ethernet Switch * Vitesse VSC7398 SparX-G8e 8-port Integrated Gigabit Ethernet Switch but based on the common treatment in vsc73xx_init_port(), I'd say that on all models, port 6 (CPU_PORT) is the xMII port and all the others are internal PHY ports, and all support the same configuration. So a phylink_get_caps() implementation could probably also do one of two things, based on "if (port == CPU_PORT)". > > > MAC_1000 | MAC_100 | MAC_10 | MAC_SYM_PAUSE | MAC_ASYM_PAUSE > > > > > > support more than 1G speeds. I think the only exception to that may > > > be dsa_loop, but as I think that makes use of the old fixed-link > > > software emulated PHYs, I believe that would be limited to max. 1G > > > as well. > > > > I don't believe that dsa_loop makes use of fixed-link at all. Its user > > ports use phy/gmii mode through the non-OF-based dsa_slave_phy_connect() > > to the ds->slave_mii_bus, and the CPU port goes through the non-OF code > > path ("else" block) here (because dsa_loop_bdinfo.c _is_ non-OF-based): > > Sorry, I meant fixed-phy not fixed-link. > > > > > dsa_port_setup: > > case DSA_PORT_TYPE_CPU: > > if (dp->dn) { > > err = dsa_shared_port_link_register_of(dp); > > if (err) > > break; > > dsa_port_link_registered = true; > > } else { > > dev_warn(ds->dev, > > "skipping link registration for CPU port %d\n", > > dp->index); > > } > > What made me believe that it uses the old fixed-phy stuff is: > > static int __init dsa_loop_init(void) > ... > for (i = 0; i < NUM_FIXED_PHYS; i++) > phydevs[i] = fixed_phy_register(PHY_POLL, &status, NULL); > > These PHYs end up on the "fixed-0" virtual MDIO bus, which also has a > MDIO device created for the dsa-loop driver at address 31. Thus, in > dsa_loop_drv_probe(): > > ps->bus = mdiodev->bus; > > is the fixed-0 bus with these fixed-PHYs on, and dsa_loop_phy_read() > and dsa_loop_phy_write() access these fixed PHYs. > > These fixed PHYs are clause-22 PHYs, which only support up to 1G > speeds. Therefore, it is my understanding that dsa-loop will only > support up to 1G speeds. Clear now. Yes, this is correct. > > > If we did set .mac_capabilities, then dsa_port_phylink_validate() would > > > always call phylink_generic_validate() for all DSA drivers, and at that > > > point, we don't need dsa_port_phylink_validate() anymore as it provides > > > nothing that isn't already done inside phylink. > > > > > > Once dsa_port_phylink_validate() is gone, then I believe there are no > > > drivers populating the .validate method in phylink_mac_ops, which > > > then means there is the possibility to remove that method. > > > > Assuming I understand correctly, I agree it would be beneficial for > > mv88e6060, rtl8366rb and vsc73xx to populate mac_capabilities and > > supported_interfaces. > > ... which we can only do if someone can furnish information on what > these support. Short of that, we would need something in the core > DSA code (like we're doing for the supported_interfaces) that would > allow them to continue working until .phylink_get_caps could be > reasonably implemented for them. > > Providing a legacy .phylink_get_caps would also be a possibility. > Maybe something like this: > > void legacy_dsa_phylink_get_caps(struct dsa_switch *ds, int port, > struct phylink_config *config) > { > struct dsa_port *dp = dsa_to_port(ds, port); > phy_interface_t mode; > int err; > > err = of_get_phy_mode(dp->dn, &mode); > if (!err) { > __set_bit(mode, config->supported_interfaces); > } else { > __set_bit(PHY_INTERFACE_MODE_INTERNAL, > config->supported_interfaces); > __set_bit(PHY_INTERFACE_MODE_GMII, > config->supported_interfaces); > } > > config->mac_capabilities = MAC_1000 | MAC_100 | MAC_10 | > MAC_SYM_PAUSE | MAC_ASYM_PAUSE; > } > > and then dsa_port_phylink_create() always calls phylink_get_caps: > > - if (ds->ops->phylink_get_caps) { > - ds->ops->phylink_get_caps(ds, dp->index, &dp->pl_config); > - } else { > - ... > - } > + ds->ops->phylink_get_caps(ds, dp->index, &dp->pl_config); That could be an option, but I think the volume of switches is low enough that we could just consider converting them all. I see you've sent a mv88e6060 patch, I'll go review that now.
On Thu, Aug 10, 2023 at 06:16:17PM +0300, Vladimir Oltean wrote: > Hi Russell, > > On Tue, Aug 08, 2023 at 03:19:20PM +0100, Russell King (Oracle) wrote: > > On Tue, Aug 08, 2023 at 04:52:15PM +0300, Vladimir Oltean wrote: > > > On Tue, Aug 08, 2023 at 01:57:43PM +0100, Russell King (Oracle) wrote: > > > > Thanks for the r-b. > > > > > > > > At risk of delaying this patch through further discussion... so I'll > > > > say now that we're going off into discussions about future changes. > > > > > > > > I believe all DSA drivers that provide .phylink_get_caps fill in the > > > > .mac_capabilities member, which leaves just a few drivers that do not, > > > > which are: > > > > > > > > $ git grep -l dsa_switch_ops.*= drivers/net/dsa/ | xargs grep -L '\.phylink_get_caps' > > > > drivers/net/dsa/dsa_loop.c > > > > drivers/net/dsa/mv88e6060.c > > > > drivers/net/dsa/realtek/rtl8366rb.c > > > > drivers/net/dsa/vitesse-vsc73xx-core.c > > > > > > > > I've floated the idea to Linus W and Arinc about setting > > > > .mac_capabilities in the non-phylink_get_caps path as well, suggesting: > > > > > > Not sure what you mean by "in the non-phylink_get_caps path" (what is > > > that other path). Don't you mean that we should implement phylink_get_caps() > > > for these drivers, to have a unified code flow for everyone? > > > > I meant this: > > > > /* For legacy drivers */ > > if (mode != PHY_INTERFACE_MODE_NA) { > > __set_bit(mode, dp->pl_config.supported_interfaces); > > } else { > > __set_bit(PHY_INTERFACE_MODE_INTERNAL, > > dp->pl_config.supported_interfaces); > > __set_bit(PHY_INTERFACE_MODE_GMII, > > dp->pl_config.supported_interfaces); > > } > > Ah, ok, you'd like a built-in assumption of the mac_capabilities in > dsa_port_phylink_create(). > > > but ultimately yes, having the DSA phylink_get_caps method mandatory > > would be excellent, but I don't think we have sufficient information > > to do that. > > > > For example, what interface modes does the Vitesse DSA switch support? > > What speeds? Does it support pause? Does it vary depending on port? > > I only have a VSC7395 datasheet which was shared with me by Linus (and > that link is no longer functional). > > This switch supports MII/REV-MII/GMII/RGMII on MAC 6, and MACs 0-4 are > connected to internal PHYs (yes, there is no port 5, also see the > comment in vsc73xx_probe()). > > Based on vsc73xx_init_port() and vsc73xx_adjust_enable_port(), I guess > all ports support flow control, and thus, PHYs should advertise it. > > I don't have a datasheet for the other switches supported by the driver: > > * Vitesse VSC7385 SparX-G5 5+1-port Integrated Gigabit Ethernet Switch > * Vitesse VSC7388 SparX-G8 8-port Integrated Gigabit Ethernet Switch > * Vitesse VSC7395 SparX-G5e 5+1-port Integrated Gigabit Ethernet Switch > * Vitesse VSC7398 SparX-G8e 8-port Integrated Gigabit Ethernet Switch > > but based on the common treatment in vsc73xx_init_port(), I'd say that > on all models, port 6 (CPU_PORT) is the xMII port and all the others are > internal PHY ports, and all support the same configuration. So a > phylink_get_caps() implementation could probably also do one of two > things, based on "if (port == CPU_PORT)". ... > That could be an option, but I think the volume of switches is low > enough that we could just consider converting them all. It's actually better - the vitesse driver uses .adjust_link, which means it's excluded from phylink for the DSA/CPU ports. So, I think for Vitesse, we just need to set INTERNAL and GMII for ports != CPU_PORT, speeds 10..1000Mbps at FD and HD, and also sym and asym pause. That leaves the RTL836x driver, for which I've found: http://realtek.info/pdf/rtl8366_8369_datasheet_1-1.pdf and that indicates that the user ports use RSGMII which is SGMII with a clock in one direction. The only dts I can find is: arch/arm/boot/dts/gemini-dlink-dir-685.dts which doesn't specify phy-mode for these, so that'll be using the phylib default of GMII. Port 5 supports MII/GMII/RGMII by hardware strapping, which has three modes of operation: MII/GMII (mac mode): 1G (GMII) when linked at 1G, otherwise 100M (MII) RGMII: only 1G MII (phy mode): only 100M FD supported. Flow control by hardware strapping but is readable via a register, but omits to say where. There's also some suggestion that asym flow control is supported in 1G mode - but it doesn't say whether it's supported in 100M (and since IEEE 802.3 advertisements do not make this conditional on speed... yea, sounds like a slightly broken design to me.) So for realtek, I propose (completely untested): 8<==== From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> Subject: [PATCH net-next] net: dsa: realtek: add phylink_get_caps implementation The user ports use RSGMII, but we don't have that, and DT doesn't specify a phy interface mode, so phylib defaults to GMII. These support 1G, 100M and 10M with flow control. It is unknown whether asymetric pause is supported at all speeds. The CPU port uses MII/GMII/RGMII/REVMII by hardware pin strapping, and support speeds specific to each, with full duplex only supported in some modes. Flow control may be supported again by hardware pin strapping, and theoretically is readable through a register but no information is given in the datasheet for that. So, we do a best efforts - and be lenient. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> --- drivers/net/dsa/realtek/rtl8366rb.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+) diff --git a/drivers/net/dsa/realtek/rtl8366rb.c b/drivers/net/dsa/realtek/rtl8366rb.c index 25f88022b9e4..76b5c43e1430 100644 --- a/drivers/net/dsa/realtek/rtl8366rb.c +++ b/drivers/net/dsa/realtek/rtl8366rb.c @@ -1049,6 +1049,32 @@ static enum dsa_tag_protocol rtl8366_get_tag_protocol(struct dsa_switch *ds, return DSA_TAG_PROTO_RTL4_A; } +static void rtl8366rb_phylink_get_caps(struct dsa_switch *ds, int port, + struct phylink_config *config) +{ + unsigned long *interfaces = config->supported_interfaces; + struct realtek_priv *priv = ds->priv; + + if (port == priv->cpu_port) { + __set_bit(PHY_INTERFACE_MODE_MII, interfaces); + __set_bit(PHY_INTERFACE_MODE_GMII, interfaces); + /* Only supports 100M FD */ + __set_bit(PHY_INTERFACE_MODE_REVMII, interfaces); + /* Only supports 1G FD */ + __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); + + config->mac_capabilities = MAC_1000 | MAC_100 | + MAC_SYM_PAUSE; + } + + /* RSGMII port, but we don't have that, and we don't + * specify in DT, so phylib uses the default of GMII + */ + __set_bit(PHY_INTERFACE_MODE_GMII, interfaces); + config->mac_capabilities = MAC_1000 | MAC_100 | MAC_10 | + MAC_SYM_PAUSE | MAC_ASYM_PAUSE; +} + static void rtl8366rb_mac_link_up(struct dsa_switch *ds, int port, unsigned int mode, phy_interface_t interface, struct phy_device *phydev, @@ -1796,6 +1822,7 @@ static int rtl8366rb_detect(struct realtek_priv *priv) static const struct dsa_switch_ops rtl8366rb_switch_ops_smi = { .get_tag_protocol = rtl8366_get_tag_protocol, .setup = rtl8366rb_setup, + .phylink_get_caps = rtl8366rb_phylink_get_caps, .phylink_mac_link_up = rtl8366rb_mac_link_up, .phylink_mac_link_down = rtl8366rb_mac_link_down, .get_strings = rtl8366_get_strings, @@ -1821,6 +1848,7 @@ static const struct dsa_switch_ops rtl8366rb_switch_ops_mdio = { .setup = rtl8366rb_setup, .phy_read = rtl8366rb_dsa_phy_read, .phy_write = rtl8366rb_dsa_phy_write, + .phylink_get_caps = rtl8366rb_phylink_get_caps, .phylink_mac_link_up = rtl8366rb_mac_link_up, .phylink_mac_link_down = rtl8366rb_mac_link_down, .get_strings = rtl8366_get_strings,
On Sat, Aug 12, 2023 at 01:16:00PM +0100, Russell King (Oracle) wrote: > It's actually better - the vitesse driver uses .adjust_link, which > means it's excluded from phylink for the DSA/CPU ports. > > So, I think for Vitesse, we just need to set INTERNAL and GMII > for ports != CPU_PORT, speeds 10..1000Mbps at FD and HD, and also > sym and asym pause. Ok. > That leaves the RTL836x driver, for which I've found: > > http://realtek.info/pdf/rtl8366_8369_datasheet_1-1.pdf > > and that indicates that the user ports use RSGMII which is SGMII with > a clock in one direction. The only dts I can find is: > > arch/arm/boot/dts/gemini-dlink-dir-685.dts > > which doesn't specify phy-mode for these, so that'll be using the > phylib default of GMII. > > Port 5 supports MII/GMII/RGMII by hardware strapping, which has three > modes of operation: > > MII/GMII (mac mode): 1G (GMII) when linked at 1G, otherwise 100M (MII) > RGMII: only 1G > MII (phy mode): only 100M FD supported. Flow control by hardware > strapping but is readable via a register, but omits to say where. > > There's also some suggestion that asym flow control is supported in 1G > mode - but it doesn't say whether it's supported in 100M (and since > IEEE 802.3 advertisements do not make this conditional on speed... > yea, sounds like a slightly broken design to me.) > > So for realtek, I propose (completely untested): > > 8<==== > From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> > Subject: [PATCH net-next] net: dsa: realtek: add phylink_get_caps > implementation > > The user ports use RSGMII, but we don't have that, and DT doesn't > specify a phy interface mode, so phylib defaults to GMII. These support > 1G, 100M and 10M with flow control. It is unknown whether asymetric > pause is supported at all speeds. > > The CPU port uses MII/GMII/RGMII/REVMII by hardware pin strapping, > and support speeds specific to each, with full duplex only supported > in some modes. Flow control may be supported again by hardware pin > strapping, and theoretically is readable through a register but no > information is given in the datasheet for that. > > So, we do a best efforts - and be lenient. > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > drivers/net/dsa/realtek/rtl8366rb.c | 28 ++++++++++++++++++++++++++++ > 1 file changed, 28 insertions(+) > > diff --git a/drivers/net/dsa/realtek/rtl8366rb.c b/drivers/net/dsa/realtek/rtl8366rb.c > index 25f88022b9e4..76b5c43e1430 100644 > --- a/drivers/net/dsa/realtek/rtl8366rb.c > +++ b/drivers/net/dsa/realtek/rtl8366rb.c > @@ -1049,6 +1049,32 @@ static enum dsa_tag_protocol rtl8366_get_tag_protocol(struct dsa_switch *ds, > return DSA_TAG_PROTO_RTL4_A; > } > > +static void rtl8366rb_phylink_get_caps(struct dsa_switch *ds, int port, > + struct phylink_config *config) > +{ > + unsigned long *interfaces = config->supported_interfaces; > + struct realtek_priv *priv = ds->priv; > + > + if (port == priv->cpu_port) { > + __set_bit(PHY_INTERFACE_MODE_MII, interfaces); > + __set_bit(PHY_INTERFACE_MODE_GMII, interfaces); > + /* Only supports 100M FD */ > + __set_bit(PHY_INTERFACE_MODE_REVMII, interfaces); > + /* Only supports 1G FD */ > + __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); > + > + config->mac_capabilities = MAC_1000 | MAC_100 | > + MAC_SYM_PAUSE; Missing "return" statement here. > + } > + > + /* RSGMII port, but we don't have that, and we don't > + * specify in DT, so phylib uses the default of GMII > + */ > + __set_bit(PHY_INTERFACE_MODE_GMII, interfaces); > + config->mac_capabilities = MAC_1000 | MAC_100 | MAC_10 | > + MAC_SYM_PAUSE | MAC_ASYM_PAUSE; > +} > + > static void > rtl8366rb_mac_link_up(struct dsa_switch *ds, int port, unsigned int mode, > phy_interface_t interface, struct phy_device *phydev, > @@ -1796,6 +1822,7 @@ static int rtl8366rb_detect(struct realtek_priv *priv) > static const struct dsa_switch_ops rtl8366rb_switch_ops_smi = { > .get_tag_protocol = rtl8366_get_tag_protocol, > .setup = rtl8366rb_setup, > + .phylink_get_caps = rtl8366rb_phylink_get_caps, > .phylink_mac_link_up = rtl8366rb_mac_link_up, > .phylink_mac_link_down = rtl8366rb_mac_link_down, > .get_strings = rtl8366_get_strings, > @@ -1821,6 +1848,7 @@ static const struct dsa_switch_ops rtl8366rb_switch_ops_mdio = { > .setup = rtl8366rb_setup, > .phy_read = rtl8366rb_dsa_phy_read, > .phy_write = rtl8366rb_dsa_phy_write, > + .phylink_get_caps = rtl8366rb_phylink_get_caps, > .phylink_mac_link_up = rtl8366rb_mac_link_up, > .phylink_mac_link_down = rtl8366rb_mac_link_down, > .get_strings = rtl8366_get_strings, > -- > 2.30.2 > > > -- > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/ > FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Sat, Aug 12, 2023 at 2:16 PM Russell King (Oracle) <linux@armlinux.org.uk> wrote: > That leaves the RTL836x driver, for which I've found: > > http://realtek.info/pdf/rtl8366_8369_datasheet_1-1.pdf > > and that indicates that the user ports use RSGMII which is SGMII with > a clock in one direction. Sadly that datasheet has been pretty far off the RTL8366RB, the "RB" in the end means "revision B" and things changed a lot there. What I mostly used was a DD-WRT vendor code drop, which is pretty terse, but can be used for guesswork: https://svn.dd-wrt.com//browser/src/linux/universal/linux-3.2/drivers/net/ethernet/raeth/rb > The only dts I can find is: > > arch/arm/boot/dts/gemini-dlink-dir-685.dts > > which doesn't specify phy-mode for these, so that'll be using the > phylib default of GMII. Hm. That file is my educated guesses and trial-and-error at times, due to lack of documentation. It shouldn't be trusted too much. > So for realtek, I propose (completely untested): I applied it and it all works fine afterwards on the DIR-685. Should I test some different configs in the DTS as well? Yours, Linus Walleij
On Sun, Aug 13, 2023 at 11:56:40PM +0200, Linus Walleij wrote: > On Sat, Aug 12, 2023 at 2:16 PM Russell King (Oracle) > <linux@armlinux.org.uk> wrote: > > > That leaves the RTL836x driver, for which I've found: > > > > http://realtek.info/pdf/rtl8366_8369_datasheet_1-1.pdf > > > > and that indicates that the user ports use RSGMII which is SGMII with > > a clock in one direction. > > Sadly that datasheet has been pretty far off the RTL8366RB, > the "RB" in the end means "revision B" and things changed a > lot there. > > What I mostly used was a DD-WRT vendor code drop, which is pretty > terse, but can be used for guesswork: > https://svn.dd-wrt.com//browser/src/linux/universal/linux-3.2/drivers/net/ethernet/raeth/rb > > > The only dts I can find is: > > > > arch/arm/boot/dts/gemini-dlink-dir-685.dts > > > > which doesn't specify phy-mode for these, so that'll be using the > > phylib default of GMII. > > Hm. That file is my educated guesses and trial-and-error at times, > due to lack of documentation. It shouldn't be trusted too much. > > > So for realtek, I propose (completely untested): > > I applied it and it all works fine afterwards on the DIR-685. > Should I test some different configs in the DTS as well? Note Vladimir's comment about the missing "return" - he's correct. It would be good to test all combinations that we're aware of users for, if that's somehow possible? I'm guessing the only one we know about is yours above?
On Sat, Aug 12, 2023 at 01:16:00PM +0100, Russell King (Oracle) wrote: > So for realtek, I propose (completely untested): > > 8<==== > From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> > Subject: [PATCH net-next] net: dsa: realtek: add phylink_get_caps > implementation > > The user ports use RSGMII, but we don't have that, and DT doesn't > specify a phy interface mode, so phylib defaults to GMII. These support > 1G, 100M and 10M with flow control. It is unknown whether asymetric > pause is supported at all speeds. > > The CPU port uses MII/GMII/RGMII/REVMII by hardware pin strapping, > and support speeds specific to each, with full duplex only supported > in some modes. Flow control may be supported again by hardware pin > strapping, and theoretically is readable through a register but no > information is given in the datasheet for that. > > So, we do a best efforts - and be lenient. > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > drivers/net/dsa/realtek/rtl8366rb.c | 28 ++++++++++++++++++++++++++++ > 1 file changed, 28 insertions(+) > > diff --git a/drivers/net/dsa/realtek/rtl8366rb.c b/drivers/net/dsa/realtek/rtl8366rb.c > index 25f88022b9e4..76b5c43e1430 100644 > --- a/drivers/net/dsa/realtek/rtl8366rb.c > +++ b/drivers/net/dsa/realtek/rtl8366rb.c > @@ -1049,6 +1049,32 @@ static enum dsa_tag_protocol rtl8366_get_tag_protocol(struct dsa_switch *ds, > return DSA_TAG_PROTO_RTL4_A; > } > > +static void rtl8366rb_phylink_get_caps(struct dsa_switch *ds, int port, > + struct phylink_config *config) > +{ > + unsigned long *interfaces = config->supported_interfaces; > + struct realtek_priv *priv = ds->priv; > + > + if (port == priv->cpu_port) { > + __set_bit(PHY_INTERFACE_MODE_MII, interfaces); > + __set_bit(PHY_INTERFACE_MODE_GMII, interfaces); > + /* Only supports 100M FD */ > + __set_bit(PHY_INTERFACE_MODE_REVMII, interfaces); > + /* Only supports 1G FD */ > + __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); also, I guess that this should allow all 4 variants of RGMII. > + > + config->mac_capabilities = MAC_1000 | MAC_100 | > + MAC_SYM_PAUSE; > + } > + > + /* RSGMII port, but we don't have that, and we don't > + * specify in DT, so phylib uses the default of GMII > + */ > + __set_bit(PHY_INTERFACE_MODE_GMII, interfaces); > + config->mac_capabilities = MAC_1000 | MAC_100 | MAC_10 | > + MAC_SYM_PAUSE | MAC_ASYM_PAUSE; > +} > + > static void > rtl8366rb_mac_link_up(struct dsa_switch *ds, int port, unsigned int mode, > phy_interface_t interface, struct phy_device *phydev, > @@ -1796,6 +1822,7 @@ static int rtl8366rb_detect(struct realtek_priv *priv) > static const struct dsa_switch_ops rtl8366rb_switch_ops_smi = { > .get_tag_protocol = rtl8366_get_tag_protocol, > .setup = rtl8366rb_setup, > + .phylink_get_caps = rtl8366rb_phylink_get_caps, > .phylink_mac_link_up = rtl8366rb_mac_link_up, > .phylink_mac_link_down = rtl8366rb_mac_link_down, > .get_strings = rtl8366_get_strings, > @@ -1821,6 +1848,7 @@ static const struct dsa_switch_ops rtl8366rb_switch_ops_mdio = { > .setup = rtl8366rb_setup, > .phy_read = rtl8366rb_dsa_phy_read, > .phy_write = rtl8366rb_dsa_phy_write, > + .phylink_get_caps = rtl8366rb_phylink_get_caps, > .phylink_mac_link_up = rtl8366rb_mac_link_up, > .phylink_mac_link_down = rtl8366rb_mac_link_down, > .get_strings = rtl8366_get_strings, > -- > 2.30.2
On Mon, Aug 14, 2023 at 05:59:48PM +0300, Vladimir Oltean wrote: > On Sat, Aug 12, 2023 at 01:16:00PM +0100, Russell King (Oracle) wrote: > > So for realtek, I propose (completely untested): > > > > 8<==== > > From: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk> > > Subject: [PATCH net-next] net: dsa: realtek: add phylink_get_caps > > implementation > > > > The user ports use RSGMII, but we don't have that, and DT doesn't > > specify a phy interface mode, so phylib defaults to GMII. These support > > 1G, 100M and 10M with flow control. It is unknown whether asymetric > > pause is supported at all speeds. > > > > The CPU port uses MII/GMII/RGMII/REVMII by hardware pin strapping, > > and support speeds specific to each, with full duplex only supported > > in some modes. Flow control may be supported again by hardware pin > > strapping, and theoretically is readable through a register but no > > information is given in the datasheet for that. > > > > So, we do a best efforts - and be lenient. > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > --- > > drivers/net/dsa/realtek/rtl8366rb.c | 28 ++++++++++++++++++++++++++++ > > 1 file changed, 28 insertions(+) > > > > diff --git a/drivers/net/dsa/realtek/rtl8366rb.c b/drivers/net/dsa/realtek/rtl8366rb.c > > index 25f88022b9e4..76b5c43e1430 100644 > > --- a/drivers/net/dsa/realtek/rtl8366rb.c > > +++ b/drivers/net/dsa/realtek/rtl8366rb.c > > @@ -1049,6 +1049,32 @@ static enum dsa_tag_protocol rtl8366_get_tag_protocol(struct dsa_switch *ds, > > return DSA_TAG_PROTO_RTL4_A; > > } > > > > +static void rtl8366rb_phylink_get_caps(struct dsa_switch *ds, int port, > > + struct phylink_config *config) > > +{ > > + unsigned long *interfaces = config->supported_interfaces; > > + struct realtek_priv *priv = ds->priv; > > + > > + if (port == priv->cpu_port) { > > + __set_bit(PHY_INTERFACE_MODE_MII, interfaces); > > + __set_bit(PHY_INTERFACE_MODE_GMII, interfaces); > > + /* Only supports 100M FD */ > > + __set_bit(PHY_INTERFACE_MODE_REVMII, interfaces); > > + /* Only supports 1G FD */ > > + __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); > > also, I guess that this should allow all 4 variants of RGMII. I'm not sure - looking at what's available, the RTL8366 datasheet (not RB) says that there's pinstrapping for the RGMII delays. It also suggests that there may be a register that can be modified for this, but the driver doesn't appear to touch it - in fact, it does nothing with the interface mode. Moreover, the only in-kernel DT for this has: rtl8366rb_cpu_port: port@5 { reg = <5>; label = "cpu"; ethernet = <&gmac0>; phy-mode = "rgmii"; fixed-link { speed = <1000>; full-duplex; pause; }; }; Whether that can be changed in the RB version of the device or not, I don't know, so whether it makes sense to allow the other RGMII modes, again, I don't know. Annoyingly, gmac0 doesn't exist in this file, it's defined in gemini.dtsi, which this file references through a heirarchy of nodes (makes it very much less readable), but it points at: / { ... soc { ... ethernet@60000000 { ... ethernet-port@0 { phy-mode = "rgmii"; fixed-link { speed = <1000>; full-duplex; pause; }; }; So that also uses "rgmii". I'm tempted not to allow the others as the driver doesn't make any adjustments, and we only apparently have the one user.
> > > + __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); > > > > also, I guess that this should allow all 4 variants of RGMII. > > I'm not sure - looking at what's available, the RTL8366 datasheet (not > RB) says that there's pinstrapping for the RGMII delays. It also suggests > that there may be a register that can be modified for this, but the driver > doesn't appear to touch it - in fact, it does nothing with the interface > mode. Moreover, the only in-kernel DT for this has: > > rtl8366rb_cpu_port: port@5 { > reg = <5>; > label = "cpu"; > ethernet = <&gmac0>; > phy-mode = "rgmii"; > fixed-link { > speed = <1000>; > full-duplex; > pause; > }; > }; > > Whether that can be changed in the RB version of the device or not, I > don't know, so whether it makes sense to allow the other RGMII modes, > again, I don't know. > > Annoyingly, gmac0 doesn't exist in this file, it's defined in > gemini.dtsi, which this file references through a heirarchy of nodes > (makes it very much less readable), but it points at: > > / { > ... > soc { > ... > ethernet@60000000 { > ... > ethernet-port@0 { > phy-mode = "rgmii"; > fixed-link { > speed = <1000>; > full-duplex; > pause; > }; > }; > > So that also uses "rgmii". > > I'm tempted not to allow the others as the driver doesn't make any > adjustments, and we only apparently have the one user. RGMII on both ends is unlikely to work, so probably one is wrong. Probably the switch has strapping to set it to rgmii-id, but we don't actually know that. And i guess we have no ability to find out the truth? So a narrow definition seems reasonable at the moment, to raise a red warning flag if somebody does try to use rgmii-id which is not actually implemented in the driver. And that user then gets to sort out the problem. Andrew
On Mon, Aug 14, 2023 at 04:12:40PM +0100, Russell King (Oracle) wrote: > > > + __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); > > > > also, I guess that this should allow all 4 variants of RGMII. > > I'm not sure - looking at what's available, the RTL8366 datasheet (not > RB) says that there's pinstrapping for the RGMII delays. It also suggests > that there may be a register that can be modified for this, but the driver > doesn't appear to touch it - in fact, it does nothing with the interface > mode. Moreover, the only in-kernel DT for this has: > Whether that can be changed in the RB version of the device or not, I > don't know, so whether it makes sense to allow the other RGMII modes, > again, I don't know. > > Annoyingly, gmac0 doesn't exist in this file, it's defined in > gemini.dtsi, which this file references through a heirarchy of nodes > (makes it very much less readable), but it points at: > > So that also uses "rgmii". ... so one of them must be wrong.. > > I'm tempted not to allow the others as the driver doesn't make any > adjustments, and we only apparently have the one user. I believe you were of the opinion that the RGMII delays in the phy-mode are from the perspective of the other end of the RGMII connection - i.e. 'rgmii-rxid' means that [ the other end, or the board serpentine traces ] have set up a clock skew on our RX_CLK relative to our RXD[3:0]. In that interpretation, it doesn't matter whether we're doing anything in the 4 different phy-modes for rgmii or not, and it's not illegal to have any of those 4 properties. Only a PHY should modify RGMII delays based purely upon a phy-mode, and we're not a PHY. A MAC could adjust its RGMII delays based on rx-internal-delay-ps and tx-internal-delay-ps, independently (to some extent) of what its phy-mode is. The rtl8365mb_ext_config_rgmii() method of the related rtl8365mb does just that.
On Mon, Aug 14, 2023 at 05:46:21PM +0200, Andrew Lunn wrote: > > > > + __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); > > > > > > also, I guess that this should allow all 4 variants of RGMII. > > > > I'm not sure - looking at what's available, the RTL8366 datasheet (not > > RB) says that there's pinstrapping for the RGMII delays. It also suggests > > that there may be a register that can be modified for this, but the driver > > doesn't appear to touch it - in fact, it does nothing with the interface > > mode. Moreover, the only in-kernel DT for this has: > > > > rtl8366rb_cpu_port: port@5 { > > reg = <5>; > > label = "cpu"; > > ethernet = <&gmac0>; > > phy-mode = "rgmii"; > > fixed-link { > > speed = <1000>; > > full-duplex; > > pause; > > }; > > }; > > > > Whether that can be changed in the RB version of the device or not, I > > don't know, so whether it makes sense to allow the other RGMII modes, > > again, I don't know. > > > > Annoyingly, gmac0 doesn't exist in this file, it's defined in > > gemini.dtsi, which this file references through a heirarchy of nodes > > (makes it very much less readable), but it points at: > > > > / { > > ... > > soc { > > ... > > ethernet@60000000 { > > ... > > ethernet-port@0 { > > phy-mode = "rgmii"; > > fixed-link { > > speed = <1000>; > > full-duplex; > > pause; > > }; > > }; > > > > So that also uses "rgmii". > > > > I'm tempted not to allow the others as the driver doesn't make any > > adjustments, and we only apparently have the one user. > > RGMII on both ends is unlikely to work, so probably one is > wrong. Probably the switch has strapping to set it to rgmii-id, but we > don't actually know that. And i guess we have no ability to find out > the truth? "rgmii" on both ends _can_ work - from our own documentation: * PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any internal delay by itself, it assumes that either the Ethernet MAC (if capable) or the PCB traces insert the correct 1.5-2ns delay ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ So, if the board is correctly laid out to include this delay, then RGMII on both ends can work. Next, we have no ability to find anything out - we have no hardware. LinusW does, but I have no idea whether Linus can read the state of the pin strapping. I can see from the RTL8366 info I found that there is a register that the delay settings can be read from, but this is not the RTL8366, it's RTL8366RB, which Linus already pointed out is revision B and is different. So, I would defer to Linus for anything on this, and without input from Linus, all we have to go on is what we have in the kernel sources. > So a narrow definition seems reasonable at the moment, to raise a red > warning flag if somebody does try to use rgmii-id which is not > actually implemented in the driver. And that user then gets to sort > out the problem. I think Vladimir will be having a party, because that's now two of us who've made the mistake of infering that "phy-mode" changes the configuration at the end that has that specified (I can hear the baloons being inflated!) Of course it shouldn't, as per our documentation on RGMII delays in Documentation/networking/phy.rst that was added by Florian back in November 2016. That said, with DSA, we don't currently have a way for the MAC to instruct the DSA switch what delay it should be using as unlike PHYLIB, the MAC doesn't bind to the DSA switch in the same way (there's no dsa_connect() or dsa_attach() call that MACs call which would pass the phy interface mode to the DSA side.) However, a DSA switch CPU node does have an "ethernet" property which points at the CPU-side node, and a DSA switch _could_ read that setting and use it to configure the RGMII delays in the same way that PHYs would do - but no one does that. This brings up the obvious question: does anyone deal with the RGMII delays with DSA switches in software, or is it all done by hardware pin strapping, hardware defaults, and/or a correctly laid out PCB?
> > RGMII on both ends is unlikely to work, so probably one is > > wrong. Probably the switch has strapping to set it to rgmii-id, but we > > don't actually know that. And i guess we have no ability to find out > > the truth? > > "rgmii" on both ends _can_ work - from our own documentation: > > * PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any > internal delay by itself, it assumes that either the Ethernet MAC (if capable) > or the PCB traces insert the correct 1.5-2ns delay > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > So, if the board is correctly laid out to include this delay, then RGMII > on both ends can work. Yes, which is why is said 'unlikely', not 'will not'. I've not come across many boards which do have extra long clock tracks, so it is unlikely this board has them. It is much more likely to be strapped to do rgmii-id. > Next, we have no ability to find anything out - we have no hardware. > LinusW does, but I have no idea whether Linus can read the state of the > pin strapping. I can see from the RTL8366 info I found that there is > a register that the delay settings can be read from, but this is not > the RTL8366, it's RTL8366RB, which Linus already pointed out is > revision B and is different. So, I would defer to Linus for anything on > this, and without input from Linus, all we have to go on is what we > have in the kernel sources. > > > So a narrow definition seems reasonable at the moment, to raise a red > > warning flag if somebody does try to use rgmii-id which is not > > actually implemented in the driver. And that user then gets to sort > > out the problem. > > I think Vladimir will be having a party, because that's now two of us > who've made the mistake of infering that "phy-mode" changes the > configuration at the end that has that specified (I can hear the > baloons being inflated!) > > Of course it shouldn't, as per our documentation on RGMII delays in > Documentation/networking/phy.rst that was added by Florian back in > November 2016. It might not be documented, but there are a couple of DSA drivers which do react on this property and set their RGMII delays based on this for their CPU port. mv88e6xxx is one of them, and it also does so for DSA ports. > This brings up the obvious question: does anyone deal with the RGMII > delays with DSA switches in software, or is it all done by hardware > pin strapping, hardware defaults, and/or a correctly laid out PCB? arch/arm/boot/dts/nxp/vf/vf610-zii-dev-rev-b.dts: switch0port5: port@5 { reg = <5>; label = "dsa"; phy-mode = "rgmii-txid"; link = <&switch1port6 &switch2port9>; fixed-link { speed = <1000>; full-duplex; }; }; and the other end of this link: switch1port6: port@6 { reg = <6>; label = "dsa"; phy-mode = "rgmii-txid"; link = <&switch0port5>; fixed-link { speed = <1000>; full-duplex; }; }; imx7d-zii-rpu2.dts: port@5 { reg = <5>; label = "cpu"; ethernet = <&fec1>; phy-mode = "rgmii-id"; fixed-link { speed = <1000>; full-duplex; }; }; With the 'cpu' case, it is clearly acting like a PHY to the SoCs MAC, so it does not seem too unreasonable for it to act upon it. For a DSA link there is not a clear MAC-PHY relationship, but we do somehow need to specify delays. Andrew
On Mon, Aug 14, 2023 at 07:05:19PM +0200, Andrew Lunn wrote: > > > So a narrow definition seems reasonable at the moment, to raise a red > > > warning flag if somebody does try to use rgmii-id which is not > > > actually implemented in the driver. And that user then gets to sort > > > out the problem. > > > > I think Vladimir will be having a party, because that's now two of us > > who've made the mistake of infering that "phy-mode" changes the > > configuration at the end that has that specified (I can hear the > > baloons being inflated!) > > > > Of course it shouldn't, as per our documentation on RGMII delays in > > Documentation/networking/phy.rst that was added by Florian back in > > November 2016. > > It might not be documented, but there are a couple of DSA drivers > which do react on this property and set their RGMII delays based on > this for their CPU port. mv88e6xxx is one of them, and it also does so > for DSA ports. mv88e6xxx does indeed configure the RGMII delays: mv88e6xxx_port_set_rgmii_delay() does the delay configuration in the MAC_CTL register for each port, using the interface mode passed to it. This is called by mv88e6xxx_port_config_interface() which in turn is called by mv88e6xxx_port_setup_mac() and mv88e6xxx_mac_config(). This will be using the interface mode set in the switch port's configuration, rather than the interface mode that is in the CPU MAC node to which it is connected. This is different. Essentially, when an ethernet driver connects to a PHY: MAC <---------------------------------> PHY v v dt-mac-node { dt-phy-node { phy-mode = "rgmii-foo"; /* contains no phy-mode */ }; }; parses phy mode configures for RGMII mode ignores RGMII delays associated with phy-mode applies any delays configured by rx-internal-delay-ps and tx-internal-delay-ps properties calls phy_attach(..., mode); phylib sets phy_dev->interface PHY driver uses phy_dev->interface to configure RGMII delays So, in this case, the dt-mac-node phy-mode property determines the delays at the PHY. In the DSA case: MAC <---------------------------------> DSA v v dt-mac-node { dt-dsa-node { phy-mode = "rgmii-foo"; phy-mode = "rgmii-bar"; fixed-link { fixed-link { ... ... }; }; }; }; parses phy mode parses phy mode configures for RGMII mode configures for RGMII mode ignores RGMII delays associated configures RGMII delays with phy-mode associated with its own phy-mode applies any delays configured by rx-internal-delay-ps and tx-internal-delay-ps properties sets up fixed link sets up fixed link This is a different behaviour from the phylib setup above... but let me explain why it is, because it now looks very weird. Before phylink, we actually had this model for DSA switches: Host MAC <----------------------------> Fixed-PHY v v dt-mac-node { No DT node phy-mode = "rgmii-foo"; fixed-link { ... }; }; parses phy mode configures for RGMII mode ignores RGMII delays associated with phy-mode applies any delays configured by rx-internal-delay-ps and tx-internal-delay-ps properties calls phy_attach(..., mode); phylib sets phy_dev->interface Generic PHY driver ignores phy_dev->interface Then we have the DSA side: DSA <------------------------------> Fixed-PHY v v dt-dsa-node { No DT node phy-mode = "rgmii-foo"; fixed-link { ... }; }; parses phy mode configures for RGMII mode configures RGMII delays associated with phy-mode calls phy_attach(..., mode); phylib sets phy_dev->interface Generic PHY driver ignores phy_dev->interface ... and it is as if, magically, those two fixed-PHYs talk to each other. So, the model looks very much still like the phylib model above - each MAC has a PHY that it talks to and passes the RGMII interface mode to. The difference is, that PHY is a software emulation of a PHY operating in a fixed mode. The other difference is that the DSA MAC configures its RGMII delays from its *own* phy-mode, which is completely different from what happens with a host MAC which doesn't - because in November 2016, it was decided (and documented) that the PHY interface mode specifies the delays to be used *at* *the* *phy* and not the *mac* side of the link. So now, when one looks at the phylink setup, where those fixed-PHYs have essentially been eliminated, it now looks very weird in comparison - because it leads one to believe that the DSA switch RGMII interface has taken the place of a proper PHY, and that leads up to the conclusion that "but if phylib sets the PHY delays according to the host MAC's phy-mode property, why isn't DSA doing the same!" The answer to that is... established history. The host MAC phy-mode property still has zero effect what so ever on the RGMII delay settings at the host MAC end of the link - and can be *any* of the four RGMII interface types. It really doesn't matter. The DSA MAC phy-mode property, at least in the case of mv88e6xxx, configures the RGMII delays at the DSA switch MAC end of the link. So, to summarise... A host MAC connected to a phylib PHY, the host MAC's phy-mode property defines the RGMII delays at device on other end of the RGMII bus - which is the phylib PHY. A host MAC connected to a DSA switch, the host MAC's phy-mode property is irrelevant as far as RGMII delays are concerned, they have no effect on the device on the end of the RGMII bus. A DSA MAC, the DSA MAC's phy-mode property is used to configure the RGMII delays on the *local* end of the RGMII bus. This is what happens with the mv88e6xxx driver, whether intentional or not. In the case of a DSA to host MAC link, there is no attempt by DSA to delve into the host MAC's DT to retrieve the phy-mode property there. The big problem with RGMII delays has been this in the documentation: "The PHY library offers different types of PHY_INTERFACE_MODE_RGMII* values to let the PHY driver and optionally the MAC driver, implement the required delay. The values of phy_interface_t must be understood from the perspective of the PHY device itself, leading to the following:" Note "and optionally the MAC driver". Well, no, there is nothing optional about this if one wants consistency of implementation - and I'll explain why in a moment, but first lets see what is expected of the PHY itself for each of these RGMII modes: "* PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any internal delay by itself, it assumes that either the Ethernet MAC (if capable) or the PCB traces insert the correct 1.5-2ns delay * PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay for the transmit data lines (TXD[3:0]) processed by the PHY device * PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay for the receive data lines (RXD[3:0]) processed by the PHY device * PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for both transmit AND receive data lines from/to the PHY device" This is quite clear where the delay is inserted - by the *PHY* device. The above pre-dates my involvement in PHYLIB, and comes from a commit by Florian in November 2016, yet I seem to be often attributed with it. Now, going back to that "optionally the MAC driver". Consider if we have, say, a PHY driver that is using host MAC M1 that has decided not to implement the delays (hey, they're optional!) Using PHY_INTERFACE_MODE_RGMII_TXID, to satisfy the above, the PHY is expected to insert the required delay for the transmit data lines. Now lets say that the very same PHY driver uses host MAC M2, but that MAC driver has decided to implement these delays (hey, they're optional!) Using again PHY_INTERFACE_MODE_RGMII_TXID, the MAC driver decided to add delay to the transmit path. The PHY, however, also sees PHY_INTERFACE_MODE_RGMII_TXID and adds its own delay to the transmit data lines as it always has done. Now we have a double delay. So, that "and optionally the MAC driver" is what has historically led to problems with the various RGMII modes, with new implementations popping up and finding that host MAC M2 that's been in use for years with PHY device P1 (that hasn't implemented RGMII delays because the MAC driver did) now doesn't work with PHY device P2 (which does implement RGMII delays) that has also been in use for years. It's because that "optionally" stuff at the MAC end has led people down the path of _sometimes_ implementing RGMII delays at the MAC end of the link, and other times implementing RGMII delays at the PHY end of the link according to the phy-mode specification at the host MAC. It seems to me that since we had this understanding that RGMII delays are applied at the PHY end of the link for RGMII, we have had significantly less "my RGMII doesn't work" stuff. That's not really my doing - that's Florian's, by writing the specification for the what is expected of the PHY device for each of the RGMII phy interface modes back in November 2016. My only part in it was only later ensuring that everyone was singing off the same hymm sheet with what had already been decided - so we didn't get different reviewers telling people different things that were also different from what had been documented. ... and with that consistency, we now appear to have way less issues with RGMII - or at least that is my impression in terms of the emails I see as one of the co-phylib maintainers (thus I get the emails!) At the end of the day, what is important for inter-operability between PHYs and MACs is that *both* implement the RGMII delays in a consistent manner, so if PHYs are to insert delays and MACs not, then all PHY drivers need to insert delays and all MACs must not. We had been heading to a situation where some MACs did, other MACs didn't, some PHY drivers did, some PHY drivers didn't... Anyway, this seems to have turned into a very long email... wasn't supposed to be, but I suspect if I didn't cover everything, there would be a very long email thread instead... well, there probably will be picking this apart and disagreeing with bits of it...
On Mon, Aug 14, 2023 at 6:27 PM Russell King (Oracle) <linux@armlinux.org.uk> wrote: > > > ethernet@60000000 { > > > ... > > > ethernet-port@0 { > > > phy-mode = "rgmii"; > > > fixed-link { > > > speed = <1000>; > > > full-duplex; > > > pause; > > > }; > > > }; > > > > > > So that also uses "rgmii". > > > > > > I'm tempted not to allow the others as the driver doesn't make any > > > adjustments, and we only apparently have the one user. > > > > RGMII on both ends is unlikely to work, so probably one is > > wrong. Probably the switch has strapping to set it to rgmii-id, but we > > don't actually know that. And i guess we have no ability to find out > > the truth? > > "rgmii" on both ends _can_ work - from our own documentation: > > * PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any > internal delay by itself, it assumes that either the Ethernet MAC (if capable) > or the PCB traces insert the correct 1.5-2ns delay > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > So, if the board is correctly laid out to include this delay, then RGMII > on both ends can work. > > Next, we have no ability to find anything out - we have no hardware. > LinusW does, but I have no idea whether Linus can read the state of the > pin strapping. I can see from the RTL8366 info I found that there is > a register that the delay settings can be read from, but this is not > the RTL8366, it's RTL8366RB, which Linus already pointed out is > revision B and is different. So, I would defer to Linus for anything on > this, and without input from Linus, all we have to go on is what we > have in the kernel sources. I looked at the delays a bit, on the Gemini I think it is set up from the pin control system, for example we have this in arch/arm/boot/dts/gemini/gemini-sl93512r.dts: pinctrl-gmii { mux { function = "gmii"; groups = "gmii_gmac0_grp", "gmii_gmac1_grp"; }; /* Control pad skew comes from sl_switch.c in the vendor code */ conf0 { pins = "P10 GMAC1 TXC"; skew-delay = <5>; }; conf1 { pins = "V11 GMAC1 TXEN"; skew-delay = <7>; }; conf2 { pins = "T11 GMAC1 RXC"; skew-delay = <8>; }; conf3 { pins = "U11 GMAC1 RXDV"; skew-delay = <7>; }; conf4 { pins = "V7 GMAC0 TXC"; skew-delay = <10>; }; conf5 { pins = "P8 GMAC0 TXEN"; skew-delay = <7>; /* 5 at another place? */ }; conf6 { pins = "T8 GMAC0 RXC"; skew-delay = <15>; }; conf7 { pins = "R8 GMAC0 RXDV"; skew-delay = <0>; }; conf8 { /* The data lines all have default skew */ pins = "U8 GMAC0 RXD0", "V8 GMAC0 RXD1", "P9 GMAC0 RXD2", "R9 GMAC0 RXD3", "R11 GMAC1 RXD0", "P11 GMAC1 RXD1", "V12 GMAC1 RXD2", "U12 GMAC1 RXD3", "R10 GMAC1 TXD0", "T10 GMAC1 TXD1", "U10 GMAC1 TXD2", "V10 GMAC1 TXD3"; skew-delay = <7>; }; /* Appears in sl351x_gmac.c in the vendor code */ conf9 { pins = "U7 GMAC0 TXD0", "T7 GMAC0 TXD1", "R7 GMAC0 TXD2", "P7 GMAC0 TXD3"; skew-delay = <5>; }; }; }; For the DIR-685 this is set to the default value of 7 for all skews. I haven't found any registers dealing with delays in RTL8366RB. I might need to look closer (vendor mess...) I think the PCB can have been engineered to match this since clearly other PCBs contain elaborate values per line. Here is a picture of the DIR-685 PCB: https://www.redeszone.net/app/uploads-redeszone.net/d-link_dir-685_analisis_15-1024x755.jpg the swirly lines to the right are toward the memory indicating that the engineer is consciously designing delays on this board. Yours, Linus Walleij
> This will be using the interface mode set in the switch port's > configuration, rather than the interface mode that is in the CPU > MAC node to which it is connected. This is different. > > Essentially, when an ethernet driver connects to a PHY: > > MAC <---------------------------------> PHY > v v > dt-mac-node { dt-phy-node { > phy-mode = "rgmii-foo"; /* contains no phy-mode */ > }; }; > > parses phy mode > configures for RGMII mode > ignores RGMII delays associated > with phy-mode > applies any delays configured > by rx-internal-delay-ps and > tx-internal-delay-ps properties > calls phy_attach(..., mode); > phylib sets phy_dev->interface > PHY driver uses phy_dev->interface > to configure RGMII delays > > So, in this case, the dt-mac-node phy-mode property determines the > delays at the PHY. Mostly. There is at least one MAC driver which looks at phy-mode, enables delays in the MAC based on the phy-mode. It then masks phy-mode to remove the delays it has added, and passes phy_attach() the masked value. This was i think done historically because there was a board with a PHY which could not add the delays and the MAC could. And that driver has got extended to other versions of the MAC and has kept to this way of doing it. I always push back against any new instances of this, and i don't think any more have been added while i've been watching for it. > The host MAC phy-mode property still has zero effect what so ever on > the RGMII delay settings at the host MAC end of the link - and can be > *any* of the four RGMII interface types. It really doesn't matter. Except for the one case i outlined above. > So, to summarise... > > A host MAC connected to a phylib PHY, the host MAC's phy-mode property > defines the RGMII delays at device on other end of the RGMII bus - which > is the phylib PHY. Except for the one case i outlined above. Unfortunately. > A host MAC connected to a DSA switch, the host MAC's phy-mode property > is irrelevant as far as RGMII delays are concerned, they have no > effect on the device on the end of the RGMII bus. I'm don't know if said MAC is every connected to a DSA switch.... > The big problem with RGMII delays has been this in the documentation: > > "The PHY library offers different types of PHY_INTERFACE_MODE_RGMII* > values to let the PHY driver and optionally the MAC driver, implement > the required delay. The values of phy_interface_t must be understood > from the perspective of the PHY device itself, leading to the > following:" I suspect this documentation was written to take into the account the one oddball MAC. > Note "and optionally the MAC driver". Well, no, there is nothing > optional about this if one wants consistency of implementation - and > I'll explain why in a moment, but first lets see what is expected of > the PHY itself for each of these RGMII modes: > > "* PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any > internal delay by itself, it assumes that either the Ethernet MAC (if > capable) or the PCB traces insert the correct 1.5-2ns delay > > * PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay > for the transmit data lines (TXD[3:0]) processed by the PHY device > > * PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay > for the receive data lines (RXD[3:0]) processed by the PHY device I guess 50% of PHYs get these two swapped around, simply because they are never tested. > * PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for > both transmit AND receive data lines from/to the PHY device" > > This is quite clear where the delay is inserted - by the *PHY* device. > The above pre-dates my involvement in PHYLIB, and comes from a commit > by Florian in November 2016, yet I seem to be often attributed with it. > > Now, going back to that "optionally the MAC driver". Consider if we > have, say, a PHY driver that is using host MAC M1 that has decided not > to implement the delays (hey, they're optional!) Using > PHY_INTERFACE_MODE_RGMII_TXID, to satisfy the above, the PHY is > expected to insert the required delay for the transmit data lines. > > Now lets say that the very same PHY driver uses host MAC M2, but that > MAC driver has decided to implement these delays (hey, they're > optional!) Using again PHY_INTERFACE_MODE_RGMII_TXID, the MAC driver > decided to add delay to the transmit path. The PHY, however, also > sees PHY_INTERFACE_MODE_RGMII_TXID and adds its own delay to the > transmit data lines as it always has done. Now we have a double delay. The one example of a MAC which does this also masks the value passed to the PHY, so that PHY gets PHY_INTERFACE_MODE_RGMII. And it then all works. What we might want to do is document this masking. > So, that "and optionally the MAC driver" is what has historically led > to problems with the various RGMII modes, with new implementations > popping up and finding that host MAC M2 that's been in use for years > with PHY device P1 (that hasn't implemented RGMII delays because the > MAC driver did) now doesn't work with PHY device P2 (which does > implement RGMII delays) that has also been in use for years. I would actually say most problems have come about because the PHY has not always implemented all four modes correctly. PHYs have silently accepted one of the modes but not changed the hardware to actual do what is requested, and kept with strapping defaults. Sometime later, that mode has been correctly implemented, overwriting the strapping, and breaking stuff. So it is on my checklist to ensure all four modes set the hardware, or return -EOPNOTSUPP. I would agree that reviewing and consistency has got better over time, which is why we see less problems. Andrew
On Sun, Aug 13, 2023 at 11:56 PM Linus Walleij <linus.walleij@linaro.org> wrote: > On Sat, Aug 12, 2023 at 2:16 PM Russell King (Oracle) > <linux@armlinux.org.uk> wrote: > > So for realtek, I propose (completely untested): > > I applied it and it all works fine afterwards on the DIR-685. > Should I test some different configs in the DTS as well? I applied the following two patches on top of your patch, and it works like a charm. >From e23b281afecd019322fd7d3f0e1c2f561842b02a Mon Sep 17 00:00:00 2001 From: Linus Walleij <linus.walleij@linaro.org> Date: Tue, 15 Aug 2023 08:21:18 +0200 Subject: [PATCH 1/2] RFC: net: dsa: realtek: Implement setting up link on CPU port We auto-negotiate most ports in the RTL8366RB driver, but the CPU port is hard-coded to 1Gbit, full duplex, tx and rx pause. Actually respect the arguments passed to the function for the CPU port. After this the link is still set up properly. Signed-off-by: Linus Walleij <linus.walleij@linaro.org> --- This patch requires Russell Kings patch: "net: dsa: realtek: add phylink_get_caps implementation" --- drivers/net/dsa/realtek/rtl8366rb.c | 42 ++++++++++++++++++++++------- 1 file changed, 33 insertions(+), 9 deletions(-) diff --git a/drivers/net/dsa/realtek/rtl8366rb.c b/drivers/net/dsa/realtek/rtl8366rb.c index 76b5c43e1430..385225980e8d 100644 --- a/drivers/net/dsa/realtek/rtl8366rb.c +++ b/drivers/net/dsa/realtek/rtl8366rb.c @@ -95,12 +95,6 @@ #define RTL8366RB_PAACR_RX_PAUSE BIT(6) #define RTL8366RB_PAACR_AN BIT(7) -#define RTL8366RB_PAACR_CPU_PORT (RTL8366RB_PAACR_SPEED_1000M | \ - RTL8366RB_PAACR_FULL_DUPLEX | \ - RTL8366RB_PAACR_LINK_UP | \ - RTL8366RB_PAACR_TX_PAUSE | \ - RTL8366RB_PAACR_RX_PAUSE) - /* bits 0..7 = port 0, bits 8..15 = port 1 */ #define RTL8366RB_PSTAT0 0x0014 /* bits 0..7 = port 2, bits 8..15 = port 3 */ @@ -1081,6 +1075,7 @@ rtl8366rb_mac_link_up(struct dsa_switch *ds, int port, unsigned int mode, int speed, int duplex, bool tx_pause, bool rx_pause) { struct realtek_priv *priv = ds->priv; + unsigned int val; int ret; if (port != priv->cpu_port) @@ -1088,22 +1083,51 @@ rtl8366rb_mac_link_up(struct dsa_switch *ds, int port, unsigned int mode, dev_dbg(priv->dev, "MAC link up on CPU port (%d)\n", port); - /* Force the fixed CPU port into 1Gbit mode, no autonegotiation */ + /* Force the fixed CPU port forced, no autonegotiation */ ret = regmap_update_bits(priv->map, RTL8366RB_MAC_FORCE_CTRL_REG, BIT(port), BIT(port)); if (ret) { - dev_err(priv->dev, "failed to force 1Gbit on CPU port\n"); + dev_err(priv->dev, "failed to force CPU port\n"); return; } + /* Conjure port config */ + switch (speed) { + case SPEED_10: + val = RTL8366RB_PAACR_SPEED_10M; + break; + case SPEED_100: + val = RTL8366RB_PAACR_SPEED_100M; + break; + case SPEED_1000: + val = RTL8366RB_PAACR_SPEED_1000M; + break; + default: + val = RTL8366RB_PAACR_SPEED_1000M; + break; + } + + if (duplex == DUPLEX_FULL) + val |= RTL8366RB_PAACR_FULL_DUPLEX; + + if (tx_pause) + val |= RTL8366RB_PAACR_TX_PAUSE; + + if (rx_pause) + val |= RTL8366RB_PAACR_RX_PAUSE; + + val |= RTL8366RB_PAACR_LINK_UP; + ret = regmap_update_bits(priv->map, RTL8366RB_PAACR2, 0xFF00U, - RTL8366RB_PAACR_CPU_PORT << 8); + val << 8); if (ret) { dev_err(priv->dev, "failed to set PAACR on CPU port\n"); return; } + dev_dbg(priv->dev, "set PAACR to %04x\n", val); + /* Enable the CPU port */ ret = regmap_update_bits(priv->map, RTL8366RB_PECR, BIT(port), 0);
On Mon, Aug 14, 2023 at 11:03:00PM +0100, Russell King (Oracle) wrote: > Then we have the DSA side: > > DSA <------------------------------> Fixed-PHY > v v > dt-dsa-node { No DT node > phy-mode = "rgmii-foo"; > fixed-link { > ... > }; > }; > > parses phy mode > configures for RGMII mode > configures RGMII delays associated > with phy-mode > calls phy_attach(..., mode); > phylib sets phy_dev->interface > Generic PHY driver ignores > phy_dev->interface There is one case that I have missed, and it's totally screwed by this behaviour where a Marvell DSA switch is connected to a Marvell PHY via a RGMII connection: DSA <---------------------------------> PHY v v dt-dsa-node { phy: dt-phy-node { phy-handle = <&phy>; ... phy-mode = "rgmii-foo"; }; }; parses phy mode configures for RGMII mode configures RGMII delays associated with phy-mode calls phy_attach(..., mode); phylib sets phy_dev->interface PHY driver looks at phydev->interface and configures delays In this case, we have *both* ends configuring the RGMII delays and it will not work - because having the DSA MAC end configure the delays breaks the phylib model where the MAC *shouldn't* be configuring the delays. So, should mv88e6xxx_mac_config() also be forcing all RGMII modes in state->interface to be PHY_INTERFACE_MODE_RGMII when passing that into mv88e6xxx_port_config_interface() if, and only if the port is a user port? Or maybe if and only if the port is actually connected to a real PHY?
Hi Russell, On Tue, Aug 15, 2023 at 11:13:07AM +0100, Russell King (Oracle) wrote: > There is one case that I have missed, and it's totally screwed by > this behaviour where a Marvell DSA switch is connected to a Marvell > PHY via a RGMII connection: > > DSA <---------------------------------> PHY > v v > dt-dsa-node { phy: dt-phy-node { > phy-handle = <&phy>; ... > phy-mode = "rgmii-foo"; }; > }; > > parses phy mode > configures for RGMII mode > configures RGMII delays associated > with phy-mode > calls phy_attach(..., mode); > phylib sets phy_dev->interface > PHY driver looks at > phydev->interface and > configures delays > > In this case, we have *both* ends configuring the RGMII delays and it > will not work - because having the DSA MAC end configure the delays > breaks the phylib model where the MAC *shouldn't* be configuring the > delays. > > So, should mv88e6xxx_mac_config() also be forcing all RGMII modes > in state->interface to be PHY_INTERFACE_MODE_RGMII when passing > that into mv88e6xxx_port_config_interface() if, and only if the > port is a user port? Or maybe if and only if the port is actually > connected to a real PHY? I'd tend to believe that you're right, this would be broken (and not just with Marvell RGMII PHYs). I was under the impression that mv88e6xxx_mac_config() will only configure the RGMII delays associated with phy-mode if the port is in fixed-link mode. But looking at the actual condition upon which mv88e6xxx_mac_config() decides to call mv88e6xxx_port_config_interface(), that is: if (mode != MLO_AN_PHY || !mv88e6xxx_phy_is_internal(chip, port)) mv88e6xxx_port_config_interface() and thus, an external PHY would return false for the first term of the OR operation, but true for the second, and it would proceed to configure the MAC-side RGMII delays, resulting in a double delay. However, I am not fully confident in my analysis, since I don't have mv88e6xxx hardware with an RGMII port to confirm. It's interesting that we haven't seen reports?
On Mon, Aug 14, 2023 at 11:03:00PM +0100, Russell King (Oracle) wrote: > So, to summarise... > > A host MAC connected to a phylib PHY, the host MAC's phy-mode property > defines the RGMII delays at device on other end of the RGMII bus - which > is the phylib PHY. > > A host MAC connected to a DSA switch, the host MAC's phy-mode property > is irrelevant as far as RGMII delays are concerned, they have no > effect on the device on the end of the RGMII bus. > > A DSA MAC, the DSA MAC's phy-mode property is used to configure the > RGMII delays on the *local* end of the RGMII bus. > > This is what happens with the mv88e6xxx driver, whether intentional or > not. In the case of a DSA to host MAC link, there is no attempt by DSA > to delve into the host MAC's DT to retrieve the phy-mode property > there. > > > The big problem with RGMII delays has been this in the documentation: > > "The PHY library offers different types of PHY_INTERFACE_MODE_RGMII* > values to let the PHY driver and optionally the MAC driver, implement > the required delay. The values of phy_interface_t must be understood > from the perspective of the PHY device itself, leading to the > following:" > > Note "and optionally the MAC driver". Well, no, there is nothing > optional about this if one wants consistency of implementation - and > I'll explain why in a moment, but first lets see what is expected of > the PHY itself for each of these RGMII modes: > > "* PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any > internal delay by itself, it assumes that either the Ethernet MAC (if > capable) or the PCB traces insert the correct 1.5-2ns delay > > * PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay > for the transmit data lines (TXD[3:0]) processed by the PHY device > > * PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay > for the receive data lines (RXD[3:0]) processed by the PHY device > > * PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for > both transmit AND receive data lines from/to the PHY device" > > This is quite clear where the delay is inserted - by the *PHY* device. > The above pre-dates my involvement in PHYLIB, and comes from a commit > by Florian in November 2016, yet I seem to be often attributed with it. > > Now, going back to that "optionally the MAC driver". Consider if we > have, say, a PHY driver that is using host MAC M1 that has decided not > to implement the delays (hey, they're optional!) Using > PHY_INTERFACE_MODE_RGMII_TXID, to satisfy the above, the PHY is > expected to insert the required delay for the transmit data lines. > > Now lets say that the very same PHY driver uses host MAC M2, but that > MAC driver has decided to implement these delays (hey, they're > optional!) Using again PHY_INTERFACE_MODE_RGMII_TXID, the MAC driver > decided to add delay to the transmit path. The PHY, however, also > sees PHY_INTERFACE_MODE_RGMII_TXID and adds its own delay to the > transmit data lines as it always has done. Now we have a double delay. > > So, that "and optionally the MAC driver" is what has historically led > to problems with the various RGMII modes, with new implementations > popping up and finding that host MAC M2 that's been in use for years > with PHY device P1 (that hasn't implemented RGMII delays because the > MAC driver did) now doesn't work with PHY device P2 (which does > implement RGMII delays) that has also been in use for years. > > It's because that "optionally" stuff at the MAC end has led people > down the path of _sometimes_ implementing RGMII delays at the MAC > end of the link, and other times implementing RGMII delays at the > PHY end of the link according to the phy-mode specification at the > host MAC. > > It seems to me that since we had this understanding that RGMII delays > are applied at the PHY end of the link for RGMII, we have had > significantly less "my RGMII doesn't work" stuff. That's not really > my doing - that's Florian's, by writing the specification for the > what is expected of the PHY device for each of the RGMII phy > interface modes back in November 2016. My only part in it was only > later ensuring that everyone was singing off the same hymm sheet with > what had already been decided - so we didn't get different reviewers > telling people different things that were also different from what > had been documented. > > ... and with that consistency, we now appear to have way less issues > with RGMII - or at least that is my impression in terms of the emails > I see as one of the co-phylib maintainers (thus I get the emails!) > > At the end of the day, what is important for inter-operability between > PHYs and MACs is that *both* implement the RGMII delays in a consistent > manner, so if PHYs are to insert delays and MACs not, then all PHY > drivers need to insert delays and all MACs must not. > > We had been heading to a situation where some MACs did, other MACs > didn't, some PHY drivers did, some PHY drivers didn't... > > Anyway, this seems to have turned into a very long email... wasn't > supposed to be, but I suspect if I didn't cover everything, there > would be a very long email thread instead... well, there probably > will be picking this apart and disagreeing with bits of it... Russell, I agree with your analysis that the MAC side's required interpretation of the phy-mode is underspecified. What has worked for me was the addition of the "rx-internal-delay-ps" and "tx-internal-delay-ps" properties in the MAC OF node. When those are present, I believe that the behavior of the MAC side is fully specified. Today, the sja1105, qca8k and ksz DSA drivers check for the presence of these 2 properties, and if found, they will use them, otherwise they will fall back to the old school of thought - if fixed-link, apply the delays ourselves. In addition, rtl8365mb has only the new behavior. It does not have the old-school logic at all. There exists a direct migration path between one school of thought and the other, which preserves functionality for existing device trees, just prints a warning if rx-internal-delay-ps and tx-internal-delay-ps are missing on fixed-link ports. Do you believe that there is value in converting the mv88e6xxx driver (and gradually, more and more other drivers) to work with the fully-specified DT bindings?
On Mon, Aug 14, 2023 at 07:05:19PM +0200, Andrew Lunn wrote: > arch/arm/boot/dts/nxp/vf/vf610-zii-dev-rev-b.dts: > switch0port5: port@5 { > reg = <5>; > label = "dsa"; > phy-mode = "rgmii-txid"; > link = <&switch1port6 > &switch2port9>; > fixed-link { > speed = <1000>; > full-duplex; > }; > }; > > and the other end of this link: > > switch1port6: port@6 { > reg = <6>; > label = "dsa"; > phy-mode = "rgmii-txid"; > link = <&switch0port5>; > fixed-link { > speed = <1000>; > full-duplex; > }; > }; > > imx7d-zii-rpu2.dts: > port@5 { > reg = <5>; > label = "cpu"; > ethernet = <&fec1>; > phy-mode = "rgmii-id"; > > fixed-link { > speed = <1000>; > full-duplex; > }; > }; > > With the 'cpu' case, it is clearly acting like a PHY to the SoCs MAC, > so it does not seem too unreasonable for it to act upon it. For a DSA > link there is not a clear MAC-PHY relationship, but we do somehow need > to specify delays. Andrew, I'd argue that the MAC-PHY relationship between the DSA master and the CPU port is equally clear as between 2 arbitrary cascade ports. Which is: not clear at all. The RGMII standard does not talk about the existence of a MAC role and a PHY role, to my knowledge. These are 2 MACs back to back, in both cases. With rx-internal-delay-ps and tx-internal-delay-ps in each MAC node, you get the freedom of specifying RGMII delays in whichever way is needed, without baking in any assumption that the port plays the role of a PHY or not.
> Andrew, I'd argue that the MAC-PHY relationship between the DSA master > and the CPU port is equally clear as between 2 arbitrary cascade ports. > Which is: not clear at all. The RGMII standard does not talk about the > existence of a MAC role and a PHY role, to my knowledge. The standard does talk about an optional in band status, placed onto the RXD pins during the inter packet gap. For that to work, there needs to be some notion of MAC and PHY side. > With rx-internal-delay-ps and tx-internal-delay-ps in each MAC node, you > get the freedom of specifying RGMII delays in whichever way is needed, > without baking in any assumption that the port plays the role of a PHY > or not. I agree with you here, but these are modern inventions, as a result of evolution over time, as we see what does and does not work well, and as we as developers go from newbies to seasoned developers getting better at defining consistent APIs. Andrew
On Thu, Aug 17, 2023 at 08:52:12PM +0200, Andrew Lunn wrote: > > Andrew, I'd argue that the MAC-PHY relationship between the DSA master > > and the CPU port is equally clear as between 2 arbitrary cascade ports. > > Which is: not clear at all. The RGMII standard does not talk about the > > existence of a MAC role and a PHY role, to my knowledge. > > The standard does talk about an optional in band status, placed onto > the RXD pins during the inter packet gap. For that to work, there > needs to be some notion of MAC and PHY side. Well, opening the RGMII standard, it was quite stupid of myself to say that. It says that the purpose of RGMII is to interconnect the MAC and the PHY right from the first phrase. You're also completely right in pointing out that the optional in-band status is provided by the PHY on RXD[3:0]. Actually, MAC-to-MAC is not explicitly supported anywhere in the standard (RGMII 2.0, 4/1/2002) that I can find. It simply seems to be a case of: "whatever the PHY is required by the standard to do is specified in such a way that when another MAC is put in its place (with RX and TX signals inverted), the protocol still makes sense". But, with that stretching of the standard considered, I'm still not necessarily seeing which side is the MAC and which side is the PHY in a MAC-to-MAC scenario. With a bit of imagination, I could actually see 2 back-to-back MAC IPs which both have logic to provide the optional in-band status (with hardcoded information) to the link partner's RXD[3:0]. No theory seems to be broken by this (though I can't point to any real implementation). So a MAC role would be the side that expects the in-band status to be present on its RXD[3:0], and a PHY role would be the side that provides it, and being in the MAC role does not preclude being in the PHY role?
On Thu, Aug 17, 2023 at 10:17:54PM +0300, Vladimir Oltean wrote: > On Thu, Aug 17, 2023 at 08:52:12PM +0200, Andrew Lunn wrote: > > > Andrew, I'd argue that the MAC-PHY relationship between the DSA master > > > and the CPU port is equally clear as between 2 arbitrary cascade ports. > > > Which is: not clear at all. The RGMII standard does not talk about the > > > existence of a MAC role and a PHY role, to my knowledge. > > > > The standard does talk about an optional in band status, placed onto > > the RXD pins during the inter packet gap. For that to work, there > > needs to be some notion of MAC and PHY side. > > Well, opening the RGMII standard, it was quite stupid of myself to say > that. It says that the purpose of RGMII is to interconnect the MAC and > the PHY right from the first phrase. > > You're also completely right in pointing out that the optional in-band > status is provided by the PHY on RXD[3:0]. > > Actually, MAC-to-MAC is not explicitly supported anywhere in the standard > (RGMII 2.0, 4/1/2002) that I can find. It simply seems to be a case of: > "whatever the PHY is required by the standard to do is specified in such > a way that when another MAC is put in its place (with RX and TX signals > inverted), the protocol still makes sense". > > But, with that stretching of the standard considered, I'm still not > necessarily seeing which side is the MAC and which side is the PHY in a > MAC-to-MAC scenario. > > With a bit of imagination, I could actually see 2 back-to-back MAC IPs > which both have logic to provide the optional in-band status (with > hardcoded information) to the link partner's RXD[3:0]. No theory seems > to be broken by this (though I can't point to any real implementation). > > So a MAC role would be the side that expects the in-band status to be > present on its RXD[3:0], and a PHY role would be the side that provides > it, and being in the MAC role does not preclude being in the PHY role? ... trying to find an appropriate place in this thread to put this as I would like to post the realtek patch adding the phylink_get_caps method. So, given the discussion so far, we have two patches to consider. One patch from Linus which changes one of the users of the Realtek DSA driver to use "rgmii-id" instead of "rgmii". Do we still think that this a correct change given that we seem to be agreeing that the only thing that matters on the DSA end of this is that it's "rgmii" and the delays for the DSA end should be specified using the [tr]x-internal-delay-ps properties? The second patch is my patch adding a phylink_get_caps method for Realtek drivers - should that allow all "rgmii" interface types, or do we want to just allow "rgmii" to encourage the use of the [tr]x-internal-delay-ps properties? Thanks.
On Fri, Aug 18, 2023 at 12:11:15PM +0100, Russell King (Oracle) wrote: > On Thu, Aug 17, 2023 at 10:17:54PM +0300, Vladimir Oltean wrote: > > On Thu, Aug 17, 2023 at 08:52:12PM +0200, Andrew Lunn wrote: > > > > Andrew, I'd argue that the MAC-PHY relationship between the DSA master > > > > and the CPU port is equally clear as between 2 arbitrary cascade ports. > > > > Which is: not clear at all. The RGMII standard does not talk about the > > > > existence of a MAC role and a PHY role, to my knowledge. > > > > > > The standard does talk about an optional in band status, placed onto > > > the RXD pins during the inter packet gap. For that to work, there > > > needs to be some notion of MAC and PHY side. > > > > Well, opening the RGMII standard, it was quite stupid of myself to say > > that. It says that the purpose of RGMII is to interconnect the MAC and > > the PHY right from the first phrase. > > > > You're also completely right in pointing out that the optional in-band > > status is provided by the PHY on RXD[3:0]. > > > > Actually, MAC-to-MAC is not explicitly supported anywhere in the standard > > (RGMII 2.0, 4/1/2002) that I can find. It simply seems to be a case of: > > "whatever the PHY is required by the standard to do is specified in such > > a way that when another MAC is put in its place (with RX and TX signals > > inverted), the protocol still makes sense". > > > > But, with that stretching of the standard considered, I'm still not > > necessarily seeing which side is the MAC and which side is the PHY in a > > MAC-to-MAC scenario. > > > > With a bit of imagination, I could actually see 2 back-to-back MAC IPs > > which both have logic to provide the optional in-band status (with > > hardcoded information) to the link partner's RXD[3:0]. No theory seems > > to be broken by this (though I can't point to any real implementation). > > > > So a MAC role would be the side that expects the in-band status to be > > present on its RXD[3:0], and a PHY role would be the side that provides > > it, and being in the MAC role does not preclude being in the PHY role? > > ... trying to find an appropriate place in this thread to put this > as I would like to post the realtek patch adding the phylink_get_caps > method. > > So, given the discussion so far, we have two patches to consider. > > One patch from Linus which changes one of the users of the Realtek > DSA driver to use "rgmii-id" instead of "rgmii". Do we still think > that this a correct change given that we seem to be agreeing that > the only thing that matters on the DSA end of this is that it's > "rgmii" and the delays for the DSA end should be specified using ~~~~~~~ I'd say not necessarily specifically "rgmii", but rather "rgmii*" > the [tr]x-internal-delay-ps properties? As long as it is understood that changing "rgmii" to "rgmii-id" is expected to be inconsequential (placebo) for a fixed-link, I don't have an objection (in principle) to that patch. Though, to have more confidence in the validity of the change, I'd need the phy-mode of the &gmac0 node from arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts, and I'm not seeing it. Looking at its driver (drivers/net/ethernet/cortina/gemini.c), I don't see any handling of RGMII delays, and it accepts any RGMII phy-mode. So, if neither the Ethernet controller nor the RTL8366RB switch are adding RGMII delays, it becomes plausible that there are skews added through PCB traces. In that case, the current phy-mode = "rgmii" would be the correct choice, and changing that would be invalid. Some more documentary work would be needed. In any case, I wouldn't rush a change to arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts, and I'm not seeing any dependency between that and your phylink_get_caps conversion for the rtl8366rb. > The second patch is my patch adding a phylink_get_caps method for > Realtek drivers - should that allow all "rgmii" interface types, > or do we want to just allow "rgmii" to encourage the use of the > [tr]x-internal-delay-ps properties? Same opinion as above. As long as it's understood that the RTL8366RB MAC, like any other MAC, shouldn't be acting upon the phy-mode when adding delays, let's just accept all 4 variants, with future support to be added for [rt]x-internal-delay-ps if there turn out to be configurable MAC-side delays present.
On Fri, Aug 18, 2023 at 1:40 PM Vladimir Oltean <olteanv@gmail.com> wrote: > Though, to have more confidence in the validity of the change, I'd need > the phy-mode of the &gmac0 node from arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts, > and I'm not seeing it. The assignment of the gmac0 label is in the top-level DTSI: (...) ethernet: ethernet@60000000 { compatible = "cortina,gemini-ethernet"; (...) gmac0: ethernet-port@0 { compatible = "cortina,gemini-ethernet-port"; (...) Then in the DIR-685 overlay DTS It looks like this: ethernet@60000000 { status = "okay"; ethernet-port@0 { phy-mode = "rgmii"; fixed-link { speed = <1000>; full-duplex; pause; }; }; ethernet-port@1 { /* Not used in this platform */ }; }; Russell pointed out that this style with overlays of nodes are confusing. I agree. If I did it today I would use &gmac0 to configure it. There is a bit of development history here. > Looking at its driver (drivers/net/ethernet/cortina/gemini.c), I don't > see any handling of RGMII delays, and it accepts any RGMII phy-mode. The handling of the delays exist and is done orthogonally, from the pin controller, which can assign skewing to pins on the chip. There are also SoCs that will do clock skewing in the external bus controller, from a driver in drivers/bus such as what drivers/bus/intel-ixp4xx-eb.c is doing for IXP4xx. So there are even several different places where clock skewing/transmit delays can be handled. For DIR-685 is looks like this: pinctrl-gmii { mux { function = "gmii"; groups = "gmii_gmac0_grp"; }; conf0 { pins = "V8 GMAC0 RXDV", "T10 GMAC1 RXDV", "Y7 GMAC0 RXC", "Y11 GMAC1 RXC", "T8 GMAC0 TXEN", "W11 GMAC1 TXEN", "U8 GMAC0 TXC", "V11 GMAC1 TXC", "W8 GMAC0 RXD0", "V9 GMAC0 RXD1", "Y8 GMAC0 RXD2", "U9 GMAC0 RXD3", "T7 GMAC0 TXD0", "U6 GMAC0 TXD1", "V7 GMAC0 TXD2", "U7 GMAC0 TXD3", "Y12 GMAC1 RXD0", "V12 GMAC1 RXD1", "T11 GMAC1 RXD2", "W12 GMAC1 RXD3", "U10 GMAC1 TXD0", "Y10 GMAC1 TXD1", "W10 GMAC1 TXD2", "T9 GMAC1 TXD3"; skew-delay = <7>; }; /* Set up drive strength on GMAC0 to 16 mA */ conf1 { groups = "gmii_gmac0_grp"; drive-strength = <16>; }; }; So skew-delay of 7 steps, each step is ~0.2ns so all pins have a delay of 7*0.2 = 1.4ns. > So, if neither the Ethernet controller nor the RTL8366RB switch are > adding RGMII delays, Actually the SoC is. When the ethernet is probed, it asks for its default pin configuration, and it will applied from the device tree from the above node. It's just that the registers to control them are not in the ethernet hardware block, but in the pin controller. > it becomes plausible that there are skews added > through PCB traces. In the DIR-685 example I think it is *both* (yeah...) because it makes no sense that all delays are 1.4ns. I think the PCB routing influence it *too*. However the D-Link DNS-313 makes more elaborate use of these settings: pinctrl-gmii { mux { function = "gmii"; groups = "gmii_gmac0_grp"; }; /* * In the vendor Linux tree, these values are set for the C3 * version of the SL3512 ASIC with the comment "benson suggest" */ conf0 { pins = "R8 GMAC0 RXDV", "U11 GMAC1 RXDV"; skew-delay = <0>; }; conf1 { pins = "T8 GMAC0 RXC"; skew-delay = <10>; }; conf2 { pins = "T11 GMAC1 RXC"; skew-delay = <15>; }; conf3 { pins = "P8 GMAC0 TXEN", "V11 GMAC1 TXEN"; skew-delay = <7>; }; conf4 { pins = "V7 GMAC0 TXC", "P10 GMAC1 TXC"; skew-delay = <10>; }; conf5 { /* The data lines all have default skew */ pins = "U8 GMAC0 RXD0", "V8 GMAC0 RXD1", "P9 GMAC0 RXD2", "R9 GMAC0 RXD3", "R11 GMAC1 RXD0", "P11 GMAC1 RXD1", "V12 GMAC1 RXD2", "U12 GMAC1 RXD3", "R10 GMAC1 TXD0", "T10 GMAC1 TXD1", "U10 GMAC1 TXD2", "V10 GMAC1 TXD3"; skew-delay = <7>; }; conf6 { pins = "U7 GMAC0 TXD0", "T7 GMAC0 TXD1", "R7 GMAC0 TXD2", "P7 GMAC0 TXD3"; skew-delay = <5>; }; /* Set up drive strength on GMAC0 to 16 mA */ conf7 { groups = "gmii_gmac0_grp"; drive-strength = <16>; }; }; So there are definitely systems doing this. > In that case, the current phy-mode = "rgmii" would > be the correct choice, and changing that would be invalid. Some more > documentary work would be needed. Does the aboe help or did I make it even more confusing :D If guess I could imagine the delays being configured directly from the ethernet driver if that is highly desired. For the IXP4xx ATA controller (that has similar needs to a network phy) I just shortcut the whole "this goes into the external memory controller" and look up the address space for the delay settings and control these settings directly from the driver. It's not very encapsulated but sometimes the world out there is ugly, and the ATA drivers really want to control this see drivers/ata/pata_ixp4xx_cf.c, ixp4xx_set_8bit_timing() etc. To do delay set-up fully from the ethernet controller, for Gemini, I would just pull this register range out from the pin controller, remove the pin control stuff in the device tree, and poke it directly from the ethernet driver. It can be done for sure. It's just two different ways to skin the cat. I suppose it fully confirms the view that the phy-mode in the ethernet controller is 100% placebo though! Just my €0.01 Yours, Linus Walleij
On Fri, Aug 18, 2023 at 02:40:55PM +0300, Vladimir Oltean wrote: > On Fri, Aug 18, 2023 at 12:11:15PM +0100, Russell King (Oracle) wrote: > > On Thu, Aug 17, 2023 at 10:17:54PM +0300, Vladimir Oltean wrote: > > > On Thu, Aug 17, 2023 at 08:52:12PM +0200, Andrew Lunn wrote: > > > > > Andrew, I'd argue that the MAC-PHY relationship between the DSA master > > > > > and the CPU port is equally clear as between 2 arbitrary cascade ports. > > > > > Which is: not clear at all. The RGMII standard does not talk about the > > > > > existence of a MAC role and a PHY role, to my knowledge. > > > > > > > > The standard does talk about an optional in band status, placed onto > > > > the RXD pins during the inter packet gap. For that to work, there > > > > needs to be some notion of MAC and PHY side. > > > > > > Well, opening the RGMII standard, it was quite stupid of myself to say > > > that. It says that the purpose of RGMII is to interconnect the MAC and > > > the PHY right from the first phrase. > > > > > > You're also completely right in pointing out that the optional in-band > > > status is provided by the PHY on RXD[3:0]. > > > > > > Actually, MAC-to-MAC is not explicitly supported anywhere in the standard > > > (RGMII 2.0, 4/1/2002) that I can find. It simply seems to be a case of: > > > "whatever the PHY is required by the standard to do is specified in such > > > a way that when another MAC is put in its place (with RX and TX signals > > > inverted), the protocol still makes sense". > > > > > > But, with that stretching of the standard considered, I'm still not > > > necessarily seeing which side is the MAC and which side is the PHY in a > > > MAC-to-MAC scenario. > > > > > > With a bit of imagination, I could actually see 2 back-to-back MAC IPs > > > which both have logic to provide the optional in-band status (with > > > hardcoded information) to the link partner's RXD[3:0]. No theory seems > > > to be broken by this (though I can't point to any real implementation). > > > > > > So a MAC role would be the side that expects the in-band status to be > > > present on its RXD[3:0], and a PHY role would be the side that provides > > > it, and being in the MAC role does not preclude being in the PHY role? > > > > ... trying to find an appropriate place in this thread to put this > > as I would like to post the realtek patch adding the phylink_get_caps > > method. > > > > So, given the discussion so far, we have two patches to consider. > > > > One patch from Linus which changes one of the users of the Realtek > > DSA driver to use "rgmii-id" instead of "rgmii". Do we still think > > that this a correct change given that we seem to be agreeing that > > the only thing that matters on the DSA end of this is that it's > > "rgmii" and the delays for the DSA end should be specified using > ~~~~~~~ > I'd say not necessarily specifically "rgmii", but rather "rgmii*" > > > the [tr]x-internal-delay-ps properties? For a CPU link though, where there is no "phy", does specifying anything other than "rgmii" make sense? (since there's no "remote" side that would take a blind bit of notice of it.) > As long as it is understood that changing "rgmii" to "rgmii-id" is > expected to be inconsequential (placebo) for a fixed-link, I don't have > an objection (in principle) to that patch. > > Though, to have more confidence in the validity of the change, I'd need > the phy-mode of the &gmac0 node from arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts, > and I'm not seeing it. Gemini DT source is hard to follow, because despite there being the labels, they aren't always used. gmac0 points at /soc/ethernet@6000000/ethernet-port@0 and finding that in arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts gives us: gmac0 specifies a fixed link with: phy-mode = "rgmii"; It would be helpful if the labels were used consistently! > Looking at its driver (drivers/net/ethernet/cortina/gemini.c), I don't > see any handling of RGMII delays, and it accepts any RGMII phy-mode. As discussed previously in this thread with Linus, Gemini apparently has the capability to add the delays in via the pinctrl layer. In this case, in the pinctrl-gmii node, everything has the same skew delay so the Gemini end of the link looks like it has no delays. On the Realtek DSA end, we don't know how it's strapped. RTL8366 (*not* RB) has the ability to pinstrap the required delays, and read the pinstrapping status out of a register. That register address is used for an entirely different purpose by RTL8366RB, so we can't easily find out that way. > So, if neither the Ethernet controller nor the RTL8366RB switch are > adding RGMII delays, it becomes plausible that there are skews added > through PCB traces. In that case, the current phy-mode = "rgmii" would > be the correct choice, and changing that would be invalid. Some more > documentary work would be needed. It could also be that RTL8366RB is pinstrapped to add the delays. If RTL8366RB is pinstrapped for delays on both rx and tx, then that would be equivalent to a DT description of e.g.: phy-mode = "rgmii"; rx-internal-delay-ps = <2000>; tx-internal-delay-ps = <2000>; on the DSA end, and: phy-mode = "rgmii-id"; on the gmac0 end. I believe the DSA end in this case should be "rgmii" because there are no delays being added at the CPU side of the connection, and in _this_ case in gemini-dlink-dir-685.dts, it would be incorrect to change the DSA side from "rgmii". Whether the delays are added by the switch or by trace routing is something we can't answer, thus we can't say whether the CPU end should use "rgmii-id" or "rgmii", nor whether the delay-ps properties should be added on the DSA side to make a complete "modern" description. > In any case, I wouldn't rush a change to arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts, > and I'm not seeing any dependency between that and your phylink_get_caps > conversion for the rtl8366rb. If the DSA side is changed from "rgmii" to "rgmii-id" then only doing: __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); means that when phylink_create() is called with PHY_INTERFACE_MODE_RGMII_ID due to Linus' change, the phylink_validate() in phylink_parse_fixedlink() will continue to fail (as it is today) and if that bug ever gets fixed, then rtl8366rb.c will break. This negates the whole point of adding phylink_get_caps() for realtek. > > The second patch is my patch adding a phylink_get_caps method for > > Realtek drivers - should that allow all "rgmii" interface types, > > or do we want to just allow "rgmii" to encourage the use of the > > [tr]x-internal-delay-ps properties? > > Same opinion as above. As long as it's understood that the RTL8366RB > MAC, like any other MAC, shouldn't be acting upon the phy-mode when > adding delays, let's just accept all 4 variants, with future support to > be added for [rt]x-internal-delay-ps if there turn out to be > configurable MAC-side delays present. Yes, I think you're right, because we could have the situation where the CPU side is adding the delays, and the DSA side is not, which should be described in DT as: phy-mode = "rgmii-id"; on the DSA side, and e.g.: phy-mode = "rgmii"; rx-internal-delay-ps = <2000>; tx-internal-delay-ps = <2000>; on the CPU side. Yes?
On Fri, Aug 18, 2023 at 03:08:52PM +0200, Linus Walleij wrote: > For DIR-685 is looks like this: > > pinctrl-gmii { > mux { > function = "gmii"; > groups = "gmii_gmac0_grp"; > }; > conf0 { > pins = "V8 GMAC0 > RXDV", "T10 GMAC1 RXDV", > "Y7 GMAC0 RXC", > "Y11 GMAC1 RXC", > "T8 GMAC0 TXEN", > "W11 GMAC1 TXEN", > "U8 GMAC0 TXC", > "V11 GMAC1 TXC", > "W8 GMAC0 RXD0", > "V9 GMAC0 RXD1", > "Y8 GMAC0 RXD2", > "U9 GMAC0 RXD3", > "T7 GMAC0 TXD0", > "U6 GMAC0 TXD1", > "V7 GMAC0 TXD2", > "U7 GMAC0 TXD3", > "Y12 GMAC1 RXD0", > "V12 GMAC1 RXD1", > "T11 GMAC1 RXD2", > "W12 GMAC1 RXD3", > "U10 GMAC1 TXD0", > "Y10 GMAC1 TXD1", > "W10 GMAC1 TXD2", > "T9 GMAC1 TXD3"; > skew-delay = <7>; > }; > /* Set up drive strength on > GMAC0 to 16 mA */ > conf1 { > groups = "gmii_gmac0_grp"; > drive-strength = <16>; > }; > }; > > So skew-delay of 7 steps, each step is ~0.2ns so all pins have a delay > of 7*0.2 = 1.4ns. If I read this correctly, then isn't this 1.4ns delay added not only to the RXD and TXD signals, but also RXC and TXC - meaning that although there is a delay, the effect is that (e.g.) the relative delay between TXC and TXD is zero? > In the DIR-685 example I think it is *both* (yeah...) because it > makes no sense that all delays are 1.4ns. I think the PCB > routing influence it *too*. > > However the D-Link DNS-313 makes more elaborate use of > these settings: > > pinctrl-gmii { > mux { > function = "gmii"; > groups = "gmii_gmac0_grp"; > }; > /* > * In the vendor Linux tree, > these values are set for the C3 > * version of the SL3512 ASIC > with the comment "benson suggest" > */ > conf0 { > pins = "R8 GMAC0 > RXDV", "U11 GMAC1 RXDV"; > skew-delay = <0>; > }; > conf1 { > pins = "T8 GMAC0 RXC"; > skew-delay = <10>; > }; > conf2 { > pins = "T11 GMAC1 RXC"; > skew-delay = <15>; > }; > conf3 { > pins = "P8 GMAC0 > TXEN", "V11 GMAC1 TXEN"; > skew-delay = <7>; > }; > conf4 { > pins = "V7 GMAC0 TXC", > "P10 GMAC1 TXC"; > skew-delay = <10>; > }; > conf5 { > /* The data lines all > have default skew */ > pins = "U8 GMAC0 > RXD0", "V8 GMAC0 RXD1", > "P9 GMAC0 > RXD2", "R9 GMAC0 RXD3", > "R11 GMAC1 > RXD0", "P11 GMAC1 RXD1", > "V12 GMAC1 > RXD2", "U12 GMAC1 RXD3", > "R10 GMAC1 > TXD0", "T10 GMAC1 TXD1", > "U10 GMAC1 > TXD2", "V10 GMAC1 TXD3"; > skew-delay = <7>; > }; > conf6 { > pins = "U7 GMAC0 > TXD0", "T7 GMAC0 TXD1", > "R7 GMAC0 > TXD2", "P7 GMAC0 TXD3"; > skew-delay = <5>; > }; > /* Set up drive strength on > GMAC0 to 16 mA */ > conf7 { > groups = "gmii_gmac0_grp"; > drive-strength = <16>; > }; > }; > > So there are definitely systems doing this. So here, the skews are: - GMAC0 RXD skewed by 7 = 1.4ns, and GMAC0 RXC by 10, making 2ns. Relative skew = 0.6ns. - GMAC0 TXD by 5 = 1.0ns, and GMAC0 TXD by 10, making 2ns. Relative skew = 1ns. I think this suggests there's additional skew by the PCB traces to make up to the required 1.5 to 2ns skew required by RGMII. > > In that case, the current phy-mode = "rgmii" would > > be the correct choice, and changing that would be invalid. Some more > > documentary work would be needed. > > Does the aboe help or did I make it even more confusing :D As far as I'm concerned, I think I have an overall picture of what is going on here - it's whether anyone else agrees with that! Going back to the 685, the skews for the datalines and clocks are: gmac0 rtl8366rb pinctrl ----- pcb traces ----- pinstrapped RXC 1.4ns unknown unknown RXD* 1.4ns unknown unknown TXC 1.4ns unknown unknown TXD* 1.4ns unknown unknown In the case of 313: gmac0 PHY pinctrl ----- pcb traces ----- phy0 RXC 2.0ns unknown 0ns (PHY 0ns due to using RXD* 1.4ns unknown 0ns phy-mode = "rgmii" on) TXC 2.0ns unknown 0ns gmac0's node.) TXD* 1.0ns unknown 0ns
On Fri, Aug 18, 2023 at 03:08:52PM +0200, Linus Walleij wrote: > On Fri, Aug 18, 2023 at 1:40 PM Vladimir Oltean <olteanv@gmail.com> wrote: > > > Though, to have more confidence in the validity of the change, I'd need > > the phy-mode of the &gmac0 node from arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts, > > and I'm not seeing it. > > The assignment of the gmac0 label is in the top-level DTSI: > > (...) > ethernet: ethernet@60000000 { > compatible = "cortina,gemini-ethernet"; > (...) > gmac0: ethernet-port@0 { > compatible = "cortina,gemini-ethernet-port"; > (...) > > Then in the DIR-685 overlay DTS It looks like this: > > ethernet@60000000 { > status = "okay"; > > ethernet-port@0 { > phy-mode = "rgmii"; > fixed-link { > speed = <1000>; > full-duplex; > pause; > }; > }; > ethernet-port@1 { > /* Not used in this platform */ > }; > }; > > Russell pointed out that this style with overlays of nodes are > confusing. I agree. If I did it today I would use &gmac0 to > configure it. There is a bit of development history here. Yikes. I totally missed that. > > Looking at its driver (drivers/net/ethernet/cortina/gemini.c), I don't > > see any handling of RGMII delays, and it accepts any RGMII phy-mode. > > The handling of the delays exist and is done orthogonally, from > the pin controller, which can assign skewing to pins on the chip. > > There are also SoCs that will do clock skewing in the external > bus controller, from a driver in drivers/bus such as what > drivers/bus/intel-ixp4xx-eb.c is doing for IXP4xx. So there > are even several different places where clock skewing/transmit > delays can be handled. > > For DIR-685 is looks like this: > > pinctrl-gmii { > mux { > function = "gmii"; > groups = "gmii_gmac0_grp"; > }; > conf0 { > pins = "V8 GMAC0 RXDV", "T10 GMAC1 RXDV", > "Y7 GMAC0 RXC", "Y11 GMAC1 RXC", > "T8 GMAC0 TXEN", "W11 GMAC1 TXEN", > "U8 GMAC0 TXC", "V11 GMAC1 TXC", > "W8 GMAC0 RXD0", "V9 GMAC0 RXD1", > "Y8 GMAC0 RXD2", "U9 GMAC0 RXD3", > "T7 GMAC0 TXD0", "U6 GMAC0 TXD1", > "V7 GMAC0 TXD2", "U7 GMAC0 TXD3", > "Y12 GMAC1 RXD0", "V12 GMAC1 RXD1", > "T11 GMAC1 RXD2", "W12 GMAC1 RXD3", > "U10 GMAC1 TXD0", "Y10 GMAC1 TXD1", > "W10 GMAC1 TXD2", "T9 GMAC1 TXD3"; > skew-delay = <7>; > }; > /* Set up drive strength on GMAC0 to 16 mA */ > conf1 { > groups = "gmii_gmac0_grp"; > drive-strength = <16>; > }; > }; > > So skew-delay of 7 steps, each step is ~0.2ns so all pins have a delay > of 7*0.2 = 1.4ns. Ohhhh, I saw the pinctrl-gmii explanation before, but I didn't understand it. I get it now. Sorry. > > > So, if neither the Ethernet controller nor the RTL8366RB switch are > > adding RGMII delays, > > Actually the SoC is. When the ethernet is probed, it asks for its > default pin configuration, and it will applied from the device tree > from the above node. It's just that the registers to control them > are not in the ethernet hardware block, but in the pin controller. > > it becomes plausible that there are skews added > > through PCB traces. > > In the DIR-685 example I think it is *both* (yeah...) because it > makes no sense that all delays are 1.4ns. I think the PCB > routing influence it *too*. I think that as long as the delays aren't added by the other end ("PHY"), then phy-mode = "rgmii" would be the adequate mode to describe this (even if it only has an informative purpose). The delays, even if added by the local system, are added externally to the MAC block (given its current bindings). Similar in that regard to PCB delays. Which makes the device tree basically correct/consistent in that particular aspect. > However the D-Link DNS-313 makes more elaborate use of > these settings: > > pinctrl-gmii { > mux { > function = "gmii"; > groups = "gmii_gmac0_grp"; > }; > /* > * In the vendor Linux tree, these values are set for the C3 > * version of the SL3512 ASIC with the comment "benson suggest" > */ > conf0 { > pins = "R8 GMAC0 RXDV", "U11 GMAC1 RXDV"; > skew-delay = <0>; > }; > conf1 { > pins = "T8 GMAC0 RXC"; > skew-delay = <10>; > }; > conf2 { > pins = "T11 GMAC1 RXC"; > skew-delay = <15>; > }; > conf3 { > pins = "P8 GMAC0 TXEN", "V11 GMAC1 TXEN"; > skew-delay = <7>; > }; > conf4 { > pins = "V7 GMAC0 TXC", "P10 GMAC1 TXC"; > skew-delay = <10>; > }; > conf5 { > /* The data lines all have default skew */ > pins = "U8 GMAC0 RXD0", "V8 GMAC0 RXD1", > "P9 GMAC0 RXD2", "R9 GMAC0 RXD3", > "R11 GMAC1 RXD0", "P11 GMAC1 RXD1", > "V12 GMAC1 RXD2", "U12 GMAC1 RXD3", > "R10 GMAC1 TXD0", "T10 GMAC1 TXD1", > "U10 GMAC1 TXD2", "V10 GMAC1 TXD3"; > skew-delay = <7>; > }; > conf6 { > pins = "U7 GMAC0 TXD0", "T7 GMAC0 TXD1", > "R7 GMAC0 TXD2", "P7 GMAC0 TXD3"; > skew-delay = <5>; > }; > /* Set up drive strength on GMAC0 to 16 mA */ > conf7 { > groups = "gmii_gmac0_grp"; > drive-strength = <16>; > }; > }; > > So there are definitely systems doing this. > > > In that case, the current phy-mode = "rgmii" would > > be the correct choice, and changing that would be invalid. Some more > > documentary work would be needed. > > Does the above help or did I make it even more confusing :D It helped a lot, thank you for clarifying. > If guess I could imagine the delays being configured directly > from the ethernet driver if that is highly desired. For the IXP4xx > ATA controller (that has similar needs to a network phy) I just > shortcut the whole "this goes into the external memory > controller" and look up the address space for the delay > settings and control these settings directly from the driver. > It's not very encapsulated but sometimes the world out there > is ugly, and the ATA drivers really want to control this > see drivers/ata/pata_ixp4xx_cf.c, > ixp4xx_set_8bit_timing() etc. > > To do delay set-up fully from the ethernet controller, for Gemini, > I would just pull this register range out from the pin controller, > remove the pin control stuff in the device tree, and poke it > directly from the ethernet driver. It can be done for sure. > It's just two different ways to skin the cat. I'm not sure that making changes there would become necessary. > I suppose it fully confirms the view that the phy-mode in > the ethernet controller is 100% placebo though! Which is what it should be, IMO.
On Fri, Aug 18, 2023 at 02:10:21PM +0100, Russell King (Oracle) wrote: > > > One patch from Linus which changes one of the users of the Realtek > > > DSA driver to use "rgmii-id" instead of "rgmii". Do we still think > > > that this a correct change given that we seem to be agreeing that > > > the only thing that matters on the DSA end of this is that it's > > > "rgmii" and the delays for the DSA end should be specified using > > ~~~~~~~ > > I'd say not necessarily specifically "rgmii", but rather "rgmii*" > > > > > the [tr]x-internal-delay-ps properties? > > For a CPU link though, where there is no "phy", does specifying anything > other than "rgmii" make sense? (since there's no "remote" side that > would take a blind bit of notice of it.) I'm not completely sure that I understand the question. I think you've answered this yourself, at the end. > > As long as it is understood that changing "rgmii" to "rgmii-id" is > > expected to be inconsequential (placebo) for a fixed-link, I don't have > > an objection (in principle) to that patch. > > > > Though, to have more confidence in the validity of the change, I'd need > > the phy-mode of the &gmac0 node from arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts, > > and I'm not seeing it. > > Gemini DT source is hard to follow, because despite there being the > labels, they aren't always used. gmac0 points at > /soc/ethernet@6000000/ethernet-port@0 and finding that in > arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts gives us: > > gmac0 specifies a fixed link with: > phy-mode = "rgmii"; > > It would be helpful if the labels were used consistently! +1 > > Looking at its driver (drivers/net/ethernet/cortina/gemini.c), I don't > > see any handling of RGMII delays, and it accepts any RGMII phy-mode. > > As discussed previously in this thread with Linus, Gemini apparently > has the capability to add the delays in via the pinctrl layer. In > this case, in the pinctrl-gmii node, everything has the same skew delay > so the Gemini end of the link looks like it has no delays. So it would appear. The skewing capability is there, but the skews for RXC/TXC and RXD/TXD cancel out in that particular configuration. In that case, it's pretty confusing why they're there at all, to be honest. > On the Realtek DSA end, we don't know how it's strapped. RTL8366 (*not* > RB) has the ability to pinstrap the required delays, and read the > pinstrapping status out of a register. That register address is used > for an entirely different purpose by RTL8366RB, so we can't easily > find out that way. I see. > > So, if neither the Ethernet controller nor the RTL8366RB switch are > > adding RGMII delays, it becomes plausible that there are skews added > > through PCB traces. In that case, the current phy-mode = "rgmii" would > > be the correct choice, and changing that would be invalid. Some more > > documentary work would be needed. > > It could also be that RTL8366RB is pinstrapped to add the delays. If > RTL8366RB is pinstrapped for delays on both rx and tx, then that would > be equivalent to a DT description of e.g.: > > phy-mode = "rgmii"; > rx-internal-delay-ps = <2000>; > tx-internal-delay-ps = <2000>; > > on the DSA end, and: > > phy-mode = "rgmii-id"; > > on the gmac0 end. > > I believe the DSA end in this case should be "rgmii" because there are > no delays being added at the CPU side of the connection, and in _this_ > case in gemini-dlink-dir-685.dts, it would be incorrect to change the > DSA side from "rgmii". > > Whether the delays are added by the switch or by trace routing is > something we can't answer, thus we can't say whether the CPU end should > use "rgmii-id" or "rgmii", nor whether the delay-ps properties should > be added on the DSA side to make a complete "modern" description. Well, if we can't distinguish between the PCB traces case and the RTL8366RB pin strapping case, I think it would be best to keep the strategic ambiguity in the device tree alone. > > In any case, I wouldn't rush a change to arch/arm/boot/dts/gemini/gemini-dlink-dir-685.dts, > > and I'm not seeing any dependency between that and your phylink_get_caps > > conversion for the rtl8366rb. > > If the DSA side is changed from "rgmii" to "rgmii-id" then only doing: > > __set_bit(PHY_INTERFACE_MODE_RGMII, interfaces); > > means that when phylink_create() is called with > PHY_INTERFACE_MODE_RGMII_ID due to Linus' change, the phylink_validate() > in phylink_parse_fixedlink() will continue to fail (as it is today) and > if that bug ever gets fixed, then rtl8366rb.c will break. > > This negates the whole point of adding phylink_get_caps() for realtek. Speech can be so imprecise :) I can see how "I'm not seeing any dependency" can be interpreted as you've done, i.e. "there is no dependency in either direction between the 2 patches", but that isn't what I wanted to transmit. I just wanted to say that you don't need Linus' patch to make progress with the conversion, that's all. But of course Linus' device tree patch would only work if your kernel-side patch puts all 4 RGMII modes in supported_interfaces. But maybe that should be done anyway, with or without Linus' device tree patch. > > > The second patch is my patch adding a phylink_get_caps method for > > > Realtek drivers - should that allow all "rgmii" interface types, > > > or do we want to just allow "rgmii" to encourage the use of the > > > [tr]x-internal-delay-ps properties? > > > > Same opinion as above. As long as it's understood that the RTL8366RB > > MAC, like any other MAC, shouldn't be acting upon the phy-mode when > > adding delays, let's just accept all 4 variants, with future support to > > be added for [rt]x-internal-delay-ps if there turn out to be > > configurable MAC-side delays present. > > Yes, I think you're right, because we could have the situation where > the CPU side is adding the delays, and the DSA side is not, which > should be described in DT as: > > phy-mode = "rgmii-id"; > > on the DSA side, and e.g.: > > phy-mode = "rgmii"; > rx-internal-delay-ps = <2000>; > tx-internal-delay-ps = <2000>; > > on the CPU side. Yes? Yes, this is the situation I was thinking of, where the DSA CPU port would have "rgmii-id" to denote that the remote side has added the delays. At least, that would be the intuitive way for me to describe things according to our definitions from the documentation. (open question: if those remote delays are added through pinctrl-gmii rather than through the MAC OF node, would that count towards DSA's phy-mode, to make it rgmii-id, or not?) Though I agree that I can't see the exact phy-mode breaking anything with a fixed link, given this interpretation, even if it's "wrong".
On Fri, Aug 18, 2023 at 3:29 PM Russell King (Oracle) <linux@armlinux.org.uk> wrote: > > So skew-delay of 7 steps, each step is ~0.2ns so all pins have a delay > > of 7*0.2 = 1.4ns. > > If I read this correctly, then isn't this 1.4ns delay added not only > to the RXD and TXD signals, but also RXC and TXC - meaning that although > there is a delay, the effect is that (e.g.) the relative delay between > TXC and TXD is zero? Yes seems like so. (second instance) > So here, the skews are: > - GMAC0 RXD skewed by 7 = 1.4ns, and GMAC0 RXC by 10, making 2ns. > Relative skew = 0.6ns. > > - GMAC0 TXD by 5 = 1.0ns, and GMAC0 TXD by 10, making 2ns. > Relative skew = 1ns. > > I think this suggests there's additional skew by the PCB traces to make > up to the required 1.5 to 2ns skew required by RGMII. I believe you're right. Also when I looked at that PCB it had a few swirly traces on it. https://openwrt.org/_media/media/dlink/dns-313/dns-313-front-large.jpg Again they seem to be toward the RAM mostly, but I don't know where these vias go, and clearly the PCB designer is doing some active work on it. > As far as I'm concerned, I think I have an overall picture of what is > going on here - it's whether anyone else agrees with that! > > Going back to the 685, the skews for the datalines and clocks are: > > gmac0 rtl8366rb > pinctrl ----- pcb traces ----- pinstrapped > RXC 1.4ns unknown unknown > RXD* 1.4ns unknown unknown > TXC 1.4ns unknown unknown > TXD* 1.4ns unknown unknown > > In the case of 313: > > gmac0 PHY > pinctrl ----- pcb traces ----- phy0 > RXC 2.0ns unknown 0ns (PHY 0ns due to using > RXD* 1.4ns unknown 0ns phy-mode = "rgmii" on) > TXC 2.0ns unknown 0ns gmac0's node.) > TXD* 1.0ns unknown 0ns Yep that's how it looks. Yours, Linus Walleij
On Fri, Aug 18, 2023 at 05:21:05PM +0300, Vladimir Oltean wrote: > On Fri, Aug 18, 2023 at 02:10:21PM +0100, Russell King (Oracle) wrote: > > > > The second patch is my patch adding a phylink_get_caps method for > > > > Realtek drivers - should that allow all "rgmii" interface types, > > > > or do we want to just allow "rgmii" to encourage the use of the > > > > [tr]x-internal-delay-ps properties? > > > > > > Same opinion as above. As long as it's understood that the RTL8366RB > > > MAC, like any other MAC, shouldn't be acting upon the phy-mode when > > > adding delays, let's just accept all 4 variants, with future support to > > > be added for [rt]x-internal-delay-ps if there turn out to be > > > configurable MAC-side delays present. > > > > Yes, I think you're right, because we could have the situation where > > the CPU side is adding the delays, and the DSA side is not, which > > should be described in DT as: > > > > phy-mode = "rgmii-id"; > > > > on the DSA side, and e.g.: > > > > phy-mode = "rgmii"; > > rx-internal-delay-ps = <2000>; > > tx-internal-delay-ps = <2000>; > > > > on the CPU side. Yes? > > Yes, this is the situation I was thinking of, where the DSA CPU port > would have "rgmii-id" to denote that the remote side has added the > delays. > > At least, that would be the intuitive way for me to describe things > according to our definitions from the documentation. > > (open question: if those remote delays are added through pinctrl-gmii > rather than through the MAC OF node, would that count towards DSA's > phy-mode, to make it rgmii-id, or not?) > > Though I agree that I can't see the exact phy-mode breaking anything > with a fixed link, given this interpretation, even if it's "wrong". I haven't fully digested this, but would it make sense to say: "for fixed links, only phy-mode 'rgmii' should be used, since the remote side is not known, and thus, it is also not known whether it has set up clock skews in any direction"? One possible advantage would be that it would make people think a bit more whether they should add code that applies MAC-side delays in fixed-link mode based on the phy-mode. If we had that extra recommendation documented somewhere centrally, doing that wouldn't make a lot of sense.
diff --git a/net/dsa/port.c b/net/dsa/port.c index 24015e11255f..37ab238e8304 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -1690,10 +1690,14 @@ int dsa_port_phylink_create(struct dsa_port *dp) ds->ops->phylink_get_caps(ds, dp->index, &dp->pl_config); } else { /* For legacy drivers */ - __set_bit(PHY_INTERFACE_MODE_INTERNAL, - dp->pl_config.supported_interfaces); - __set_bit(PHY_INTERFACE_MODE_GMII, - dp->pl_config.supported_interfaces); + if (mode != PHY_INTERFACE_MODE_NA) { + __set_bit(mode, dp->pl_config.supported_interfaces); + } else { + __set_bit(PHY_INTERFACE_MODE_INTERNAL, + dp->pl_config.supported_interfaces); + __set_bit(PHY_INTERFACE_MODE_GMII, + dp->pl_config.supported_interfaces); + } } pl = phylink_create(&dp->pl_config, of_fwnode_handle(dp->dn),
If we successfully parsed an interface mode with a legacy switch driver, populate that mode into phylink's supported interfaces rather than defaulting to the internal and gmii interfaces. This hasn't caused an issue so far, because when the interface doesn't match a supported one, phylink_validate() doesn't clear the supported mask, but instead returns -EINVAL. phylink_parse_fixedlink() doesn't check this return value, and merely relies on the supported ethtool link modes mask being cleared. Therefore, the fixed link settings end up being allowed despite validation failing. Before this causes a problem, arrange for DSA to more accurately populate phylink's supported interfaces mask so validation can correctly succeed. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> --- net/dsa/port.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)