Message ID | 20220719235002.1944800-8-sean.anderson@seco.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | net: phy: Add support for rate adaptation | expand |
On Tue, Jul 19, 2022 at 07:49:57PM -0400, Sean Anderson wrote: > If the phy is configured to use pause-based rate adaptation, ensure that > the link is full duplex with pause frame reception enabled. As > suggested, if pause-based rate adaptation is enabled by the phy, then > pause reception is unconditionally enabled. > > The interface duplex is determined based on the rate adaptation type. > When rate adaptation is enabled, so is the speed. We assume the maximum > interface speed is used. This is only relevant for MLO_AN_PHY. For > MLO_AN_INBAND, the MAC/PCS's view of the interface speed will be used. > > Signed-off-by: Sean Anderson <sean.anderson@seco.com> > --- > > Changes in v2: > - Use the phy's rate adaptation setting to determine whether to use its > link speed/duplex or the MAC's speed/duplex with MLO_AN_INBAND. > - Always use the rate adaptation setting to determine the interface > speed/duplex (instead of sometimes using the interface mode). > > drivers/net/phy/phylink.c | 126 ++++++++++++++++++++++++++++++++++---- > include/linux/phylink.h | 1 + > 2 files changed, 114 insertions(+), 13 deletions(-) > > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c > index da0623d94a64..619ef553476f 100644 > --- a/drivers/net/phy/phylink.c > +++ b/drivers/net/phy/phylink.c > @@ -160,16 +160,93 @@ static const char *phylink_an_mode_str(unsigned int mode) > * @state: A link state > * > * Update the .speed and .duplex members of @state. We can determine them based > - * on the .link_speed and .link_duplex. This function should be called whenever > - * .link_speed and .link_duplex are updated. For example, userspace deals with > - * link speed and duplex, and not the interface speed and duplex. Similarly, > - * phys deal with link speed and duplex and only implicitly the interface speed > - * and duplex. > + * on the .link_speed, .link_duplex, .interface, and .rate_adaptation. This > + * function should be called whenever .link_speed and .link_duplex are updated. > + * For example, userspace deals with link speed and duplex, and not the > + * interface speed and duplex. Similarly, phys deal with link speed and duplex > + * and only implicitly the interface speed and duplex. > */ > static void phylink_state_fill_speed_duplex(struct phylink_link_state *state) > { > - state->speed = state->link_speed; > - state->duplex = state->link_duplex; > + switch (state->rate_adaptation) { > + case RATE_ADAPT_NONE: > + state->speed = state->link_speed; > + state->duplex = state->link_duplex; > + return; > + case RATE_ADAPT_PAUSE: > + state->duplex = DUPLEX_FULL; > + break; > + case RATE_ADAPT_CRS: > + state->duplex = DUPLEX_HALF; > + break; > + case RATE_ADAPT_OPEN_LOOP: > + state->duplex = state->link_duplex; > + break; > + } > + > + /* Use the max speed of the interface */ > + switch (state->interface) { > + case PHY_INTERFACE_MODE_100BASEX: > + case PHY_INTERFACE_MODE_REVRMII: > + case PHY_INTERFACE_MODE_RMII: > + case PHY_INTERFACE_MODE_SMII: > + case PHY_INTERFACE_MODE_REVMII: > + case PHY_INTERFACE_MODE_MII: > + state->speed = SPEED_100; > + return; > + > + case PHY_INTERFACE_MODE_TBI: > + case PHY_INTERFACE_MODE_MOCA: > + case PHY_INTERFACE_MODE_RTBI: > + case PHY_INTERFACE_MODE_1000BASEX: > + case PHY_INTERFACE_MODE_1000BASEKX: > + case PHY_INTERFACE_MODE_TRGMII: > + case PHY_INTERFACE_MODE_RGMII_TXID: > + case PHY_INTERFACE_MODE_RGMII_RXID: > + case PHY_INTERFACE_MODE_RGMII_ID: > + case PHY_INTERFACE_MODE_RGMII: > + case PHY_INTERFACE_MODE_QSGMII: > + case PHY_INTERFACE_MODE_SGMII: > + case PHY_INTERFACE_MODE_GMII: > + state->speed = SPEED_1000; > + return; > + > + case PHY_INTERFACE_MODE_2500BASEX: > + state->speed = SPEED_2500; > + return; > + > + case PHY_INTERFACE_MODE_5GBASER: > + state->speed = SPEED_5000; > + return; > + > + case PHY_INTERFACE_MODE_XGMII: > + case PHY_INTERFACE_MODE_RXAUI: > + case PHY_INTERFACE_MODE_XAUI: > + case PHY_INTERFACE_MODE_10GBASER: > + case PHY_INTERFACE_MODE_10GKR: > + case PHY_INTERFACE_MODE_USXGMII: > + state->speed = SPEED_10000; > + return; > + > + case PHY_INTERFACE_MODE_25GBASER: > + state->speed = SPEED_25000; > + return; > + > + case PHY_INTERFACE_MODE_XLGMII: > + state->speed = SPEED_40000; > + return; > + > + case PHY_INTERFACE_MODE_INTERNAL: > + state->speed = state->link_speed; > + return; > + > + case PHY_INTERFACE_MODE_NA: > + case PHY_INTERFACE_MODE_MAX: > + state->speed = SPEED_UNKNOWN; > + return; > + } > + > + WARN_ON(1); > } > > /** > @@ -803,11 +880,12 @@ static void phylink_mac_config(struct phylink *pl, > const struct phylink_link_state *state) > { > phylink_dbg(pl, > - "%s: mode=%s/%s/%s/%s adv=%*pb pause=%02x link=%u an=%u\n", > + "%s: mode=%s/%s/%s/%s/%s adv=%*pb pause=%02x link=%u an=%u\n", > __func__, phylink_an_mode_str(pl->cur_link_an_mode), > phy_modes(state->interface), > phy_speed_to_str(state->speed), > phy_duplex_to_str(state->duplex), > + phy_rate_adaptation_to_str(state->rate_adaptation), > __ETHTOOL_LINK_MODE_MASK_NBITS, state->advertising, > state->pause, state->link, state->an_enabled); > > @@ -944,6 +1022,7 @@ static void phylink_mac_pcs_get_state(struct phylink *pl, > linkmode_zero(state->lp_advertising); > state->interface = pl->link_config.interface; > state->an_enabled = pl->link_config.an_enabled; > + state->rate_adaptation = pl->link_config.rate_adaptation; > if (state->an_enabled) { > state->link_speed = SPEED_UNKNOWN; > state->link_duplex = DUPLEX_UNKNOWN; > @@ -968,8 +1047,10 @@ static void phylink_mac_pcs_get_state(struct phylink *pl, > else > state->link = 0; > > - state->link_speed = state->speed; > - state->link_duplex = state->duplex; > + if (state->rate_adaptation == RATE_ADAPT_NONE) { > + state->link_speed = state->speed; > + state->link_duplex = state->duplex; > + } So we need to have every PCS driver be udpated to fill in link_speed and link_duplex if rate_adaption != none. There's got to be a better way - maybe what I suggested in the last round of only doing the rate adaption thing in the link_up() functions, since that seems to be the only real difference. I'm not even sure we need to do that - in the "open loop" case, we need to be passing the media speed to the MAC driver with the knowledge that it should be increasing the IPG. So, I'm thinking we don't want any of these changes, what we instead should be doing is passing the media speed/duplex and the interface speed/duplex to the PCS and MAC. We can do that by storing the PHY rate adaption state, and processing that in phylink_link_up().
Hi Sean,
I love your patch! Yet something to improve:
[auto build test ERROR on net-next/master]
[also build test ERROR on net/master horms-ipvs/master linus/master v5.19-rc7 next-20220719]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Sean-Anderson/net-phy-Add-support-for-rate-adaptation/20220720-075438
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 1f17708b47a99ca5bcad594a6f8d14cb016edfd2
config: x86_64-randconfig-a005 (https://download.01.org/0day-ci/archive/20220720/202207201640.6pB23GJN-lkp@intel.com/config)
compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project dd5635541cd7bbd62cd59b6694dfb759b6e9a0d8)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/a17fd2b01914c1c5779a76167def6910a6dd1185
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Sean-Anderson/net-phy-Add-support-for-rate-adaptation/20220720-075438
git checkout a17fd2b01914c1c5779a76167def6910a6dd1185
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "phy_rate_adaptation_to_str" [drivers/net/phy/phylink.ko] undefined!
Hi Sean,
I love your patch! Yet something to improve:
[auto build test ERROR on net-next/master]
[also build test ERROR on net/master horms-ipvs/master linus/master v5.19-rc7 next-20220719]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Sean-Anderson/net-phy-Add-support-for-rate-adaptation/20220720-075438
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 1f17708b47a99ca5bcad594a6f8d14cb016edfd2
config: i386-randconfig-a005 (https://download.01.org/0day-ci/archive/20220720/202207201727.9nqTCybA-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-3) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/a17fd2b01914c1c5779a76167def6910a6dd1185
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Sean-Anderson/net-phy-Add-support-for-rate-adaptation/20220720-075438
git checkout a17fd2b01914c1c5779a76167def6910a6dd1185
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "phy_rate_adaptation_to_str" [drivers/net/phy/phylink.ko] undefined!
On Wed, Jul 20, 2022 at 07:50:52AM +0100, Russell King (Oracle) wrote: > We can do that by storing the PHY rate adaption state, and processing > that in phylink_link_up(). Something like this? I haven't included the IPG (open loop) stuff in this - I think when we come to implement that, we need a new mac method to call to set the IPG just before calling mac_link_up(). Something like: void mac_set_ipg(struct phylink_config *config, int packet_speed, int interface_speed); Note that we also have PCS that do rate adaption too, and I think fitting those in with the code below may be easier than storing the media and phy interface speed/duplex separately. A few further question though - does rate adaption make sense with interface modes that can already natively handle the different speeds, such as SGMII, RGMII, USXGMII, etc? drivers/net/phy/phylink.c | 103 ++++++++++++++++++++++++++++++++++++++++++++-- include/linux/phylink.h | 1 + 2 files changed, 100 insertions(+), 4 deletions(-) diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 9bd69328dc4d..c89eb74458cd 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -994,23 +994,105 @@ static const char *phylink_pause_to_str(int pause) } } +static int phylink_interface_max_speed(phy_interface_t interface) +{ + switch (interface) { + case PHY_INTERFACE_MODE_100BASEX: + case PHY_INTERFACE_MODE_REVRMII: + case PHY_INTERFACE_MODE_RMII: + case PHY_INTERFACE_MODE_SMII: + case PHY_INTERFACE_MODE_REVMII: + case PHY_INTERFACE_MODE_MII: + return SPEED_100; + + case PHY_INTERFACE_MODE_TBI: + case PHY_INTERFACE_MODE_MOCA: + case PHY_INTERFACE_MODE_RTBI: + case PHY_INTERFACE_MODE_1000BASEX: + case PHY_INTERFACE_MODE_1000BASEKX: + case PHY_INTERFACE_MODE_TRGMII: + case PHY_INTERFACE_MODE_RGMII_TXID: + case PHY_INTERFACE_MODE_RGMII_RXID: + case PHY_INTERFACE_MODE_RGMII_ID: + case PHY_INTERFACE_MODE_RGMII: + case PHY_INTERFACE_MODE_QSGMII: + case PHY_INTERFACE_MODE_SGMII: + case PHY_INTERFACE_MODE_GMII: + return SPEED_1000; + + case PHY_INTERFACE_MODE_2500BASEX: + return SPEED_2500; + + case PHY_INTERFACE_MODE_5GBASER: + return SPEED_5000; + + case PHY_INTERFACE_MODE_XGMII: + case PHY_INTERFACE_MODE_RXAUI: + case PHY_INTERFACE_MODE_XAUI: + case PHY_INTERFACE_MODE_10GBASER: + case PHY_INTERFACE_MODE_10GKR: + case PHY_INTERFACE_MODE_USXGMII: + return SPEED_10000; + + case PHY_INTERFACE_MODE_25GBASER: + return SPEED_25000; + + case PHY_INTERFACE_MODE_XLGMII: + return SPEED_40000; + + case PHY_INTERFACE_MODE_INTERNAL: + /* Rate adaption is probably not supported */ + return 0; + + case PHY_INTERFACE_MODE_NA: + case PHY_INTERFACE_MODE_MAX: + return SPEED_UNKNOWN; + } +} + static void phylink_link_up(struct phylink *pl, struct phylink_link_state link_state) { struct net_device *ndev = pl->netdev; + int speed, duplex; + bool rx_pause; + + speed = link_state.speed; + duplex = link_state.duplex; + rx_pause = !!(link_state.pause & MLO_PAUSE_RX); + + switch (state->rate_adaption) { + case RATE_ADAPT_PAUSE: + /* The PHY is doing rate adaption from the media rate (in + * the link_state) to the interface speed, and will send + * pause frames to the MAC to limit its transmission speed. + */ + speed = phylink_interface_max_speed(link_state.interface); + duplex = DUPLEX_FULL; + rx_pause = true; + break; + + case RATE_ADAPT_CRS: + /* The PHY is doing rate adaption from the media rate (in + * the link_state) to the interface speed, and will cause + * collisions to the MAC to limit its transmission speed. + */ + speed = phylink_interface_max_speed(link_state.interface); + duplex = DUPLEX_HALF; + break; + } pl->cur_interface = link_state.interface; if (pl->pcs && pl->pcs->ops->pcs_link_up) pl->pcs->ops->pcs_link_up(pl->pcs, pl->cur_link_an_mode, - pl->cur_interface, - link_state.speed, link_state.duplex); + pl->cur_interface, speed, duplex); pl->mac_ops->mac_link_up(pl->config, pl->phydev, pl->cur_link_an_mode, pl->cur_interface, - link_state.speed, link_state.duplex, + speed, duplex, !!(link_state.pause & MLO_PAUSE_TX), - !!(link_state.pause & MLO_PAUSE_RX)); + rx_pause); if (ndev) netif_carrier_on(ndev); @@ -1102,6 +1184,17 @@ static void phylink_resolve(struct work_struct *w) } link_state.interface = pl->phy_state.interface; + /* If we are doing rate adaption, then the + * media speed/duplex has to come from the PHY. + */ + if (pl->phy_state.rate_adaption) { + link_state.rate_adaption = + pl->phy_state.rate_adaption; + link_state.speed = pl->phy_state.speed; + link_state.duplex = + pl->phy_state.duplex; + } + /* If we have a PHY, we need to update with * the PHY flow control bits. */ @@ -1337,6 +1430,7 @@ static void phylink_phy_change(struct phy_device *phydev, bool up) pl->phy_state.speed = phydev->speed; pl->phy_state.duplex = phydev->duplex; pl->phy_state.pause = MLO_PAUSE_NONE; + pl->phy_state.rate_adaption = phydev->rate_adaption; if (tx_pause) pl->phy_state.pause |= MLO_PAUSE_TX; if (rx_pause) @@ -1414,6 +1508,7 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy, pl->phy_state.pause = MLO_PAUSE_NONE; pl->phy_state.speed = SPEED_UNKNOWN; pl->phy_state.duplex = DUPLEX_UNKNOWN; + pl->phy_state.rate_adaption = RATE_ADAPT_NONE; linkmode_copy(pl->supported, supported); linkmode_copy(pl->link_config.advertising, config.advertising); diff --git a/include/linux/phylink.h b/include/linux/phylink.h index 6d06896fc20d..65301e7961b0 100644 --- a/include/linux/phylink.h +++ b/include/linux/phylink.h @@ -70,6 +70,7 @@ struct phylink_link_state { int speed; int duplex; int pause; + int rate_adaption; unsigned int link:1; unsigned int an_enabled:1; unsigned int an_complete:1;
On 7/20/22 2:50 AM, Russell King (Oracle) wrote: > On Tue, Jul 19, 2022 at 07:49:57PM -0400, Sean Anderson wrote: >> If the phy is configured to use pause-based rate adaptation, ensure that >> the link is full duplex with pause frame reception enabled. As >> suggested, if pause-based rate adaptation is enabled by the phy, then >> pause reception is unconditionally enabled. >> >> The interface duplex is determined based on the rate adaptation type. >> When rate adaptation is enabled, so is the speed. We assume the maximum >> interface speed is used. This is only relevant for MLO_AN_PHY. For >> MLO_AN_INBAND, the MAC/PCS's view of the interface speed will be used. >> >> Signed-off-by: Sean Anderson <sean.anderson@seco.com> >> --- >> >> Changes in v2: >> - Use the phy's rate adaptation setting to determine whether to use its >> link speed/duplex or the MAC's speed/duplex with MLO_AN_INBAND. >> - Always use the rate adaptation setting to determine the interface >> speed/duplex (instead of sometimes using the interface mode). >> >> drivers/net/phy/phylink.c | 126 ++++++++++++++++++++++++++++++++++---- >> include/linux/phylink.h | 1 + >> 2 files changed, 114 insertions(+), 13 deletions(-) >> >> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c >> index da0623d94a64..619ef553476f 100644 >> --- a/drivers/net/phy/phylink.c >> +++ b/drivers/net/phy/phylink.c >> @@ -160,16 +160,93 @@ static const char *phylink_an_mode_str(unsigned int mode) >> * @state: A link state >> * >> * Update the .speed and .duplex members of @state. We can determine them based >> - * on the .link_speed and .link_duplex. This function should be called whenever >> - * .link_speed and .link_duplex are updated. For example, userspace deals with >> - * link speed and duplex, and not the interface speed and duplex. Similarly, >> - * phys deal with link speed and duplex and only implicitly the interface speed >> - * and duplex. >> + * on the .link_speed, .link_duplex, .interface, and .rate_adaptation. This >> + * function should be called whenever .link_speed and .link_duplex are updated. >> + * For example, userspace deals with link speed and duplex, and not the >> + * interface speed and duplex. Similarly, phys deal with link speed and duplex >> + * and only implicitly the interface speed and duplex. >> */ >> static void phylink_state_fill_speed_duplex(struct phylink_link_state *state) >> { >> - state->speed = state->link_speed; >> - state->duplex = state->link_duplex; >> + switch (state->rate_adaptation) { >> + case RATE_ADAPT_NONE: >> + state->speed = state->link_speed; >> + state->duplex = state->link_duplex; >> + return; >> + case RATE_ADAPT_PAUSE: >> + state->duplex = DUPLEX_FULL; >> + break; >> + case RATE_ADAPT_CRS: >> + state->duplex = DUPLEX_HALF; >> + break; >> + case RATE_ADAPT_OPEN_LOOP: >> + state->duplex = state->link_duplex; >> + break; >> + } >> + >> + /* Use the max speed of the interface */ >> + switch (state->interface) { >> + case PHY_INTERFACE_MODE_100BASEX: >> + case PHY_INTERFACE_MODE_REVRMII: >> + case PHY_INTERFACE_MODE_RMII: >> + case PHY_INTERFACE_MODE_SMII: >> + case PHY_INTERFACE_MODE_REVMII: >> + case PHY_INTERFACE_MODE_MII: >> + state->speed = SPEED_100; >> + return; >> + >> + case PHY_INTERFACE_MODE_TBI: >> + case PHY_INTERFACE_MODE_MOCA: >> + case PHY_INTERFACE_MODE_RTBI: >> + case PHY_INTERFACE_MODE_1000BASEX: >> + case PHY_INTERFACE_MODE_1000BASEKX: >> + case PHY_INTERFACE_MODE_TRGMII: >> + case PHY_INTERFACE_MODE_RGMII_TXID: >> + case PHY_INTERFACE_MODE_RGMII_RXID: >> + case PHY_INTERFACE_MODE_RGMII_ID: >> + case PHY_INTERFACE_MODE_RGMII: >> + case PHY_INTERFACE_MODE_QSGMII: >> + case PHY_INTERFACE_MODE_SGMII: >> + case PHY_INTERFACE_MODE_GMII: >> + state->speed = SPEED_1000; >> + return; >> + >> + case PHY_INTERFACE_MODE_2500BASEX: >> + state->speed = SPEED_2500; >> + return; >> + >> + case PHY_INTERFACE_MODE_5GBASER: >> + state->speed = SPEED_5000; >> + return; >> + >> + case PHY_INTERFACE_MODE_XGMII: >> + case PHY_INTERFACE_MODE_RXAUI: >> + case PHY_INTERFACE_MODE_XAUI: >> + case PHY_INTERFACE_MODE_10GBASER: >> + case PHY_INTERFACE_MODE_10GKR: >> + case PHY_INTERFACE_MODE_USXGMII: >> + state->speed = SPEED_10000; >> + return; >> + >> + case PHY_INTERFACE_MODE_25GBASER: >> + state->speed = SPEED_25000; >> + return; >> + >> + case PHY_INTERFACE_MODE_XLGMII: >> + state->speed = SPEED_40000; >> + return; >> + >> + case PHY_INTERFACE_MODE_INTERNAL: >> + state->speed = state->link_speed; >> + return; >> + >> + case PHY_INTERFACE_MODE_NA: >> + case PHY_INTERFACE_MODE_MAX: >> + state->speed = SPEED_UNKNOWN; >> + return; >> + } >> + >> + WARN_ON(1); >> } >> >> /** >> @@ -803,11 +880,12 @@ static void phylink_mac_config(struct phylink *pl, >> const struct phylink_link_state *state) >> { >> phylink_dbg(pl, >> - "%s: mode=%s/%s/%s/%s adv=%*pb pause=%02x link=%u an=%u\n", >> + "%s: mode=%s/%s/%s/%s/%s adv=%*pb pause=%02x link=%u an=%u\n", >> __func__, phylink_an_mode_str(pl->cur_link_an_mode), >> phy_modes(state->interface), >> phy_speed_to_str(state->speed), >> phy_duplex_to_str(state->duplex), >> + phy_rate_adaptation_to_str(state->rate_adaptation), >> __ETHTOOL_LINK_MODE_MASK_NBITS, state->advertising, >> state->pause, state->link, state->an_enabled); >> >> @@ -944,6 +1022,7 @@ static void phylink_mac_pcs_get_state(struct phylink *pl, >> linkmode_zero(state->lp_advertising); >> state->interface = pl->link_config.interface; >> state->an_enabled = pl->link_config.an_enabled; >> + state->rate_adaptation = pl->link_config.rate_adaptation; >> if (state->an_enabled) { >> state->link_speed = SPEED_UNKNOWN; >> state->link_duplex = DUPLEX_UNKNOWN; >> @@ -968,8 +1047,10 @@ static void phylink_mac_pcs_get_state(struct phylink *pl, >> else >> state->link = 0; >> >> - state->link_speed = state->speed; >> - state->link_duplex = state->duplex; >> + if (state->rate_adaptation == RATE_ADAPT_NONE) { >> + state->link_speed = state->speed; >> + state->link_duplex = state->duplex; >> + } > > So we need to have every PCS driver be udpated to fill in link_speed > and link_duplex if rate_adaption != none. The PCS doesn't know what the link speed/duplex is. If rate adaptation is enabled, then the PCS only knows what the interface speed/duplex is. > There's got to be a better way - maybe what I suggested in the last > round of only doing the rate adaption thing in the link_up() functions, > since that seems to be the only real difference. > > I'm not even sure we need to do that - in the "open loop" case, we > need to be passing the media speed to the MAC driver with the knowledge > that it should be increasing the IPG. > > So, I'm thinking we don't want any of these changes, what we instead > should be doing is passing the media speed/duplex and the interface > speed/duplex to the PCS and MAC. > We can do that by storing the PHY rate adaption state, and processing > that in phylink_link_up(). This approach sounds better. You patch below looks good. I'll test it and use it for v3. --Sean
On 7/20/22 2:32 PM, Russell King (Oracle) wrote: > On Wed, Jul 20, 2022 at 07:50:52AM +0100, Russell King (Oracle) wrote: >> We can do that by storing the PHY rate adaption state, and processing >> that in phylink_link_up(). > > Something like this? I haven't included the IPG (open loop) stuff in > this - I think when we come to implement that, we need a new mac > method to call to set the IPG just before calling mac_link_up(). > Something like: > > void mac_set_ipg(struct phylink_config *config, int packet_speed, > int interface_speed); > > Note that we also have PCS that do rate adaption too, and I think > fitting those in with the code below may be easier than storing the > media and phy interface speed/duplex separately. This is another area where the MAC has to know a lot about the PCS. We don't keep track of the PCS interface mode, so the MAC has to know how to connect to the PCS. That could already include some rate adaptation, but I suspect it is all done like GMII (where the clock speed changes). The only drawback I see with this approach is that we don't use the MAC/PCS's speed/duplex when in in-band mode. But I think that only matters for things like SGMII, which (as noted below) probably shouldn't use rate adaptation. > A few further question though - does rate adaption make sense with > interface modes that can already natively handle the different speeds, > such as SGMII, RGMII, USXGMII, etc? Generally, no. I think it's reasonable to let the phy decide what's going on, and just do whatever it says. > drivers/net/phy/phylink.c | 103 ++++++++++++++++++++++++++++++++++++++++++++-- > include/linux/phylink.h | 1 + > 2 files changed, 100 insertions(+), 4 deletions(-) > > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c > index 9bd69328dc4d..c89eb74458cd 100644 > --- a/drivers/net/phy/phylink.c > +++ b/drivers/net/phy/phylink.c > @@ -994,23 +994,105 @@ static const char *phylink_pause_to_str(int pause) > } > } > > +static int phylink_interface_max_speed(phy_interface_t interface) > +{ > + switch (interface) { > + case PHY_INTERFACE_MODE_100BASEX: > + case PHY_INTERFACE_MODE_REVRMII: > + case PHY_INTERFACE_MODE_RMII: > + case PHY_INTERFACE_MODE_SMII: > + case PHY_INTERFACE_MODE_REVMII: > + case PHY_INTERFACE_MODE_MII: > + return SPEED_100; > + > + case PHY_INTERFACE_MODE_TBI: > + case PHY_INTERFACE_MODE_MOCA: > + case PHY_INTERFACE_MODE_RTBI: > + case PHY_INTERFACE_MODE_1000BASEX: > + case PHY_INTERFACE_MODE_1000BASEKX: > + case PHY_INTERFACE_MODE_TRGMII: > + case PHY_INTERFACE_MODE_RGMII_TXID: > + case PHY_INTERFACE_MODE_RGMII_RXID: > + case PHY_INTERFACE_MODE_RGMII_ID: > + case PHY_INTERFACE_MODE_RGMII: > + case PHY_INTERFACE_MODE_QSGMII: > + case PHY_INTERFACE_MODE_SGMII: > + case PHY_INTERFACE_MODE_GMII: > + return SPEED_1000; > + > + case PHY_INTERFACE_MODE_2500BASEX: > + return SPEED_2500; > + > + case PHY_INTERFACE_MODE_5GBASER: > + return SPEED_5000; > + > + case PHY_INTERFACE_MODE_XGMII: > + case PHY_INTERFACE_MODE_RXAUI: > + case PHY_INTERFACE_MODE_XAUI: > + case PHY_INTERFACE_MODE_10GBASER: > + case PHY_INTERFACE_MODE_10GKR: > + case PHY_INTERFACE_MODE_USXGMII: > + return SPEED_10000; > + > + case PHY_INTERFACE_MODE_25GBASER: > + return SPEED_25000; > + > + case PHY_INTERFACE_MODE_XLGMII: > + return SPEED_40000; > + > + case PHY_INTERFACE_MODE_INTERNAL: > + /* Rate adaption is probably not supported */ > + return 0; > + > + case PHY_INTERFACE_MODE_NA: > + case PHY_INTERFACE_MODE_MAX: > + return SPEED_UNKNOWN; > + } > +} > + > static void phylink_link_up(struct phylink *pl, > struct phylink_link_state link_state) > { > struct net_device *ndev = pl->netdev; > + int speed, duplex; > + bool rx_pause; > + > + speed = link_state.speed; > + duplex = link_state.duplex; > + rx_pause = !!(link_state.pause & MLO_PAUSE_RX); > + > + switch (state->rate_adaption) { > + case RATE_ADAPT_PAUSE: > + /* The PHY is doing rate adaption from the media rate (in > + * the link_state) to the interface speed, and will send > + * pause frames to the MAC to limit its transmission speed. > + */ > + speed = phylink_interface_max_speed(link_state.interface); > + duplex = DUPLEX_FULL; > + rx_pause = true; > + break; > + > + case RATE_ADAPT_CRS: > + /* The PHY is doing rate adaption from the media rate (in > + * the link_state) to the interface speed, and will cause > + * collisions to the MAC to limit its transmission speed. > + */ > + speed = phylink_interface_max_speed(link_state.interface); > + duplex = DUPLEX_HALF; > + break; > + } > > pl->cur_interface = link_state.interface; > > if (pl->pcs && pl->pcs->ops->pcs_link_up) > pl->pcs->ops->pcs_link_up(pl->pcs, pl->cur_link_an_mode, > - pl->cur_interface, > - link_state.speed, link_state.duplex); > + pl->cur_interface, speed, duplex); > > pl->mac_ops->mac_link_up(pl->config, pl->phydev, > pl->cur_link_an_mode, pl->cur_interface, > - link_state.speed, link_state.duplex, > + speed, duplex, > !!(link_state.pause & MLO_PAUSE_TX), > - !!(link_state.pause & MLO_PAUSE_RX)); > + rx_pause); > > if (ndev) > netif_carrier_on(ndev); > @@ -1102,6 +1184,17 @@ static void phylink_resolve(struct work_struct *w) > } > link_state.interface = pl->phy_state.interface; > > + /* If we are doing rate adaption, then the > + * media speed/duplex has to come from the PHY. > + */ > + if (pl->phy_state.rate_adaption) { > + link_state.rate_adaption = > + pl->phy_state.rate_adaption; > + link_state.speed = pl->phy_state.speed; > + link_state.duplex = > + pl->phy_state.duplex; > + } > + > /* If we have a PHY, we need to update with > * the PHY flow control bits. > */ > @@ -1337,6 +1430,7 @@ static void phylink_phy_change(struct phy_device *phydev, bool up) > pl->phy_state.speed = phydev->speed; > pl->phy_state.duplex = phydev->duplex; > pl->phy_state.pause = MLO_PAUSE_NONE; > + pl->phy_state.rate_adaption = phydev->rate_adaption; > if (tx_pause) > pl->phy_state.pause |= MLO_PAUSE_TX; > if (rx_pause) > @@ -1414,6 +1508,7 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy, > pl->phy_state.pause = MLO_PAUSE_NONE; > pl->phy_state.speed = SPEED_UNKNOWN; > pl->phy_state.duplex = DUPLEX_UNKNOWN; > + pl->phy_state.rate_adaption = RATE_ADAPT_NONE; > linkmode_copy(pl->supported, supported); > linkmode_copy(pl->link_config.advertising, config.advertising); > > diff --git a/include/linux/phylink.h b/include/linux/phylink.h > index 6d06896fc20d..65301e7961b0 100644 > --- a/include/linux/phylink.h > +++ b/include/linux/phylink.h > @@ -70,6 +70,7 @@ struct phylink_link_state { > int speed; > int duplex; > int pause; > + int rate_adaption; > unsigned int link:1; > unsigned int an_enabled:1; > unsigned int an_complete:1; >
On Thu, Jul 21, 2022 at 05:48:05PM -0400, Sean Anderson wrote: > On 7/20/22 2:32 PM, Russell King (Oracle) wrote: > > On Wed, Jul 20, 2022 at 07:50:52AM +0100, Russell King (Oracle) wrote: > >> We can do that by storing the PHY rate adaption state, and processing > >> that in phylink_link_up(). > > > > Something like this? I haven't included the IPG (open loop) stuff in > > this - I think when we come to implement that, we need a new mac > > method to call to set the IPG just before calling mac_link_up(). > > Something like: > > > > void mac_set_ipg(struct phylink_config *config, int packet_speed, > > int interface_speed); > > > > Note that we also have PCS that do rate adaption too, and I think > > fitting those in with the code below may be easier than storing the > > media and phy interface speed/duplex separately. > > This is another area where the MAC has to know a lot about the PCS. > We don't keep track of the PCS interface mode, so the MAC has to know > how to connect to the PCS. That could already include some rate > adaptation, but I suspect it is all done like GMII (where the clock > speed changes). In many cases, we don't even know what the interface used to connect the PCS to the MAC actually is (we'd have to use something like _INTERNAL). Particularly when the PCS and MAC are integrated on the same die, manufacturers tend not to tell people how the two blocks are connected. Even if we assume did use GMII internally for everything (and I mean everything), then decoding the GMII interface mode to mean SPEED_1000 won't work for anything over 1G speeds - so we can't do that. The more I think about it, the less meaning the interface mode between the PCS and MAC is for our purposes - unless we positively know for every device what that mode is, and can reliably translate that into the speed of that connection to derive the correct "speed" for the MAC. The point of bringing this up was just to bear it in mind, and I think when we add support for this, then... > > static void phylink_link_up(struct phylink *pl, > > struct phylink_link_state link_state) > > { > > struct net_device *ndev = pl->netdev; > > + int speed, duplex; > > + bool rx_pause; > > + > > + speed = link_state.speed; > > + duplex = link_state.duplex; > > + rx_pause = !!(link_state.pause & MLO_PAUSE_RX); > > + > > + switch (state->rate_adaption) { > > + case RATE_ADAPT_PAUSE: > > + /* The PHY is doing rate adaption from the media rate (in > > + * the link_state) to the interface speed, and will send > > + * pause frames to the MAC to limit its transmission speed. > > + */ > > + speed = phylink_interface_max_speed(link_state.interface); > > + duplex = DUPLEX_FULL; > > + rx_pause = true; > > + break; > > + > > + case RATE_ADAPT_CRS: > > + /* The PHY is doing rate adaption from the media rate (in > > + * the link_state) to the interface speed, and will cause > > + * collisions to the MAC to limit its transmission speed. > > + */ > > + speed = phylink_interface_max_speed(link_state.interface); > > + duplex = DUPLEX_HALF; > > + break; > > + } > > > > pl->cur_interface = link_state.interface; > > > > if (pl->pcs && pl->pcs->ops->pcs_link_up) > > pl->pcs->ops->pcs_link_up(pl->pcs, pl->cur_link_an_mode, > > - pl->cur_interface, > > - link_state.speed, link_state.duplex); > > + pl->cur_interface, speed, duplex); > > ... we would want to update the speed, duplex and rx_pause parameters here for the MAC. > > pl->mac_ops->mac_link_up(pl->config, pl->phydev, > > pl->cur_link_an_mode, pl->cur_interface, > > - link_state.speed, link_state.duplex, > > + speed, duplex, > > !!(link_state.pause & MLO_PAUSE_TX), > > - !!(link_state.pause & MLO_PAUSE_RX)); > > + rx_pause);
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index da0623d94a64..619ef553476f 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -160,16 +160,93 @@ static const char *phylink_an_mode_str(unsigned int mode) * @state: A link state * * Update the .speed and .duplex members of @state. We can determine them based - * on the .link_speed and .link_duplex. This function should be called whenever - * .link_speed and .link_duplex are updated. For example, userspace deals with - * link speed and duplex, and not the interface speed and duplex. Similarly, - * phys deal with link speed and duplex and only implicitly the interface speed - * and duplex. + * on the .link_speed, .link_duplex, .interface, and .rate_adaptation. This + * function should be called whenever .link_speed and .link_duplex are updated. + * For example, userspace deals with link speed and duplex, and not the + * interface speed and duplex. Similarly, phys deal with link speed and duplex + * and only implicitly the interface speed and duplex. */ static void phylink_state_fill_speed_duplex(struct phylink_link_state *state) { - state->speed = state->link_speed; - state->duplex = state->link_duplex; + switch (state->rate_adaptation) { + case RATE_ADAPT_NONE: + state->speed = state->link_speed; + state->duplex = state->link_duplex; + return; + case RATE_ADAPT_PAUSE: + state->duplex = DUPLEX_FULL; + break; + case RATE_ADAPT_CRS: + state->duplex = DUPLEX_HALF; + break; + case RATE_ADAPT_OPEN_LOOP: + state->duplex = state->link_duplex; + break; + } + + /* Use the max speed of the interface */ + switch (state->interface) { + case PHY_INTERFACE_MODE_100BASEX: + case PHY_INTERFACE_MODE_REVRMII: + case PHY_INTERFACE_MODE_RMII: + case PHY_INTERFACE_MODE_SMII: + case PHY_INTERFACE_MODE_REVMII: + case PHY_INTERFACE_MODE_MII: + state->speed = SPEED_100; + return; + + case PHY_INTERFACE_MODE_TBI: + case PHY_INTERFACE_MODE_MOCA: + case PHY_INTERFACE_MODE_RTBI: + case PHY_INTERFACE_MODE_1000BASEX: + case PHY_INTERFACE_MODE_1000BASEKX: + case PHY_INTERFACE_MODE_TRGMII: + case PHY_INTERFACE_MODE_RGMII_TXID: + case PHY_INTERFACE_MODE_RGMII_RXID: + case PHY_INTERFACE_MODE_RGMII_ID: + case PHY_INTERFACE_MODE_RGMII: + case PHY_INTERFACE_MODE_QSGMII: + case PHY_INTERFACE_MODE_SGMII: + case PHY_INTERFACE_MODE_GMII: + state->speed = SPEED_1000; + return; + + case PHY_INTERFACE_MODE_2500BASEX: + state->speed = SPEED_2500; + return; + + case PHY_INTERFACE_MODE_5GBASER: + state->speed = SPEED_5000; + return; + + case PHY_INTERFACE_MODE_XGMII: + case PHY_INTERFACE_MODE_RXAUI: + case PHY_INTERFACE_MODE_XAUI: + case PHY_INTERFACE_MODE_10GBASER: + case PHY_INTERFACE_MODE_10GKR: + case PHY_INTERFACE_MODE_USXGMII: + state->speed = SPEED_10000; + return; + + case PHY_INTERFACE_MODE_25GBASER: + state->speed = SPEED_25000; + return; + + case PHY_INTERFACE_MODE_XLGMII: + state->speed = SPEED_40000; + return; + + case PHY_INTERFACE_MODE_INTERNAL: + state->speed = state->link_speed; + return; + + case PHY_INTERFACE_MODE_NA: + case PHY_INTERFACE_MODE_MAX: + state->speed = SPEED_UNKNOWN; + return; + } + + WARN_ON(1); } /** @@ -803,11 +880,12 @@ static void phylink_mac_config(struct phylink *pl, const struct phylink_link_state *state) { phylink_dbg(pl, - "%s: mode=%s/%s/%s/%s adv=%*pb pause=%02x link=%u an=%u\n", + "%s: mode=%s/%s/%s/%s/%s adv=%*pb pause=%02x link=%u an=%u\n", __func__, phylink_an_mode_str(pl->cur_link_an_mode), phy_modes(state->interface), phy_speed_to_str(state->speed), phy_duplex_to_str(state->duplex), + phy_rate_adaptation_to_str(state->rate_adaptation), __ETHTOOL_LINK_MODE_MASK_NBITS, state->advertising, state->pause, state->link, state->an_enabled); @@ -944,6 +1022,7 @@ static void phylink_mac_pcs_get_state(struct phylink *pl, linkmode_zero(state->lp_advertising); state->interface = pl->link_config.interface; state->an_enabled = pl->link_config.an_enabled; + state->rate_adaptation = pl->link_config.rate_adaptation; if (state->an_enabled) { state->link_speed = SPEED_UNKNOWN; state->link_duplex = DUPLEX_UNKNOWN; @@ -968,8 +1047,10 @@ static void phylink_mac_pcs_get_state(struct phylink *pl, else state->link = 0; - state->link_speed = state->speed; - state->link_duplex = state->duplex; + if (state->rate_adaptation == RATE_ADAPT_NONE) { + state->link_speed = state->speed; + state->link_duplex = state->duplex; + } } /* The fixed state is... fixed except for the link state, @@ -1043,6 +1124,8 @@ static void phylink_link_up(struct phylink *pl, struct net_device *ndev = pl->netdev; pl->cur_interface = link_state.interface; + if (link_state.rate_adaptation == RATE_ADAPT_PAUSE) + link_state.pause |= MLO_PAUSE_RX; if (pl->pcs && pl->pcs->ops->pcs_link_up) pl->pcs->ops->pcs_link_up(pl->pcs, pl->cur_link_an_mode, @@ -1145,6 +1228,18 @@ static void phylink_resolve(struct work_struct *w) } link_state.interface = pl->phy_state.interface; + /* If we are doing rate adaptation, then the + * link speed/duplex comes from the PHY + */ + if (pl->phy_state.rate_adaptation) { + link_state.rate_adaptation = + pl->phy_state.rate_adaptation; + link_state.link_speed = + pl->phy_state.link_speed; + link_state.link_duplex = + pl->phy_state.link_duplex; + } + /* If we have a PHY, we need to update with * the PHY flow control bits. */ @@ -1380,6 +1475,7 @@ static void phylink_phy_change(struct phy_device *phydev, bool up) mutex_lock(&pl->state_mutex); pl->phy_state.link_speed = phydev->speed; pl->phy_state.link_duplex = phydev->duplex; + pl->phy_state.rate_adaptation = phydev->rate_adaptation; pl->phy_state.pause = MLO_PAUSE_NONE; if (tx_pause) pl->phy_state.pause |= MLO_PAUSE_TX; @@ -1392,10 +1488,11 @@ static void phylink_phy_change(struct phy_device *phydev, bool up) phylink_run_resolve(pl); - phylink_dbg(pl, "phy link %s %s/%s/%s/%s\n", up ? "up" : "down", + phylink_dbg(pl, "phy link %s %s/%s/%s/%s/%s\n", up ? "up" : "down", phy_modes(phydev->interface), phy_speed_to_str(phydev->speed), phy_duplex_to_str(phydev->duplex), + phy_rate_adaptation_to_str(phydev->rate_adaptation), phylink_pause_to_str(pl->phy_state.pause)); } @@ -1459,6 +1556,7 @@ static int phylink_bringup_phy(struct phylink *pl, struct phy_device *phy, pl->phy_state.pause = MLO_PAUSE_NONE; pl->phy_state.link_speed = SPEED_UNKNOWN; pl->phy_state.link_duplex = DUPLEX_UNKNOWN; + pl->phy_state.rate_adaptation = RATE_ADAPT_NONE; phylink_state_fill_speed_duplex(&pl->phy_state); linkmode_copy(pl->supported, supported); linkmode_copy(pl->link_config.advertising, config.advertising); @@ -1902,8 +2000,10 @@ static void phylink_get_ksettings(const struct phylink_link_state *state, { phylink_merge_link_mode(kset->link_modes.advertising, state->advertising); linkmode_copy(kset->link_modes.lp_advertising, state->lp_advertising); - kset->base.speed = state->link_speed; - kset->base.duplex = state->link_duplex; + if (kset->base.rate_adaptation == RATE_ADAPT_NONE) { + kset->base.speed = state->link_speed; + kset->base.duplex = state->link_duplex; + } kset->base.autoneg = state->an_enabled ? AUTONEG_ENABLE : AUTONEG_DISABLE; } diff --git a/include/linux/phylink.h b/include/linux/phylink.h index ab5edc1e5330..f32b97f27ddc 100644 --- a/include/linux/phylink.h +++ b/include/linux/phylink.h @@ -80,6 +80,7 @@ struct phylink_link_state { int duplex; int link_duplex; int pause; + int rate_adaptation; unsigned int link:1; unsigned int an_enabled:1; unsigned int an_complete:1;
If the phy is configured to use pause-based rate adaptation, ensure that the link is full duplex with pause frame reception enabled. As suggested, if pause-based rate adaptation is enabled by the phy, then pause reception is unconditionally enabled. The interface duplex is determined based on the rate adaptation type. When rate adaptation is enabled, so is the speed. We assume the maximum interface speed is used. This is only relevant for MLO_AN_PHY. For MLO_AN_INBAND, the MAC/PCS's view of the interface speed will be used. Signed-off-by: Sean Anderson <sean.anderson@seco.com> --- Changes in v2: - Use the phy's rate adaptation setting to determine whether to use its link speed/duplex or the MAC's speed/duplex with MLO_AN_INBAND. - Always use the rate adaptation setting to determine the interface speed/duplex (instead of sometimes using the interface mode). drivers/net/phy/phylink.c | 126 ++++++++++++++++++++++++++++++++++---- include/linux/phylink.h | 1 + 2 files changed, 114 insertions(+), 13 deletions(-)