[net,1/2] net: phylink: Force link down and retrigger resolve on interface change

Message ID 20211122235154.6392-2-kabel@kernel.org (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/tree_selection           success  Clearly marked for net
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/subject_prefix           success  Link
netdev/cover_letter             success  Series has a cover letter
netdev/patch_count              success  Link
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           fail     1 blamed authors not CCed: linux@armlinux.org.uk; 2 maintainers not CCed: linux@armlinux.org.uk hkallweit1@gmail.com
netdev/build_clang              success  Errors and warnings before: 0 this patch: 0
netdev/module_param             success  Was 0 now: 0
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 0 this patch: 0
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 37 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Marek Behún Nov. 22, 2021, 11:51 p.m. UTC
On PHY state change the phylink_resolve() function can read stale
information from the MAC and report incorrect link speed and duplex to
the kernel message log.

Example with a Marvell 88X3310 PHY connected to a SerDes port on Marvell
88E6393X switch:
- PHY driver triggers a state change because the PHY interface mode
  changed from 10gbase-r to 2500base-x when the copper side dropped in
  speed from 10Gbps to 2.5Gbps, but either the PHY itself hasn't yet
  changed its interface to the host, or the interrupt about loss of
  SerDes link hasn't arrived yet (there can be a delay of several
  milliseconds for this), so we still think that the 10gbase-r mode
  is up
- phylink_resolve()
  - phylink_mac_pcs_get_state()
    - this fills in speed=10g link=up
  - interface mode is updated to 2500base-x but speed is left at 10Gbps
  - phylink_major_config()
    - interface is changed to 2500base-x
  - phylink_link_up()
    - mv88e6xxx_mac_link_up()
      - .port_set_speed_duplex()
        - speed is set to 10Gbps
    - reports "Link is Up - 10Gbps/Full" to dmesg

Afterwards when the interrupt finally arrives for mv88e6xxx, another
resolve is forced in which we get the correct speed from
phylink_mac_pcs_get_state(), but since the interface is not being
changed anymore, we don't call phylink_major_config() but only
phylink_mac_config(), which does not set speed/duplex anymore.

To fix this, we need to force the link down and trigger another resolve
on PHY interface change event.

Fixes: 9525ae83959b ("phylink: add phylink infrastructure")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Marek Behún <kabel@kernel.org>
---
 drivers/net/phy/phylink.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)
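
Note: the sequence described in the commit message can be reproduced
with a small userspace model. What follows is only a rough sketch, not
the kernel code: struct model, resolve() and run_scenario() are
hypothetical names invented for illustration, the speeds are plain
integers in Mbps, and mac_link_dropped and phylink_mac_config() are not
modelled at all. It also assumes the MAC has caught up with the new
speed by the time the second resolve pass runs.

```c
#include <stdbool.h>

enum iface { IFACE_10GBASE_R, IFACE_2500BASE_X };

/* Hypothetical toy state; field names are illustrative only. */
struct model {
	enum iface cur_interface; /* interface the MAC is configured for */
	enum iface phy_interface; /* interface requested by the PHY */
	bool phy_link;            /* PHY reports its link as up */
	bool mac_link;            /* MAC/PCS link state, may be stale */
	int mac_speed;            /* MAC/PCS speed in Mbps, may be stale */
	bool old_link;            /* last link state that was reported */
	int reported_speed;       /* speed given to the last link-up */
};

/* One resolve pass. Returns true if another pass was requested. */
static bool resolve(struct model *m, bool with_fix)
{
	enum iface interface = m->cur_interface;
	int speed = m->mac_speed;   /* possibly stale */
	bool link = m->mac_link;    /* possibly stale */
	bool cur_link = m->old_link;
	bool retrigger = false;

	if (m->phy_link) {
		/* The fix: on interface change, do not trust the MAC
		 * state; report link down and ask for another pass.
		 */
		if (with_fix && interface != m->phy_interface) {
			retrigger = true;
			link = false;
		}
		interface = m->phy_interface;
	}

	if (interface != m->cur_interface) {
		/* Interface change: link goes down, MAC is reconfigured. */
		cur_link = false;
		m->old_link = false;
		m->cur_interface = interface;
	}

	if (link != cur_link) {
		m->old_link = link;
		if (link)
			m->reported_speed = speed; /* "Link is Up" */
	}

	return !link && retrigger;
}

/* Copper drops from 10Gbps to 2.5Gbps while the MAC state lags. */
int run_scenario(bool with_fix)
{
	struct model m = {
		.cur_interface = IFACE_10GBASE_R,
		.phy_interface = IFACE_2500BASE_X,
		.phy_link = true,
		.mac_link = true,
		.mac_speed = 10000,
		.old_link = true,
		.reported_speed = 10000,
	};

	resolve(&m, with_fix);  /* pass 1: MAC state is still stale */
	m.mac_speed = 2500;     /* the MAC catches up */
	resolve(&m, with_fix);  /* pass 2: retriggered or forced */

	return m.reported_speed;
}
```

In this model run_scenario(false) leaves the reported speed at 10000,
because the second pass sees no link transition and never reports the
link up again, while run_scenario(true) ends at 2500, because the first
pass reports the link down and the second pass brings it up with the
fresh speed.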

Comments

Russell King (Oracle) Nov. 23, 2021, 11:20 a.m. UTC | #1
On Tue, Nov 23, 2021 at 12:51:53AM +0100, Marek Behún wrote:
> On PHY state change the phylink_resolve() function can read stale
> information from the MAC and report incorrect link speed and duplex to
> the kernel message log.
> 
> Example with a Marvell 88X3310 PHY connected to a SerDes port on Marvell
> 88E6393X switch:
> - PHY driver triggers state change due to PHY interface mode being
>   changed from 10gbase-r to 2500base-x due to copper change in speed
>   from 10Gbps to 2.5Gbps, but the PHY itself either hasn't yet changed
>   its interface to the host, or the interrupt about loss of SerDes link
>   hadn't arrived yet (there can be a delay of several milliseconds for
>   this), so we still think that the 10gbase-r mode is up
> - phylink_resolve()
>   - phylink_mac_pcs_get_state()
>     - this fills in speed=10g link=up
>   - interface mode is updated to 2500base-x but speed is left at 10Gbps
>   - phylink_major_config()
>     - interface is changed to 2500base-x
>   - phylink_link_up()
>     - mv88e6xxx_mac_link_up()
>       - .port_set_speed_duplex()
>         - speed is set to 10Gbps
>     - reports "Link is Up - 10Gbps/Full" to dmesg
> 
> Afterwards when the interrupt finally arrives for mv88e6xxx, another
> resolve is forced in which we get the correct speed from
> phylink_mac_pcs_get_state(), but since the interface is not being
> changed anymore, we don't call phylink_major_config() but only
> phylink_mac_config(), which does not set speed/duplex anymore.
> 
> To fix this, we need to force the link down and trigger another resolve
> on PHY interface change event.
> 
> Fixes: 9525ae83959b ("phylink: add phylink infrastructure")
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> Signed-off-by: Marek Behún <kabel@kernel.org>

I'm pretty sure someone will highlight that the author of the patch
should be the first sign-off - which doesn't match given the way
you've sent this patch. That probably needs fixing before it's
applied.

Marek Behún Nov. 23, 2021, 12:12 p.m. UTC | #2
On Tue, 23 Nov 2021 11:20:59 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Tue, Nov 23, 2021 at 12:51:53AM +0100, Marek Behún wrote:
> > On PHY state change the phylink_resolve() function can read stale
> > information from the MAC and report incorrect link speed and duplex to
> > the kernel message log.
> > 
> > Example with a Marvell 88X3310 PHY connected to a SerDes port on Marvell
> > 88E6393X switch:
> > - PHY driver triggers state change due to PHY interface mode being
> >   changed from 10gbase-r to 2500base-x due to copper change in speed
> >   from 10Gbps to 2.5Gbps, but the PHY itself either hasn't yet changed
> >   its interface to the host, or the interrupt about loss of SerDes link
> >   hadn't arrived yet (there can be a delay of several milliseconds for
> >   this), so we still think that the 10gbase-r mode is up
> > - phylink_resolve()
> >   - phylink_mac_pcs_get_state()
> >     - this fills in speed=10g link=up
> >   - interface mode is updated to 2500base-x but speed is left at 10Gbps
> >   - phylink_major_config()
> >     - interface is changed to 2500base-x
> >   - phylink_link_up()
> >     - mv88e6xxx_mac_link_up()
> >       - .port_set_speed_duplex()
> >         - speed is set to 10Gbps
> >     - reports "Link is Up - 10Gbps/Full" to dmesg
> > 
> > Afterwards when the interrupt finally arrives for mv88e6xxx, another
> > resolve is forced in which we get the correct speed from
> > phylink_mac_pcs_get_state(), but since the interface is not being
> > changed anymore, we don't call phylink_major_config() but only
> > phylink_mac_config(), which does not set speed/duplex anymore.
> > 
> > To fix this, we need to force the link down and trigger another resolve
> > on PHY interface change event.
> > 
> > Fixes: 9525ae83959b ("phylink: add phylink infrastructure")
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > Signed-off-by: Marek Behún <kabel@kernel.org>  
> 
> I'm pretty sure someone will highlight that the author of the patch
> should be the first sign-off - which doesn't match given the way
> you've sent this patch. That probably needs fixing before it's
> applied.
> 

Hmm. Well you're the author of the patch, I only wrote the commit
message. But I forgot to change --author in git commit. I shall resend
this.

Marek
Patch

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 3603c024109a..5b8b61daeb98 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -963,6 +963,7 @@  static void phylink_resolve(struct work_struct *w)
 	struct phylink_link_state link_state;
 	struct net_device *ndev = pl->netdev;
 	bool mac_config = false;
+	bool retrigger = false;
 	bool cur_link_state;
 
 	mutex_lock(&pl->state_mutex);
@@ -976,6 +977,7 @@  static void phylink_resolve(struct work_struct *w)
 		link_state.link = false;
 	} else if (pl->mac_link_dropped) {
 		link_state.link = false;
+		retrigger = true;
 	} else {
 		switch (pl->cur_link_an_mode) {
 		case MLO_AN_PHY:
@@ -1000,6 +1002,15 @@  static void phylink_resolve(struct work_struct *w)
 
 			/* Only update if the PHY link is up */
 			if (pl->phydev && pl->phy_state.link) {
+				/* If the interface has changed, force a
+				 * link down event if the link isn't already
+				 * down, and re-resolve.
+				 */
+				if (link_state.interface !=
+				    pl->phy_state.interface) {
+					retrigger = true;
+					link_state.link = false;
+				}
 				link_state.interface = pl->phy_state.interface;
 
 				/* If we have a PHY, we need to update with
@@ -1042,7 +1053,7 @@  static void phylink_resolve(struct work_struct *w)
 		else
 			phylink_link_up(pl, link_state);
 	}
-	if (!link_state.link && pl->mac_link_dropped) {
+	if (!link_state.link && retrigger) {
 		pl->mac_link_dropped = false;
 		queue_work(system_power_efficient_wq, &pl->resolve);
 	}