Message ID | 20231227231657.15152-1-asmaa@nvidia.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v2,1/1] net: phy: micrel: Add workaround for incomplete autonegotiation | expand |
On 12/28/2023 12:16 AM, Asmaa Mnebhi wrote: > Very rarely, the KSZ9031 fails to complete autonegotiation although it was > initiated via phy_start(). As a result, the link stays down. Restarting > autonegotiation when in this state solves the issue. > > Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com> Is there a Micrel errata associated with this work around that could be referenced here?
> On 12/28/2023 12:16 AM, Asmaa Mnebhi wrote: > > Very rarely, the KSZ9031 fails to complete autonegotiation although it > > was initiated via phy_start(). As a result, the link stays down. > > Restarting autonegotiation when in this state solves the issue. > > > > Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com> > > Is there a Micrel errata associated with this work around that could be > referenced here? Hi Florian, No there isn’t. This is based on observations and comparison with the behavior and testing of other PHYs. For example, we don’t see this issue with the Vitesse PHY. Thanks. Asmaa
On 28.12.2023 14:37, Asmaa Mnebhi wrote: > > On 12/28/2023 12:16 AM, Asmaa Mnebhi wrote: >>> Very rarely, the KSZ9031 fails to complete autonegotiation although it >>> was initiated via phy_start(). As a result, the link stays down. >>> Restarting autonegotiation when in this state solves the issue. >>> >>> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com> >> >> Is there a Micrel errata associated with this work around that could be >> referenced here? > > Hi Florian, > > No there isn’t. This is based on observations and comparison with the behavior and testing of other PHYs. For example, we don’t see this issue with the Vitesse PHY. > The Microchip KSZ9031 errata documentation lists few link-related errata. May any of these be relevant in your case? If not, please check with Microchip. KSZ9031 isn't new, and most likely we would have seen such reports before, if there's an actual issue. I'd like to avoid that we add code to work around an issue that is specific to your setup. > Thanks. > Asmaa
On 28.12.2023 00:16, Asmaa Mnebhi wrote: > Very rarely, the KSZ9031 fails to complete autonegotiation although it was > initiated via phy_start(). As a result, the link stays down. Restarting > autonegotiation when in this state solves the issue. > The patch isn't addressed to all relevant maintainers. Please use the get_maintainers script. You should use the net/net-next annotation to make clear whether this should be treated as a fix (in this case add a Fixes tag) or net-next material. > Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com> > --- > v1->v2: > - Use msleep() instead of mdelay() > > drivers/net/phy/micrel.c | 17 +++++++++++++++++ > 1 file changed, 17 insertions(+) > > diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c > index 08e3915001c3..9952a073413f 100644 > --- a/drivers/net/phy/micrel.c > +++ b/drivers/net/phy/micrel.c > @@ -1475,6 +1475,7 @@ static int ksz9031_get_features(struct phy_device *phydev) > > static int ksz9031_read_status(struct phy_device *phydev) > { > + u8 timeout = 10; > int err; > int regval; > > @@ -1494,6 +1495,22 @@ static int ksz9031_read_status(struct phy_device *phydev) > return genphy_config_aneg(phydev); > } > > + /* KSZ9031's autonegotiation takes normally 4-5 seconds to complete. > + * Occasionally it fails to complete autonegotiation. The workaround is > + * to restart it. > + */ > + if (phydev->autoneg == AUTONEG_ENABLE) { > + while (timeout) { > + if (phy_aneg_done(phydev)) > + break; > + msleep(1000); > + timeout--; It's not too nice to do this synchronously. Even in the non-problem case this will block the phylib state machine for seconds. Better find a way to do it asynchronously. > + }; > + > + if (timeout == 0) > + phy_restart_aneg(phydev); > + } > + > return 0; > } >
> >> Is there a Micrel errata associated with this work around that could > >> be referenced here? > > > > Hi Florian, > > > > No there isn’t. This is based on observations and comparison with the > behavior and testing of other PHYs. For example, we don’t see this issue with > the Vitesse PHY. > > > The Microchip KSZ9031 errata documentation lists few link-related errata. > May any of these be relevant in your case? If not, please check with Microchip. > KSZ9031 isn't new, and most likely we would have seen such reports before, if > there's an actual issue. > I'd like to avoid that we add code to work around an issue that is specific to > your setup. > Thanks Heiner. I went over the errata and there are couple of issues which could result in the link not coming up: 1) Module 1: Device fails to link after Asymmetric Pause capability is set The micrel.c driver already has a workaround for this and I have verified that when our issue reproduces, only symmetric pause is enabled. 2) Module 5: Auto-Negotiation link-up failure / long link-up time due to default FLP interval setting The micrel.c driver also already has a workaround for this. Apart from the erratas, I see that there were other KSZ9031 issues for which workarounds were needed in the kernel: 1) commit d2fd719bcb0e83cb39cfee22ee800f98a56eceb3 net/phy: micrel: Add workaround for bad autoneg 2) commit c1a8d0a3accf64a014d605e6806ce05d1c17adf1 net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround in our case, I have verified that we don’t stumble upon the above 2 bugs (idle error count is 0x0). Our QA sees that autonegotiation fails to complete after rebooting the system > 2000 times. So it is hard to reproduce. Our OOB MAC is connected to the Micrel KSZ9031, which is connected to a switch. I have checked that phy_start() calls phy_start_aneg() and that the genphy_restart_aneg() sets the BMCR_ANRESTART bit. After that, it doesn’t matter how long we wait, the PHY autonegotiation doesn’t complete and the link is down. Restarting autonegotiation a second time solves the issue. I will share this information with Microchip. I hope they can help. Thanks. Asmaa
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c index 08e3915001c3..9952a073413f 100644 --- a/drivers/net/phy/micrel.c +++ b/drivers/net/phy/micrel.c @@ -1475,6 +1475,7 @@ static int ksz9031_get_features(struct phy_device *phydev) static int ksz9031_read_status(struct phy_device *phydev) { + u8 timeout = 10; int err; int regval; @@ -1494,6 +1495,22 @@ static int ksz9031_read_status(struct phy_device *phydev) return genphy_config_aneg(phydev); } + /* KSZ9031's autonegotiation takes normally 4-5 seconds to complete. + * Occasionally it fails to complete autonegotiation. The workaround is + * to restart it. + */ + if (phydev->autoneg == AUTONEG_ENABLE) { + while (timeout) { + if (phy_aneg_done(phydev)) + break; + msleep(1000); + timeout--; + }; + + if (timeout == 0) + phy_restart_aneg(phydev); + } + return 0; }
Very rarely, the KSZ9031 fails to complete autonegotiation although it was initiated via phy_start(). As a result, the link stays down. Restarting autonegotiation when in this state solves the issue. Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com> --- v1->v2: - Use msleep() instead of mdelay() drivers/net/phy/micrel.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)