diff mbox series

[v1,1/1] net: phy: micrel: Add workaround for incomplete autonegotiation

Message ID 20231226141903.12040-1-asmaa@nvidia.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series [v1,1/1] net: phy: micrel: Add workaround for incomplete autonegotiation | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1113 this patch: 1113
netdev/cc_maintainers warning 5 maintainers not CCed: edumazet@google.com pabeni@redhat.com kuba@kernel.org hkallweit1@gmail.com andrew@lunn.ch
netdev/build_clang success Errors and warnings before: 1140 this patch: 1140
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1140 this patch: 1140
netdev/checkpatch fail ERROR: code indent should use tabs where possible WARNING: please, no spaces at the start of a line
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Asmaa Mnebhi Dec. 26, 2023, 2:19 p.m. UTC
Very rarely, the KSZ9031 fails to complete autonegotiation although it was
initiated via phy_start(). As a result, the link stays down. Restarting
autonegotiation when in this state solves the issue.

Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
---
 drivers/net/phy/micrel.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Marek Mojík Dec. 27, 2023, 10:19 p.m. UTC | #1
On 12/26/23 15:19, Asmaa Mnebhi wrote:
> Very rarely, the KSZ9031 fails to complete autonegotiation although it was
> initiated via phy_start(). As a result, the link stays down. Restarting
> autonegotiation when in this state solves the issue.
> 
> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>
> ---
>   drivers/net/phy/micrel.c | 17 +++++++++++++++++
>   1 file changed, 17 insertions(+)
> 
> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> index 08e3915001c3..de8140c5907f 100644
> --- a/drivers/net/phy/micrel.c
> +++ b/drivers/net/phy/micrel.c
> @@ -1475,6 +1475,7 @@ static int ksz9031_get_features(struct phy_device *phydev)
>   
>   static int ksz9031_read_status(struct phy_device *phydev)
>   {
> +	u8 timeout = 10;
>   	int err;
>   	int regval;
>   
> @@ -1494,6 +1495,22 @@ static int ksz9031_read_status(struct phy_device *phydev)
>   		return genphy_config_aneg(phydev);
>   	}
>   
> +	/* KSZ9031's autonegotiation takes normally 4-5 seconds to complete.
> +	 * Occasionally it fails to complete autonegotiation. The workaround is
> +	 * to restart it.
> +	 */
> +        if (phydev->autoneg == AUTONEG_ENABLE) {
> +		while (timeout) {
> +			if (phy_aneg_done(phydev))
> +				break;
> +			mdelay(1000);
> +			timeout--;
> +		};
> +
> +		if (timeout == 0)
> +			phy_restart_aneg(phydev);
> +	}
> +
>   	return 0;
>   }


Hi Asmaa, mdelay is busy-wait, so you're unnecessarily blocking cpu core
for 10 seconds, msleep should be used here as explained in the docs 
https://kernel.org/doc/Documentation/timers/timers-howto.txt

Marek
Russell King (Oracle) Jan. 2, 2024, 6:45 p.m. UTC | #2
On Tue, Dec 26, 2023 at 09:19:03AM -0500, Asmaa Mnebhi wrote:
> +	/* KSZ9031's autonegotiation takes normally 4-5 seconds to complete.
> +	 * Occasionally it fails to complete autonegotiation. The workaround is
> +	 * to restart it.
> +	 */
> +        if (phydev->autoneg == AUTONEG_ENABLE) {
> +		while (timeout) {
> +			if (phy_aneg_done(phydev))
> +				break;
> +			mdelay(1000);
> +			timeout--;
> +		};

Extra needless ;

Also.. ouch! This means we end up holding phydev->lock for up to ten
seconds, which prevents anything else happening with phylib during
that time. Not sure I can see a good way around that though. Andrew?
Andrew Lunn Jan. 3, 2024, 1:27 a.m. UTC | #3
On Tue, Jan 02, 2024 at 06:45:17PM +0000, Russell King (Oracle) wrote:
> On Tue, Dec 26, 2023 at 09:19:03AM -0500, Asmaa Mnebhi wrote:
> > +	/* KSZ9031's autonegotiation takes normally 4-5 seconds to complete.
> > +	 * Occasionally it fails to complete autonegotiation. The workaround is
> > +	 * to restart it.
> > +	 */
> > +        if (phydev->autoneg == AUTONEG_ENABLE) {
> > +		while (timeout) {
> > +			if (phy_aneg_done(phydev))
> > +				break;
> > +			mdelay(1000);
> > +			timeout--;
> > +		};
> 
> Extra needless ;
> 
> Also.. ouch! This means we end up holding phydev->lock for up to ten
> seconds, which prevents anything else happening with phylib during
> that time. Not sure I can see a good way around that though. Andrew?

Is there any status registers which indicate energy detection? No
point doing retries if there is no sign of a link partner.

I would also suggest moving the timeout into a driver private data
structure, and rely on phylib polling the PHY once per second and
restart autoneg from that. That will avoid holding the lock for a long
time.

	Andrew
Asmaa Mnebhi Jan. 19, 2024, 6:11 p.m. UTC | #4
> Is there any status registers which indicate energy detection? No point doing
> retries if there is no sign of a link partner.
> 
> I would also suggest moving the timeout into a driver private data structure,
> and rely on phylib polling the PHY once per second and restart autoneg from
> that. That will avoid holding the lock for a long time.
> 
Hi Andrew, 

Thank you for your feedback.

There is no status register indicating energy detection on the KSZ9031 PHY. This issue reproduces during a 2000 reboot test of the BlueField chip. The link partner is always up during the test and provides network to other entities such as servers and BMC.

We use this PHY driver with the mlxbf-gige driver. The PHY irq is used rather than polling in phy_connect_direct. if we don't want to make changes to the micrel.c driver, we could move this logic to mlxbf_gige_open() function right after calling phy_start()? This would ensure that autonegotiation is completed. Please let me know what you think.

Thanks.
Asmaa
Andrew Lunn Jan. 19, 2024, 7:44 p.m. UTC | #5
On Fri, Jan 19, 2024 at 06:11:56PM +0000, Asmaa Mnebhi wrote:
>  
> > Is there any status registers which indicate energy detection? No point doing
> > retries if there is no sign of a link partner.
> > 
> > I would also suggest moving the timeout into a driver private data structure,
> > and rely on phylib polling the PHY once per second and restart autoneg from
> > that. That will avoid holding the lock for a long time.
> > 
> Hi Andrew, 
> 
> Thank you for your feedback.

Lets try to figure out some more about the situation when it fails to
link up.

What is the value of BMSR when it fails to report complete? You say
you are using interrupts, so i just want to make sure its not an
interrupt problem, you are using edge interrupts instead of level,
etc.  Maybe i'm remembering wrong, but i though i made a comment about
this once when reviewing one of your drivers. What about the contents
of registers 0x1b and 0x1f?

Thanks
   Andrew
Andrew Lunn Jan. 19, 2024, 7:56 p.m. UTC | #6
> What is the value of BMSR when it fails to report complete? You say
> you are using interrupts, so i just want to make sure its not an
> interrupt problem, you are using edge interrupts instead of level,
> etc.

Please could you check that /proc/interrupts do show level interrupts
are being used.

       Andrew
Asmaa Mnebhi Jan. 19, 2024, 8:18 p.m. UTC | #7
> > >
> > Hi Andrew,
> >
> > Thank you for your feedback.
> 
> Lets try to figure out some more about the situation when it fails to link up.
> 
> What is the value of BMSR when it fails to report complete? You say you are
> using interrupts, so i just want to make sure its not an interrupt problem, you
> are using edge interrupts instead of level, etc.  Maybe i'm remembering wrong,
> but i though i made a comment about this once when reviewing one of your
> drivers. What about the contents of registers 0x1b and 0x1f?
> 
Yes I dumped all PHY registers and didn't see anything suspicious besides the autonegotiation not completing:
root@localhost:~/phytool# ./phytool/phytool read oob_net0/0x3/0x0
0x1140

root@localhost:~/phytool# ./phytool/phytool read oob_net0/0x3/0x1
0x7949 //aneg didnt complete and link failed

root@localhost:~/phytool# ./phytool/phytool read oob_net0/0x3/0x9
0x0200 // correct advertisement from PHY

root@localhost:~/phytool# ./phytool/phytool read oob_net0/0x3/0xA
0000 // nothing detected on link partner 

root@localhost:~/phytool# ./phytool/phytool read oob_net0/0x3/0xF
0x3000 // correct ability advertised

root@localhost:~/phytool# ./phytool/phytool read oob_net0/0x3/0x15
0000 // no errors

root@localhost:~/phytool# ./phytool/phytool read oob_net0/0x3/0x1B
0x0500 // no pending interrupts

root@localhost:~/phytool# ./phytool/phytool read oob_net0/0x3/0x1F
0x0300 


I also added the following debug prints. Please see comment next to them if they were printed or not during the reproduction.

--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
static int ksz9031_read_status(struct phy_device *phydev)
        int regval; 
        err = genphy_read_status(phydev);
+        if (err) {
+            printk("ksz9031 genphy_read_status failed"); //not printed
....
          regval = phy_read(phydev, MII_STAT1000);
+        if ((regval & 0xFF) == 0xFF) {
+          printk("ksz9031 err"); //not printed


--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
      void phy_state_machine(struct work_struct *work)
        if (needs_aneg) {
+        printk("needs_aneg"); //printed
          err = phy_start_aneg(phydev);


--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1718,6 +1718,8 @@ int __genphy_config_aneg(struct phy_device *phydev, bool changed)
                        changed = true; /* do restart aneg */
        }
 
+       printk("restart = %d", restart); // prints "restart = 1" so autonegotiation is restarted by the line below.

        /* Only restart aneg if we are advertising something different
         * than we were before.
   

The above prints proved that the micrel PHY started autonegotiation but the result is that it failed to complete it. I also noticed that the KSZ9031 PHY takes ~5 full seconds to complete aneg which is much longer than other PHYs like VSC8221 (which we use with BlueField-3 systems).

Regarding this comment in your next email:

"Please could you check that /proc/interrupts do show level interrupts are being used."
I don't think the problem is the interrupt. The interrupt for link up is issued only when autonegotiation is completed. If autonegotiation is not completed the link just stays down as indicated in PHY register 1, and no interrupt is issued.

Thanks.
Asmaa
Andrew Lunn Jan. 19, 2024, 10:38 p.m. UTC | #8
> The above prints proved that the micrel PHY started autonegotiation
> but the result is that it failed to complete it. I also noticed that
> the KSZ9031 PHY takes ~5 full seconds to complete aneg which is much
> longer than other PHYs like VSC8221 (which we use with BlueField-3
> systems).

What is the link partner? From the datasheet

MMD Address 1h, Register 5Ah – 1000BASE-T Link-Up Time Control

When the link partner is another KSZ9031 device,
the 1000BASE-T link-up time can be long. These
three bits provide an optional setting to reduce the
1000BASE-T link-up time.
100 = Default power-up setting
011 = Optional setting to reduce link-up time when
the link partner is a KSZ9031 device.

Might be worth setting it and see what happens.

Have you tried playing with the prefer master/prefer slave options? If
you have identical PHYs on each end, it could be they are generating
the same 'random' number used to determine who should be master and
who should be slave. If they both pick the same number, they are
supposed to pick a different random number and try again. There have
been some PHYs which are broken in this respect. prefer master/prefer
slave should influence the random number, biasing it higher/lower.

auto-neg should typically take a little over 1 second. 5 seconds is
way too long, something is not correct. You might want to sniff the
fast link pulses, try to decode the values and see what is going on.

I would not be surprised if you find out this 5 second complete time
is somehow related to it not completing at all.

	Andrew
Asmaa Mnebhi Jan. 19, 2024, 10:57 p.m. UTC | #9
> 
> > The above prints proved that the micrel PHY started autonegotiation
> > but the result is that it failed to complete it. I also noticed that
> > the KSZ9031 PHY takes ~5 full seconds to complete aneg which is much
> > longer than other PHYs like VSC8221 (which we use with BlueField-3
> > systems).
> 
> What is the link partner? From the datasheet
> 
> MMD Address 1h, Register 5Ah – 1000BASE-T Link-Up Time Control
> 
> When the link partner is another KSZ9031 device, the 1000BASE-T link-up time
> can be long. These three bits provide an optional setting to reduce the
> 1000BASE-T link-up time.
> 100 = Default power-up setting
> 011 = Optional setting to reduce link-up time when the link partner is a KSZ9031
> device.
> 
> Might be worth setting it and see what happens.
> 
> Have you tried playing with the prefer master/prefer slave options? If you have
> identical PHYs on each end, it could be they are generating the same 'random'
> number used to determine who should be master and who should be slave. If
> they both pick the same number, they are supposed to pick a different random
> number and try again. There have been some PHYs which are broken in this
> respect. prefer master/prefer slave should influence the random number,
> biasing it higher/lower.
> 
> auto-neg should typically take a little over 1 second. 5 seconds is way too long,
> something is not correct. You might want to sniff the fast link pulses, try to
> decode the values and see what is going on.
> 
> I would not be surprised if you find out this 5 second complete time is somehow
> related to it not completing at all.
>

The link partner is a switch (KSZ9893R) so I am not sure setting the 5Ah register to 011 would help.
Florian Fainelli Jan. 19, 2024, 11:14 p.m. UTC | #10
On 12/26/23 06:19, Asmaa Mnebhi wrote:
> Very rarely, the KSZ9031 fails to complete autonegotiation although it was
> initiated via phy_start(). As a result, the link stays down. Restarting
> autonegotiation when in this state solves the issue.
> 
> Signed-off-by: Asmaa Mnebhi <asmaa@nvidia.com>

We used to have a link_timeout as well as the PHY_HAS_MAGICANEG logic in 
the PHY library a long time ago:

https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git/commit/?id=00db8189d984d6c51226dafbbe4a667ce9b7d5da

maybe you can schedule a work queue in case you are using interrupts to 
re-check the link status periodically?
Asmaa Mnebhi Jan. 24, 2024, 10:18 p.m. UTC | #11
> > What is the link partner? From the datasheet
> >
> > MMD Address 1h, Register 5Ah – 1000BASE-T Link-Up Time Control
> >
> > When the link partner is another KSZ9031 device, the 1000BASE-T
> > link-up time can be long. These three bits provide an optional setting
> > to reduce the 1000BASE-T link-up time.
> > 100 = Default power-up setting
> > 011 = Optional setting to reduce link-up time when the link partner is
> > a KSZ9031 device.
> >
> > Might be worth setting it and see what happens.
> >
> > Have you tried playing with the prefer master/prefer slave options? If
> > you have identical PHYs on each end, it could be they are generating the same
> 'random'
> > number used to determine who should be master and who should be slave.
> > If they both pick the same number, they are supposed to pick a
> > different random number and try again. There have been some PHYs which
> > are broken in this respect. prefer master/prefer slave should
> > influence the random number, biasing it higher/lower.
> >
> > auto-neg should typically take a little over 1 second. 5 seconds is
> > way too long, something is not correct. You might want to sniff the
> > fast link pulses, try to decode the values and see what is going on.
> >
> > I would not be surprised if you find out this 5 second complete time
> > is somehow related to it not completing at all.
> >
> 
> The link partner is a switch (KSZ9893R) so I am not sure setting the 5Ah register
> to 011 would help.

I set the 5Ah register to 011 and that didn’t help. I also am consulting the vendor and the hardware team regarding why autonegotiation takes so long with the KSZ9031. Will report back when they get back to me.
diff mbox series

Patch

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index 08e3915001c3..de8140c5907f 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -1475,6 +1475,7 @@  static int ksz9031_get_features(struct phy_device *phydev)
 
 static int ksz9031_read_status(struct phy_device *phydev)
 {
+	u8 timeout = 10;
 	int err;
 	int regval;
 
@@ -1494,6 +1495,22 @@  static int ksz9031_read_status(struct phy_device *phydev)
 		return genphy_config_aneg(phydev);
 	}
 
+	/* KSZ9031's autonegotiation takes normally 4-5 seconds to complete.
+	 * Occasionally it fails to complete autonegotiation. The workaround is
+	 * to restart it.
+	 */
+        if (phydev->autoneg == AUTONEG_ENABLE) {
+		while (timeout) {
+			if (phy_aneg_done(phydev))
+				break;
+			mdelay(1000);
+			timeout--;
+		};
+
+		if (timeout == 0)
+			phy_restart_aneg(phydev);
+	}
+
 	return 0;
 }