diff mbox series

usbnet_link_change() fails to call netif_carrier_on()

Message ID m34j43gwto.fsf@t19.piap.pl (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series usbnet_link_change() fails to call netif_carrier_on() | expand

Checks

Context Check Description
netdev/series_format warning Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 3 this patch: 3
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 8 of 8 maintainers
netdev/build_clang success Errors and warnings before: 3 this patch: 3
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 4 this patch: 4
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 24 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-11-19--18-00 (tests: 789)

Commit Message

Krzysztof Hałasa Nov. 19, 2024, 1:46 p.m. UTC
Hi,

ASIX AX88772B based USB 10/100 Ethernet adapter doesn't come
up ("carrier off"), despite the built-in 100BASE-FX PHY positive link
indication.

The problem appears to be in usbnet.c framework:

void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
{
	/* update link after link is reseted */
	if (link && !need_reset)
		netif_carrier_on(dev->net);
	else
		netif_carrier_off(dev->net);

	if (need_reset && link)
		usbnet_defer_kevent(dev, EVENT_LINK_RESET);
	else
		usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
}

I think the author's idea was a bit different than what the code really
ended doing. Especially when called with link = 1 and need_reset = 1.

It seems it may have already caused problems - possible workarounds:
- commit 7be4cb7189f7 ("net: usb: ax88179_178a: improve reset check")
- commit ecf848eb934b ("net: usb: ax88179_178a: fix link status when
  link is set to down/up")
Can't check those due to -ENOHW but ax88179_link_reset() adds
a netif_carrier_on() call on link ups.

Not sure about the "reset" name, though (and the comment in
usbnet_link_change()). It seems it's when the link goes up.

Signed-off-by: Krzysztof Hałasa <khalasa@piap.pl>
Fixes: ac64995da872 ("usbnet: introduce usbnet_link_change API")
Fixes: 4b49f58fff00 ("usbnet: handle link change")


The code has been introduced in 2013, by 4b49f58fff00 and a bunch of
related commits. Perhaps it visibly affects only AXIS and dm9601
adapters, though.

Comments

Andrew Lunn Nov. 19, 2024, 4:20 p.m. UTC | #1
On Tue, Nov 19, 2024 at 02:46:59PM +0100, Krzysztof Hałasa wrote:
> Hi,
> 
> ASIX AX88772B based USB 10/100 Ethernet adapter doesn't come
> up ("carrier off"), despite the built-in 100BASE-FX PHY positive link
> indication.
> 
> The problem appears to be in usbnet.c framework:
> 
> void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
> {
> 	/* update link after link is reseted */
> 	if (link && !need_reset)
> 		netif_carrier_on(dev->net);
> 	else
> 		netif_carrier_off(dev->net);
> 
> 	if (need_reset && link)
> 		usbnet_defer_kevent(dev, EVENT_LINK_RESET);
> 	else
> 		usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
> }

static int ax88772_phylink_setup(struct usbnet *dev)
{
        struct asix_common_private *priv = dev->driver_priv;
        phy_interface_t phy_if_mode;
        struct phylink *phylink;

        priv->phylink_config.dev = &dev->net->dev;
        priv->phylink_config.type = PHYLINK_NETDEV;
        priv->phylink_config.mac_capabilities = MAC_SYM_PAUSE | MAC_ASYM_PAUSE |
                MAC_10 | MAC_100;

etc.

This device is using phylink to manage the PHY. phylink will than
manage the carrier. It assumes it is solely responsible for the
carrier. So i think your fix is wrong. You probably should be removing
all code in this driver which touches the carrier.

	Andrew
Krzysztof Hałasa Nov. 21, 2024, 6:51 a.m. UTC | #2
Hi Andrew,
thanks for a looking at this.

Andrew Lunn <andrew@lunn.ch> writes:

>> void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
>> {
>>       /* update link after link is reseted */
>>       if (link && !need_reset)
>>               netif_carrier_on(dev->net);
>>       else
>>               netif_carrier_off(dev->net);
>>
>>       if (need_reset && link)
>>               usbnet_defer_kevent(dev, EVENT_LINK_RESET);
>>       else
>>               usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
>> }
>
> This device is using phylink to manage the PHY. phylink will than
> manage the carrier. It assumes it is solely responsible for the
> carrier. So i think your fix is wrong. You probably should be removing
> all code in this driver which touches the carrier.

Ok, I wasn't aware that phylink manages netdev's carrier state.

Then, is the patch wrong just because the asix driver shouldn't use the
function, or is it wrong because the function should work differently
(i.e., the semantics are different)?

Surely the function is broken, isn't it? Calling netif_carrier_off()
on link up event can't be right?


Now the ASIX driver, I'm looking at it for some time now. It consists
of two parts linked together. The ax88172a.c part doesn't use phylink,
while the main asix_devices.c does. So I'm leaving ax88172a.c alone for
now (while it could probably be better ported to the same framework,
i.e., phylink).

The main part uses usbnet.c, which does netif_carrier_{on,off}() in the
above usbnet_link_change(). I guess I can make it use directly
usbnet_defer_kevent() only so it won't be a problem.

Also, usbnet.c calls usbnet_defer_kevent() and thus netif_carrier_off()
in usbnet_probe, removing FLAG_LINK_INTR from asix_devices.c will stop
that.

The last place interacting with carrier status is asix_status(), called
about 8 times a second by usbnet.c intr_complete(). This is independent
of any MDIO traffic. Should I now remove this as well? I guess removing
asix_status would suffice.
Krzysztof Hałasa Nov. 21, 2024, 1:22 p.m. UTC | #3
Hi,

I'm still trying to understand how is it all (phy + phylink) supposed to
work. My adapter uses fixed PHY mode (uses a special SFP module and the
ASIX AX88772B internal PHY, configured by internal SROM memory).

This is not a fixed *MII connection, though. This is a regular clause 22
transceiver, a part of the ASIX MAC IC.

The MDIO registers are initialized (on power-up) to (BMCR) 0x2100 and
(BMSR) 0x780D, meaning autonegotiation is supported but disabled,
100 Mbps FD is selected. Link is established.

Ethtool shows the following:
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
the above is generally true, but:
        Speed: Unknown!
        Duplex: Unknown! (255)
        Auto-negotiation: on <<<<<<<<<<<<<<<<
        Port: Twisted Pair
        PHYAD: 10
        Transceiver: internal
        MDI-X: Unknown
        Link detected: no <<<<<<<<<<<<<<<<

Autonegotiation is definitely off.

Perhaps this code is responsible (in phy_probe()):

	if (!linkmode_test_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
			       phydev->supported))
		phydev->autoneg = 0;

Shouldn't it check if the actual PHY BMCR autoneg bit (aka 0.12) isn't
zero, and if it is, set autoneg = 0?


The other part which may be contributing (in genphy_update_link()):

	/* Consider the case that autoneg was started and "aneg complete"
	 * bit has been reset, but "link up" bit not yet.
	 */
	if (phydev->autoneg == AUTONEG_ENABLE && !phydev->autoneg_complete)
		phydev->link = 0;

Since phydev->autoneg apparently means "autoneg is supported", the above
doesn't seem very right.

But I guess phydev->autoneg is supposed to mean "autoneg is actually
enabled".

What do you think?
Andrew Lunn Nov. 21, 2024, 2:25 p.m. UTC | #4
On Thu, Nov 21, 2024 at 07:51:24AM +0100, Krzysztof Hałasa wrote:
> Hi Andrew,
> thanks for a looking at this.
> 
> Andrew Lunn <andrew@lunn.ch> writes:
> 
> >> void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
> >> {
> >>       /* update link after link is reseted */
> >>       if (link && !need_reset)
> >>               netif_carrier_on(dev->net);
> >>       else
> >>               netif_carrier_off(dev->net);
> >>
> >>       if (need_reset && link)
> >>               usbnet_defer_kevent(dev, EVENT_LINK_RESET);
> >>       else
> >>               usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
> >> }
> >
> > This device is using phylink to manage the PHY. phylink will than
> > manage the carrier. It assumes it is solely responsible for the
> > carrier. So i think your fix is wrong. You probably should be removing
> > all code in this driver which touches the carrier.
> 
> Ok, I wasn't aware that phylink manages netdev's carrier state.
> 
> Then, is the patch wrong just because the asix driver shouldn't use the
> function, or is it wrong because the function should work differently
> (i.e., the semantics are different)?
> 
> Surely the function is broken, isn't it? Calling netif_carrier_off()
> on link up event can't be right?
> 
> 
> Now the ASIX driver, I'm looking at it for some time now. It consists
> of two parts linked together. The ax88172a.c part doesn't use phylink,
> while the main asix_devices.c does. So I'm leaving ax88172a.c alone for
> now (while it could probably be better ported to the same framework,
> i.e., phylink).
> 
> The main part uses usbnet.c, which does netif_carrier_{on,off}() in the
> above usbnet_link_change(). I guess I can make it use directly
> usbnet_defer_kevent() only so it won't be a problem.
> 
> Also, usbnet.c calls usbnet_defer_kevent() and thus netif_carrier_off()
> in usbnet_probe, removing FLAG_LINK_INTR from asix_devices.c will stop
> that.
> 
> The last place interacting with carrier status is asix_status(), called
> about 8 times a second by usbnet.c intr_complete(). This is independent
> of any MDIO traffic. Should I now remove this as well? I guess removing
> asix_status would suffice.

I've not looked at this driver in detail, nor usbnet. So i can only
make general comments. I do see there are a number of drivers which
re-invent the wheel and do their own PHY handling, when they should
allow Linux to do it via phylib/phylink.

When the MAC driver is using phylink, or phylib, it should not touch
the carrier, nor access the PHY directly. The exception can be during
probe, when it can turn the carrier off. What the MAC driver should be
doing is exposing its MDIO bus as a Linux MDIO bus. phylib will then
enumerate the bus and find the PHYs on it. The MAC driver which does
not have access to device tree then typically uses phy_find_first() to
find a PHY on the bus, and uses phy_connect() to bind the PHY to the
MAC. The MAC driver then uses phy_start() when the interface is
opened. phylib will poll the PHY for changing in link status, and call
the callback function registered via phy_connect() to let the MAC know
about what the PHY has negotiated. Other than that, the MAC driver
does nothing with the PHY.

It could well be there are historical discrepancies in usbnet, in that
having Linux drive the PHY is somewhat new for usbnet, historically
the wheel was reinvented, and maybe part of that is in the usbnet
core.

	Andrew
diff mbox series

Patch

--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1978,16 +1978,18 @@  EXPORT_SYMBOL(usbnet_manage_power);
 
 void usbnet_link_change(struct usbnet *dev, bool link, bool need_reset)
 {
-	/* update link after link is reseted */
-	if (link && !need_reset)
+	if (link)
 		netif_carrier_on(dev->net);
 	else
 		netif_carrier_off(dev->net);
 
-	if (need_reset && link)
-		usbnet_defer_kevent(dev, EVENT_LINK_RESET);
-	else
-		usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
+	if (need_reset) {
+		/* update link after link is reset */
+		if (link)
+			usbnet_defer_kevent(dev, EVENT_LINK_RESET);
+		else
+			usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
+	}
 }
 EXPORT_SYMBOL(usbnet_link_change);