Message ID | E1u3XG6-000EJg-V8@rmk-PC.armlinux.org.uk (mailing list archive) |
---|---|
State | New |
Series | net: stmmac: fix setting RE and TE inappropriately |

On Sat, 12 Apr 2025 10:34:42 +0100 Russell King (Oracle) wrote:
> Phylink does not permit drivers to mess with the netif carrier, as
> this will de-synchronise phylink with the MAC driver. Moreover,
> setting and clearing the TE and RE bits via stmmac_mac_set() in this
> path is also wrong as the link may not be up.
>
> Replace the netif_carrier_on(), netif_carrier_off() and
> stmmac_mac_set() calls with the appropriate phylink_start() and
> phylink_stop() calls, thereby allowing phylink to manage the netif
> carrier and TE/RE bits through the .mac_link_up() and .mac_link_down()
> methods.
>
> Note that RE should only be set after the DMA is ready to avoid the
> receive FIFO between the MAC and DMA blocks overflowing, so
> phylink_start() needs to be placed after DMA has been started.

IIUC this will cause a link loss when XDP is installed, if not disregard
the rest of the email.

Any idea why it's necessary to mess with the link for XDP changes?
Is there no way to discard all the traffic and let the queues go idle
without dropping the link?

I think we should mention in the commit message that the side effect is
link loss on XDP on / off. I don't know of any other driver which would
need this, stmmac is a real gift..

On Mon, Apr 14, 2025 at 05:43:42PM -0700, Jakub Kicinski wrote:
> On Sat, 12 Apr 2025 10:34:42 +0100 Russell King (Oracle) wrote:
> > Phylink does not permit drivers to mess with the netif carrier, as
> > this will de-synchronise phylink with the MAC driver. Moreover,
> > setting and clearing the TE and RE bits via stmmac_mac_set() in this
> > path is also wrong as the link may not be up.
> >
> > Replace the netif_carrier_on(), netif_carrier_off() and
> > stmmac_mac_set() calls with the appropriate phylink_start() and
> > phylink_stop() calls, thereby allowing phylink to manage the netif
> > carrier and TE/RE bits through the .mac_link_up() and .mac_link_down()
> > methods.
> >
> > Note that RE should only be set after the DMA is ready to avoid the
> > receive FIFO between the MAC and DMA blocks overflowing, so
> > phylink_start() needs to be placed after DMA has been started.
>
> IIUC this will cause a link loss when XDP is installed, if not disregard
> the rest of the email.

It will, because the author who added XDP support to stmmac decided it
was easier to tear everything down and rebuild, which meant (presumably)
that it was necessary to use netif_carrier_off() to stop the net layer
queueing packets to the driver. I'm just guessing - I know nothing
about XDP, and never knowingly used it.

> Any idea why it's necessary to mess with the link for XDP changes?

Depends what you mean by "link".

If you're asking why it messes with netif_carrier_foo(), my best guess
is as above. However, phylink drivers are not allowed to mess with the
netif_carrier state (as the commit message states.) This is not a new
requirement, it's always been this way with phylink, and this pre-dates
the addition of XDP to this driver.

As long as the code requires the netif_carrier to be turned off, the
only way to guarantee that in a phylink using driver is as per this
patch.

I'm guessing that the reason it does this is because it completely
takes down the MAC and tx/rx rings to reprogram everything from
scratch, and thus any interference from a packet coming in to be
transmitted is going to cause problems.

> I think we should mention in the commit message that the side effect is
> link loss on XDP on / off. I don't know of any other driver which would
> need this, stmmac is a real gift..

I'll add that. However, it would be nice to find a different solution
for XDP on this driver.
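
The ownership rule described in this reply can be illustrated with a small user-space model. This is only a sketch: every `mock_*` name is hypothetical and none of it is kernel code. It assumes, as a simplification, that link resolution happens synchronously inside the start/stop calls, whereas real phylink resolves link changes from its own context. The model shows the two properties the patch relies on: TE/RE and the carrier change only through the link-up/link-down callbacks, and RE is not set before the DMA has been started.

```c
/* User-space model only: all mock_* names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>

struct mock_mac {
	bool phy_link_up;      /* physical link state reported by the PHY */
	bool started;          /* between mock_phylink_start() and _stop() */
	bool carrier;          /* what the net layer would see as carrier */
	bool dma_running;      /* Rx/Tx DMA channels started */
	bool te_re;            /* MAC transmitter/receiver enable bits */
	bool rx_overflow_risk; /* RE was enabled before the DMA was ready */
};

/* Stand-in for .mac_link_up(): the only place TE/RE and carrier are set. */
void mock_mac_link_up(struct mock_mac *m)
{
	assert(m->started); /* link-up is only resolved while started */
	if (!m->dma_running)
		m->rx_overflow_risk = true;
	m->te_re = true;
	m->carrier = true;
}

/* Stand-in for .mac_link_down(): the only place they are cleared. */
void mock_mac_link_down(struct mock_mac *m)
{
	m->te_re = false;
	m->carrier = false;
}

/* Simplified resolver: the link is up only if started and the PHY has link. */
void mock_phylink_resolve(struct mock_mac *m)
{
	bool want_up = m->started && m->phy_link_up;

	if (want_up && !m->carrier)
		mock_mac_link_up(m);
	else if (!want_up && m->carrier)
		mock_mac_link_down(m);
}

void mock_phylink_start(struct mock_mac *m)
{
	m->started = true;
	mock_phylink_resolve(m);
}

void mock_phylink_stop(struct mock_mac *m)
{
	m->started = false;
	mock_phylink_resolve(m);
}

/* Ordering modelled on the patch: stop the link before tearing down DMA. */
void mock_xdp_release(struct mock_mac *m)
{
	mock_phylink_stop(m);
	m->dma_running = false;
}

/* Ordering modelled on the patch: start DMA first, then let the link come up. */
void mock_xdp_open(struct mock_mac *m)
{
	m->dma_running = true;
	mock_phylink_start(m);
}
```

In this model, calling `mock_xdp_open()` on a MAC whose PHY has no link leaves both `te_re` and `carrier` clear, which corresponds to the commit message's point that setting TE/RE directly is wrong when the link may not be up.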

On Tue, Apr 15, 2025 at 10:54:44AM +0100, Russell King (Oracle) wrote:
> On Mon, Apr 14, 2025 at 05:43:42PM -0700, Jakub Kicinski wrote:
> > IIUC this will cause a link loss when XDP is installed, if not disregard
> > the rest of the email.
>
> It will, because the author who added XDP support to stmmac decided it
> was easier to tear everything down and rebuild, which meant (presumably)
> that it was necessary to use netif_carrier_off() to stop the net layer
> queueing packets to the driver. I'm just guessing - I know nothing
> about XDP, and never knowingly used it.
>
> > Any idea why it's necessary to mess with the link for XDP changes?
>
> Depends what you mean by "link".
>
> If you're asking why it messes with netif_carrier_foo(), my best guess
> is as above. However, phylink drivers are not allowed to mess with the
> netif_carrier state (as the commit message states.) This is not a new
> requirement, it's always been this way with phylink, and this pre-dates
> the addition of XDP to this driver.
>
> As long as the code requires the netif_carrier to be turned off, the
> only way to guarantee that in a phylink using driver is as per this
> patch.
>
> I'm guessing that the reason it does this is because it completely
> takes down the MAC and tx/rx rings to reprogram everything from
> scratch, and thus any interference from a packet coming in to be
> transmitted is going to cause problems.

I'd like the "what do you mean by link" clarified before I update the
commit message.

If you're referring to the carrier state via netif_carrier_off() /
netif_carrier_on(), then nothing actually changes in that respect
because the carrier manipulation is being done by the driver today,
behind phylink's back. That changes to inside phylink with phylink's
knowledge.

It is my understanding that netif_carrier_off() / netif_carrier_on()
get notified to userspace, so this is visible today when XDP changes.

If you are referring to the messages that appear on the kernel console,
then yes, phylink will print those in addition, which actually makes it
more consistent with what's being reported to userspace.

Depending which you are referring to changes what I should say in the
commit message. E.g.

"We retain the changes to carrier state, which are already being
reported to userspace as link loss/link gain events, but we gain
kernel messages reporting the link state."

if you're referring to the carrier state. Or maybe:

"This change will have the side effect of printing link messages to
the kernel log, even though the physical link hasn't changed state.
This matches the carrier state."

if you're referring to the additional kernel messages.
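
One way to check from userspace whether carrier transitions are visible is to sample the interface's sysfs attributes. The sketch below is a minimal helper, not part of the patch: `read_net_attr` is a hypothetical name, and it relies on the `carrier` and `carrier_changes` attributes under `/sys/class/net/<ifname>/`. Whether the counter actually advances across an XDP attach on stmmac is the expectation stated in this thread, not something the helper asserts.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read an integer attribute from /sys/class/net/<ifname>/<attr>.
 * Returns 0 on success and stores the value in *out; returns -1 if the
 * path is too long or the file cannot be opened or parsed. Reading
 * "carrier" fails while the interface is administratively down, which
 * also ends up as -1 here.
 */
int read_net_attr(const char *ifname, const char *attr, long *out)
{
	char path[256];
	FILE *f;
	int n;

	assert(ifname && attr && out);

	n = snprintf(path, sizeof(path), "/sys/class/net/%s/%s", ifname, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	n = fscanf(f, "%ld", out);
	fclose(f);

	return n == 1 ? 0 : -1;
}
```

Sampling `carrier_changes` with this helper before and after installing an XDP program, and comparing the two values, would indicate whether the carrier toggled during the reconfiguration.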

On Wed, 16 Apr 2025 19:03:19 +0100 Russell King (Oracle) wrote:
> "This change will have the side effect of printing link messages to
> the kernel log, even though the physical link hasn't changed state.
> This matches the carrier state."

So I did misunderstand. I thought we lose physical link.
This paragraph looks good, then, it'd correct my guess.

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 59d07d0d3369..24eaabd1445e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -6922,6 +6922,8 @@ void stmmac_xdp_release(struct net_device *dev)
 	/* Ensure tx function is not running */
 	netif_tx_disable(dev);
 
+	phylink_stop(priv->phylink);
+
 	/* Disable NAPI process */
 	stmmac_disable_all_queues(priv);
 
@@ -6937,14 +6939,10 @@ void stmmac_xdp_release(struct net_device *dev)
 	/* Release and free the Rx/Tx resources */
 	free_dma_desc_resources(priv, &priv->dma_conf);
 
-	/* Disable the MAC Rx/Tx */
-	stmmac_mac_set(priv, priv->ioaddr, false);
-
 	/* set trans_start so we don't get spurious
 	 * watchdogs during reset
 	 */
 	netif_trans_update(dev);
-	netif_carrier_off(dev);
 }
 
 int stmmac_xdp_open(struct net_device *dev)
@@ -7026,25 +7024,25 @@ int stmmac_xdp_open(struct net_device *dev)
 		hrtimer_setup(&tx_q->txtimer, stmmac_tx_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 	}
 
-	/* Enable the MAC Rx/Tx */
-	stmmac_mac_set(priv, priv->ioaddr, true);
-
 	/* Start Rx & Tx DMA Channels */
 	stmmac_start_all_dma(priv);
 
+	phylink_start(priv->phylink);
+
 	ret = stmmac_request_irq(dev);
 	if (ret)
 		goto irq_error;
 
 	/* Enable NAPI process*/
 	stmmac_enable_all_queues(priv);
-	netif_carrier_on(dev);
 	netif_tx_start_all_queues(dev);
 	stmmac_enable_all_dma_irq(priv);
 
 	return 0;
 
 irq_error:
+	phylink_stop(priv->phylink);
+
 	for (chan = 0; chan < priv->plat->tx_queues_to_use; chan++)
 		hrtimer_cancel(&priv->dma_conf.tx_queue[chan].txtimer);
 