diff mbox series

[net,v4] smsc911x: only update stats when interface is up

Message ID 20230329064010.24657-1-wsa+renesas@sang-engineering.com (mailing list archive)
State Under Review
Delegated to: Geert Uytterhoeven
Headers show
Series [net,v4] smsc911x: only update stats when interface is up | expand

Commit Message

Wolfram Sang March 29, 2023, 6:40 a.m. UTC
Otherwise the clocks are not enabled and reading registers will OOPS.
Copy the behaviour from Renesas SH_ETH and use a custom flag because
using netif_running() is racy. A generic solution still needs to be
implemented. Tested on a Renesas APE6-EK.

Fixes: 1e30b8d755b8 ("net: smsc911x: Make Runtime PM handling more fine-grained")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---

Changes since v3:
* broken out of a patch series
* don't use netif_running() but a custom flag

 drivers/net/ethernet/smsc/smsc911x.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

Comments

Steen Hegelund March 29, 2023, 7:41 a.m. UTC | #1
Hi Wolfram,

On Wed Mar 29, 2023 at 8:40 AM CEST, Wolfram Sang wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> Otherwise the clocks are not enabled and reading registers will OOPS.
> Copy the behaviour from Renesas SH_ETH and use a custom flag because
> using netif_running() is racy. A generic solution still needs to be
> implemented. Tested on a Renesas APE6-EK.
>
> Fixes: 1e30b8d755b8 ("net: smsc911x: Make Runtime PM handling more fine-grained")
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
> ---
>
> Changes since v3:
> * broken out of a patch series
> * don't use netif_running() but a custom flag
>
>  drivers/net/ethernet/smsc/smsc911x.c | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
> index a690d139e177..af96986cbc88 100644
> --- a/drivers/net/ethernet/smsc/smsc911x.c
> +++ b/drivers/net/ethernet/smsc/smsc911x.c
> @@ -140,6 +140,8 @@ struct smsc911x_data {
>
>         /* clock */
>         struct clk *clk;
> +
> +       bool is_open;
>  };
>
>  /* Easy access to information */
> @@ -1738,6 +1740,8 @@ static int smsc911x_open(struct net_device *dev)
>         smsc911x_reg_write(pdata, TX_CFG, TX_CFG_TX_ON_);
>
>         netif_start_queue(dev);
> +       pdata->is_open = true;
> +
>         return 0;
>
>  irq_stop_out:
> @@ -1778,6 +1782,8 @@ static int smsc911x_stop(struct net_device *dev)
>                 dev->phydev = NULL;
>         }
>         netif_carrier_off(dev);
> +       pdata->is_open = false;
> +
>         pm_runtime_put(dev->dev.parent);
>
>         SMSC_TRACE(pdata, ifdown, "Interface stopped");
> @@ -1841,8 +1847,12 @@ smsc911x_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  static struct net_device_stats *smsc911x_get_stats(struct net_device *dev)
>  {
>         struct smsc911x_data *pdata = netdev_priv(dev);
> -       smsc911x_tx_update_txcounters(dev);
> -       dev->stats.rx_dropped += smsc911x_reg_read(pdata, RX_DROP);
> +
> +       if (pdata->is_open) {

Couldn't you just use netif_carrier_ok() here and drop the is_open
variable?

> +               smsc911x_tx_update_txcounters(dev);
> +               dev->stats.rx_dropped += smsc911x_reg_read(pdata, RX_DROP);
> +       }
> +
>         return &dev->stats;
>  }
>
> --
> 2.30.2

BR
Steen
Jakub Kicinski March 29, 2023, 7:39 p.m. UTC | #2
On Wed, 29 Mar 2023 08:40:10 +0200 Wolfram Sang wrote:
> Otherwise the clocks are not enabled and reading registers will OOPS.
> Copy the behaviour from Renesas SH_ETH and use a custom flag because
> using netif_running() is racy. A generic solution still needs to be
> implemented. Tested on a Renesas APE6-EK.

Hm, so you opted to not add the flag in the core?
To keep the backport small? I think we should just add it..
Clearly multiple drivers would benefit and it's not a huge change.
Wolfram Sang March 29, 2023, 7:48 p.m. UTC | #3
On Wed, Mar 29, 2023 at 12:39:58PM -0700, Jakub Kicinski wrote:
> On Wed, 29 Mar 2023 08:40:10 +0200 Wolfram Sang wrote:
> > Otherwise the clocks are not enabled and reading registers will OOPS.
> > Copy the behaviour from Renesas SH_ETH and use a custom flag because
> > using netif_running() is racy. A generic solution still needs to be
> > implemented. Tested on a Renesas APE6-EK.
> 
> Hm, so you opted to not add the flag in the core?
> To keep the backport small? I think we should just add it..
> Clearly multiple drivers would benefit and it's not a huge change.

I did it this way for two reasons. First, yes, this is a minimal patch
for backporting. No dependency on core changes, very easy. Second, this
is a solution I could develop quickly. I am interested in finding
another solution, but I guess it needs more time, especially as it
probably touches 15 drivers. I created an action item for it. I hope
I'll be able to work on it somewhen. But for now, I just need the SMSC
bug fixed and need to move to the next issue. If we later have the
generic solution, converting this driver also won't make a lot of a
difference.
Jakub Kicinski March 29, 2023, 8:17 p.m. UTC | #4
On Wed, 29 Mar 2023 21:48:55 +0200 Wolfram Sang wrote:
> > Hm, so you opted to not add the flag in the core?
> > To keep the backport small? I think we should just add it..
> > Clearly multiple drivers would benefit and it's not a huge change.  
> 
> I did it this way for two reasons. First, yes, this is a minimal patch
> for backporting. No dependency on core changes, very easy. Second, this
> is a solution I could develop quickly. I am interested in finding
> another solution, but I guess it needs more time, especially as it
> probably touches 15 drivers. I created an action item for it. I hope
> I'll be able to work on it somewhen. But for now, I just need the SMSC
> bug fixed and need to move to the next issue. If we later have the
> generic solution, converting this driver also won't make a lot of a
> difference.

Okay, core changes aside - does pm_runtime_put() imply an RCU sync?
Otherwise your check in get_stats is racy...
Wolfram Sang March 31, 2023, 6:02 a.m. UTC | #5
> > +       if (pdata->is_open) {
> 
> Couldn't you just use netif_carrier_ok() here and drop the is_open
> variable?

From my research, I can't:

1) netif_carrier_ok() uses __LINK_STATE_NOCARRIER
2) __LINK_STATE_NOCARRIER gets cleared in netif_carrier_on()
3) netif_carrier_on() is this code:

	if (test_and_clear_bit(__LINK_STATE_NOCARRIER, &dev->state)) {
		if (dev->reg_state == NETREG_UNINITIALIZED)
			return;
		atomic_inc(&dev->carrier_up_count);
		linkwatch_fire_event(dev);
		if (netif_running(dev))
			__netdev_watchdog_up(dev);
	}

4) Notice the last if. It checks netif_running(). So, it is possible to
have the carrier on and the device not opened yet.
5) Sadly, no cigar. If I didn't miss something...

But thanks for the suggestion! Happy hacking,

   Wolfram
Wolfram Sang March 31, 2023, 6:33 a.m. UTC | #6
> Okay, core changes aside - does pm_runtime_put() imply an RCU sync?
> Otherwise your check in get_stats is racy...

From some light research, I can't find a trace of RCU sync. Pity, I'll
need to move furhter investigation to later. I'll report back if I have
something, but it may take a while. Thanks for the help!
Jakub Kicinski March 31, 2023, 6:50 a.m. UTC | #7
On Fri, 31 Mar 2023 08:33:24 +0200 Wolfram Sang wrote:
> > Okay, core changes aside - does pm_runtime_put() imply an RCU sync?
> > Otherwise your check in get_stats is racy...  
> 
> From some light research, I can't find a trace of RCU sync. Pity, I'll
> need to move furhter investigation to later. I'll report back if I have
> something, but it may take a while. Thanks for the help!

If you don't want to spend too much time you can just call it yourself
right? :) Put a synchronize_rcu() with a comment about concurrent stat
reading after pdata->is_open = false; and that's it?
Wolfram Sang March 31, 2023, 9:53 a.m. UTC | #8
> > From some light research, I can't find a trace of RCU sync. Pity, I'll
> > need to move furhter investigation to later. I'll report back if I have
> > something, but it may take a while. Thanks for the help!
> 
> If you don't want to spend too much time you can just call it yourself
> right? :) Put a synchronize_rcu() with a comment about concurrent stat
> reading after pdata->is_open = false; and that's it?

Probably. Although I trust you, I would still add code which I a) don't
fully understand myself yet and b) feels somehow wrong because something
like synchronize_rcu() should likely not be in drivers but somewhere
more generic. Given that the bug in this driver went unnoticed for years
(same with the race in other drivers), I do prefer to come back later
with the good feeling that I understood the problem and fixed it
thoroughly.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
index a690d139e177..af96986cbc88 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -140,6 +140,8 @@  struct smsc911x_data {
 
 	/* clock */
 	struct clk *clk;
+
+	bool is_open;
 };
 
 /* Easy access to information */
@@ -1738,6 +1740,8 @@  static int smsc911x_open(struct net_device *dev)
 	smsc911x_reg_write(pdata, TX_CFG, TX_CFG_TX_ON_);
 
 	netif_start_queue(dev);
+	pdata->is_open = true;
+
 	return 0;
 
 irq_stop_out:
@@ -1778,6 +1782,8 @@  static int smsc911x_stop(struct net_device *dev)
 		dev->phydev = NULL;
 	}
 	netif_carrier_off(dev);
+	pdata->is_open = false;
+
 	pm_runtime_put(dev->dev.parent);
 
 	SMSC_TRACE(pdata, ifdown, "Interface stopped");
@@ -1841,8 +1847,12 @@  smsc911x_hard_start_xmit(struct sk_buff *skb, struct net_device *dev)
 static struct net_device_stats *smsc911x_get_stats(struct net_device *dev)
 {
 	struct smsc911x_data *pdata = netdev_priv(dev);
-	smsc911x_tx_update_txcounters(dev);
-	dev->stats.rx_dropped += smsc911x_reg_read(pdata, RX_DROP);
+
+	if (pdata->is_open) {
+		smsc911x_tx_update_txcounters(dev);
+		dev->stats.rx_dropped += smsc911x_reg_read(pdata, RX_DROP);
+	}
+
 	return &dev->stats;
 }