Message ID | 20230911082016.3694700-1-yajun.deng@linux.dev (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | net/core: Export dev_core_stats_rx_dropped_inc sets |
On Mon, 11 Sep 2023 16:20:16 +0800
Yajun Deng <yajun.deng@linux.dev> wrote:

> Although there is a kfree_skb_reason() helper function that can be
> used to find the reason for dropped packets, but most callers didn't
> increase one of rx_dropped, tx_dropped, rx_nohandler and
> rx_otherhost_dropped.
>
> For the users, people are more concerned about why the dropped in
> ifconfig is increasing. So we can export
> dev_core_stats_rx_dropped_inc sets, which users would trace them know
> why rx_dropped is increasing.

ifconfig has been frozen for over 10 years and is deprecated, so there
is no point in catering to legacy APIs. There are better APIs, such as
ethtool and netlink, that can provide more info.
September 12, 2023 at 12:15 AM, "Stephen Hemminger" <stephen@networkplumber.org> wrote:

> On Mon, 11 Sep 2023 16:20:16 +0800
> Yajun Deng <yajun.deng@linux.dev> wrote:
>
> > Although there is a kfree_skb_reason() helper function that can be
> > used to find the reason for dropped packets, but most callers didn't
> > increase one of rx_dropped, tx_dropped, rx_nohandler and
> > rx_otherhost_dropped.
> >
> > For the users, people are more concerned about why the dropped in
> > ifconfig is increasing. So we can export
> > dev_core_stats_rx_dropped_inc sets, which users would trace them know
> > why rx_dropped is increasing.
>
> ifconfig has been frozen for over 10 years, and is deprecated so there
> is no point in catering to legacy api's. There are better API's such as
> ethtool and netlink that can provide more info.

Yes, ifconfig is deprecated, but the dropped counter shown by ifconfig
and ip is the same. We're more concerned about the reason for dropped
packets; ip, ethtool and netlink can't show the reason.
On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
>
> Although there is a kfree_skb_reason() helper function that can be used
> to find the reason for dropped packets, but most callers didn't increase
> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.
>
> For the users, people are more concerned about why the dropped in ifconfig
> is increasing. So we can export dev_core_stats_rx_dropped_inc sets,
> which users would trace them know why rx_dropped is increasing.
>
> Export dev_core_stats_{rx_dropped, tx_dropped, rx_nohandler,
> rx_otherhost_dropped}_inc for trace. Also, move dev_core_stats()
> and netdev_core_stats_alloc() in dev.c, because they are not called
> externally.
>
> Signed-off-by: Yajun Deng <yajun.deng@linux.dev>

Okay, but it seems you forgot to say which tree was targeted by this patch.

Documentation/process/maintainer-netdev.rst

I would guess net-next, but patch authors are supposed to be explicit.

> ---
>  include/linux/netdevice.h | 32 +++++---------------------------
>  net/core/dev.c            | 30 ++++++++++++++++++++++++++++--
>  2 files changed, 33 insertions(+), 29 deletions(-)
>
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 0896aaa91dd7..879b01c85ba4 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -3954,6 +3954,11 @@ int dev_forward_skb_nomtu(struct net_device *dev, struct sk_buff *skb);
>  bool is_skb_forwardable(const struct net_device *dev,
>  			const struct sk_buff *skb);
>
> +void dev_core_stats_rx_dropped_inc(struct net_device *dev);
> +void dev_core_stats_tx_dropped_inc(struct net_device *dev);
> +void dev_core_stats_rx_nohandler_inc(struct net_device *dev);
> +void dev_core_stats_rx_otherhost_dropped_inc(struct net_device *dev);
> +
>  static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
>  						 const struct sk_buff *skb,
>  						 const bool check_mtu)
> @@ -3980,33 +3985,6 @@ static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
>  	return false;
>  }
>
> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev);
> -
> -static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
> -{
> -	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> -	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> -
> -	if (likely(p))
> -		return p;
> -
> -	return netdev_core_stats_alloc(dev);
> -}
> -
> -#define DEV_CORE_STATS_INC(FIELD)						\
> -static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev)	\
> -{										\
> -	struct net_device_core_stats __percpu *p;				\
> -										\
> -	p = dev_core_stats(dev);						\
> -	if (p)									\
> -		this_cpu_inc(p->FIELD);						\
> -}
> -DEV_CORE_STATS_INC(rx_dropped)
> -DEV_CORE_STATS_INC(tx_dropped)
> -DEV_CORE_STATS_INC(rx_nohandler)
> -DEV_CORE_STATS_INC(rx_otherhost_dropped)
> -
>  static __always_inline int ____dev_forward_skb(struct net_device *dev,
>  					       struct sk_buff *skb,
>  					       const bool check_mtu)
> diff --git a/net/core/dev.c b/net/core/dev.c
> index ccff2b6ef958..32ba730405b4 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -10475,7 +10475,7 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
>  }
>  EXPORT_SYMBOL(netdev_stats_to_stats64);
>
> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>  {
>  	struct net_device_core_stats __percpu *p;
>
> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
>  	/* This READ_ONCE() pairs with the cmpxchg() above */
>  	return READ_ONCE(dev->core_stats);
>  }
> -EXPORT_SYMBOL(netdev_core_stats_alloc);
> +
> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)

Please remove this inline attribute. Consider using __cold instead.

> +{
> +	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> +	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> +
> +	if (likely(p))
> +		return p;
> +
> +	return netdev_core_stats_alloc(dev);
> +}
> +
> +#define DEV_CORE_STATS_INC(FIELD)				\
> +void dev_core_stats_##FIELD##_inc(struct net_device *dev)	\
> +{								\
> +	struct net_device_core_stats __percpu *p;		\
> +								\
> +	p = dev_core_stats(dev);				\
> +	if (p)							\
> +		this_cpu_inc(p->FIELD);				\
> +}								\
> +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)
> +
> +DEV_CORE_STATS_INC(rx_dropped);
> +DEV_CORE_STATS_INC(tx_dropped);
> +DEV_CORE_STATS_INC(rx_nohandler);
> +DEV_CORE_STATS_INC(rx_otherhost_dropped);

#undef DEV_CORE_STATS_INC

>
>  /**
>   *	dev_get_stats	- get network device statistics
> --
> 2.25.1
>
From: Eric Dumazet <edumazet@google.com>
Date: Tue, 12 Sep 2023 06:23:24 +0200

> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote:
>>
>> Although there is a kfree_skb_reason() helper function that can be used
>> to find the reason for dropped packets, but most callers didn't increase
>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped.

[...]

>>  EXPORT_SYMBOL(netdev_stats_to_stats64);
>>
>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
>>  {
>>  	struct net_device_core_stats __percpu *p;
>>
>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device
>>  	/* This READ_ONCE() pairs with the cmpxchg() above */
>>  	return READ_ONCE(dev->core_stats);
>>  }
>> -EXPORT_SYMBOL(netdev_core_stats_alloc);
>> +
>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
>
> Please remove this inline attritbute. Consider using __cold instead.

__cold? O_o I thought the author's inlining it as it's a couple
locs/instructions, while the compilers would most likely keep it
non-inlined as it's referenced 4 times. __cold will for sure keep it
standalone and place it in .text.cold, i.e. far away from the call sites.
I realize dev_core_stats_*() aren't called frequently, but why make
only one small helper cold rather than all of them then?

>
>> +{
>> +	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
>> +	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
>> +
>> +	if (likely(p))
>> +		return p;
>> +
>> +	return netdev_core_stats_alloc(dev);
>> +}

[...]

Thanks,
Olek
On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin <aleksander.lobakin@intel.com> wrote: > > From: Eric Dumazet <edumazet@google.com> > Date: Tue, 12 Sep 2023 06:23:24 +0200 > > > On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote: > >> > >> Although there is a kfree_skb_reason() helper function that can be used > >> to find the reason for dropped packets, but most callers didn't increase > >> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. > > [...] > > >> EXPORT_SYMBOL(netdev_stats_to_stats64); > >> > >> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) > >> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) > >> { > >> struct net_device_core_stats __percpu *p; > >> > >> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device > >> /* This READ_ONCE() pairs with the cmpxchg() above */ > >> return READ_ONCE(dev->core_stats); > >> } > >> -EXPORT_SYMBOL(netdev_core_stats_alloc); > >> + > >> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) > > > > Please remove this inline attritbute. Consider using __cold instead. > > __cold? O_o I thought the author's inlining it as it's a couple > locs/intstructions, while the compilers would most likely keep it > non-inlined as it's referenced 4 times. __cold will for sure keep it > standalone and place it in .text.cold, i.e. far away from the call sites. > I realize dev_core_stats_*() aren't called frequently, but why making > only one small helper cold rather than all of them then? > This helper is used at least one time per netdevice lifetime. This is definitely cold. Forcing an inline makes no sense, this would duplicate the code four times, for absolutely no gain. 
> > > >> +{ > >> + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ > >> + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); > >> + > >> + if (likely(p)) > >> + return p; > >> + > >> + return netdev_core_stats_alloc(dev); > >> +} > > [...] > > Thanks, > Olek
From: Yajun Deng <yajun.deng@linux.dev> Date: Mon, 11 Sep 2023 16:20:16 +0800 > Although there is a kfree_skb_reason() helper function that can be used > to find the reason for dropped packets, but most callers didn't increase > one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. [...] > diff --git a/net/core/dev.c b/net/core/dev.c > index ccff2b6ef958..32ba730405b4 100644 > --- a/net/core/dev.c > +++ b/net/core/dev.c > @@ -10475,7 +10475,7 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64, > } > EXPORT_SYMBOL(netdev_stats_to_stats64); > > -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) > +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) > { > struct net_device_core_stats __percpu *p; > > @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device > /* This READ_ONCE() pairs with the cmpxchg() above */ > return READ_ONCE(dev->core_stats); > } > -EXPORT_SYMBOL(netdev_core_stats_alloc); > + > +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) > +{ > + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ > + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); > + > + if (likely(p)) > + return p; > + > + return netdev_core_stats_alloc(dev); > +} > + > +#define DEV_CORE_STATS_INC(FIELD) \ > +void dev_core_stats_##FIELD##_inc(struct net_device *dev) \ > +{ \ > + struct net_device_core_stats __percpu *p; \ > + \ > + p = dev_core_stats(dev); \ > + if (p) \ > + this_cpu_inc(p->FIELD); \ > +} \ > +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc) > + > +DEV_CORE_STATS_INC(rx_dropped); > +DEV_CORE_STATS_INC(tx_dropped); > +DEV_CORE_STATS_INC(rx_nohandler); > +DEV_CORE_STATS_INC(rx_otherhost_dropped); I realize you need to have an external function to be able to trace it, but why don't you make it just 1 function instead of 4+ (will only be 
increasing)?

Define 1 function

void dev_core_stats_inc(struct net_device *dev, u32 offset)
{
	struct net_device_core_stats __percpu *p;

	p = dev_core_stats(dev);
	if (p)
		this_cpu_inc(*(unsigned long *)(void *)p + offset);
}
EXPORT_SYMBOL_GPL(dev_core_stats_inc); // Why not GPL BTW?

And then build inlines:

#define DEV_CORE_STATS_INC(FIELD)				\
static inline void						\
dev_core_stats_##FIELD##_inc(struct net_device *dev)		\
{								\
	dev_core_stats_inc(dev,					\
		offsetof(struct net_device_core_stats, FIELD));	\
}

DEV_CORE_STATS_INC(rx_dropped);
...

OR even just make them macros

#define __DEV_CORE_STATS_INC(dev, field)			\
	dev_core_stats_inc(dev,					\
		offsetof(struct net_device_core_stats, field))

#define dev_core_stats_rx_dropped_inc(dev)			\
	__DEV_CORE_STATS_INC(dev, rx_dropped)
...

Just don't copy that awful Thunderbird's line wrap and don't assume this
code builds and works and that is something finished/polished.

You'll be able to trace functions and you'll be able to understand which
counter has been incremented by checking the second argument, i.e. the
field offset (IIRC tracing shows you arguments).
And that way you wouldn't geometrically increase the number of symbol
exports and deal with its consequences.

>
>  /**
>   *	dev_get_stats	- get network device statistics

Thanks,
Olek
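[Editorial sketch, not part of the thread] The single-function idea above can be tried out in plain userspace C. This is a minimal sketch under stated assumptions: `struct core_stats`, `core_stats_inc()` and the `CORE_STATS_INC` wrappers are hypothetical stand-ins for the kernel names, and an ordinary pointer replaces the `__percpu` pointer and `this_cpu_inc()`. Note the parenthesization: the offset has to be added to the base address before the cast and dereference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device_core_stats. */
struct core_stats {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
	unsigned long rx_nohandler;
	unsigned long rx_otherhost_dropped;
};

/*
 * Single out-of-line increment function: the counter is selected by its
 * byte offset inside struct core_stats, so one symbol serves all fields.
 */
void core_stats_inc(struct core_stats *p, size_t offset)
{
	unsigned long *field;

	/* Reject offsets that would point past the end of the struct. */
	assert(offset + sizeof(*field) <= sizeof(*p));

	/* Add the offset to the base address first, then cast and dereference. */
	field = (unsigned long *)((char *)p + offset);
	(*field)++;
}

/* Thin per-field wrappers, generated once per counter. */
#define CORE_STATS_INC(FIELD)						\
static inline void core_stats_##FIELD##_inc(struct core_stats *p)	\
{									\
	core_stats_inc(p, offsetof(struct core_stats, FIELD));		\
}

CORE_STATS_INC(rx_dropped)
CORE_STATS_INC(tx_dropped)
CORE_STATS_INC(rx_nohandler)
CORE_STATS_INC(rx_otherhost_dropped)
```

With this layout a tracer attached to the one out-of-line function sees which counter was bumped through the offset argument, which is the trade-off the message describes: one exported symbol instead of one per counter.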
From: Eric Dumazet <edumazet@google.com> Date: Tue, 12 Sep 2023 18:04:44 +0200 > On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin > <aleksander.lobakin@intel.com> wrote: >> >> From: Eric Dumazet <edumazet@google.com> >> Date: Tue, 12 Sep 2023 06:23:24 +0200 >> >>> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote: >>>> >>>> Although there is a kfree_skb_reason() helper function that can be used >>>> to find the reason for dropped packets, but most callers didn't increase >>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. >> >> [...] >> >>>> EXPORT_SYMBOL(netdev_stats_to_stats64); >>>> >>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) >>>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) >>>> { >>>> struct net_device_core_stats __percpu *p; >>>> >>>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device >>>> /* This READ_ONCE() pairs with the cmpxchg() above */ >>>> return READ_ONCE(dev->core_stats); >>>> } >>>> -EXPORT_SYMBOL(netdev_core_stats_alloc); >>>> + >>>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) >>> >>> Please remove this inline attritbute. Consider using __cold instead. >> >> __cold? O_o I thought the author's inlining it as it's a couple >> locs/intstructions, while the compilers would most likely keep it >> non-inlined as it's referenced 4 times. __cold will for sure keep it >> standalone and place it in .text.cold, i.e. far away from the call sites. >> I realize dev_core_stats_*() aren't called frequently, but why making >> only one small helper cold rather than all of them then? >> > > This helper is used at least one time per netdevice lifetime. > This is definitely cold. But then each dev_stats_*_inc() (not cold) has to call it from a completely different piece of .text far from their. 
I either don't understand the idea or dunno. Why not make them cold as well then? > Forcing an inline makes no sense, this would duplicate the code four times, > for absolutely no gain. I'd love to see bloat-o-meter numbers, I suspect we're talking about 20-30 bytes. > >>> >>>> +{ >>>> + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ >>>> + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); >>>> + >>>> + if (likely(p)) >>>> + return p; >>>> + >>>> + return netdev_core_stats_alloc(dev); >>>> +} >> >> [...] >> >> Thanks, >> Olek Thanks, Olek
On Tue, Sep 12, 2023 at 7:16 PM Alexander Lobakin <aleksander.lobakin@intel.com> wrote: > > From: Eric Dumazet <edumazet@google.com> > Date: Tue, 12 Sep 2023 18:04:44 +0200 > > > On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin > > <aleksander.lobakin@intel.com> wrote: > >> > >> From: Eric Dumazet <edumazet@google.com> > >> Date: Tue, 12 Sep 2023 06:23:24 +0200 > >> > >>> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote: > >>>> > >>>> Although there is a kfree_skb_reason() helper function that can be used > >>>> to find the reason for dropped packets, but most callers didn't increase > >>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. > >> > >> [...] > >> > >>>> EXPORT_SYMBOL(netdev_stats_to_stats64); > >>>> > >>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) > >>>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) > >>>> { > >>>> struct net_device_core_stats __percpu *p; > >>>> > >>>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device > >>>> /* This READ_ONCE() pairs with the cmpxchg() above */ > >>>> return READ_ONCE(dev->core_stats); > >>>> } > >>>> -EXPORT_SYMBOL(netdev_core_stats_alloc); > >>>> + > >>>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) > >>> > >>> Please remove this inline attritbute. Consider using __cold instead. > >> > >> __cold? O_o I thought the author's inlining it as it's a couple > >> locs/intstructions, while the compilers would most likely keep it > >> non-inlined as it's referenced 4 times. __cold will for sure keep it > >> standalone and place it in .text.cold, i.e. far away from the call sites. > >> I realize dev_core_stats_*() aren't called frequently, but why making > >> only one small helper cold rather than all of them then? 
> >> > > > > This helper is used at least one time per netdevice lifetime. > > This is definitely cold. > > But then each dev_stats_*_inc() (not cold) has to call it from a > completely different piece of .text far from their. I either don't > understand the idea or dunno. Why not make them cold as well then? > The __cold attribute is only applied to the helper _allocating_ the memory, once. Not on the functions actually incrementing the stats. There are situations where they can be called thousands/millions of times per second (incast flood). If this situation happens, the _allocation_ still happens once. > > Forcing an inline makes no sense, this would duplicate the code four times, > > for absolutely no gain. > > I'd love to see bloat-o-meter numbers, I suspect we're talking about > 20-30 bytes. > > > > >>> > >>>> +{ > >>>> + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ > >>>> + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); > >>>> + > >>>> + if (likely(p)) > >>>> + return p; > >>>> + > >>>> + return netdev_core_stats_alloc(dev); > >>>> +} > >> > >> [...] > >> > >> Thanks, > >> Olek > > Thanks, > Olek
From: Eric Dumazet <edumazet@google.com> Date: Tue, 12 Sep 2023 19:28:50 +0200 > On Tue, Sep 12, 2023 at 7:16 PM Alexander Lobakin > <aleksander.lobakin@intel.com> wrote: >> >> From: Eric Dumazet <edumazet@google.com> >> Date: Tue, 12 Sep 2023 18:04:44 +0200 >> >>> On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin >>> <aleksander.lobakin@intel.com> wrote: >>>> >>>> From: Eric Dumazet <edumazet@google.com> >>>> Date: Tue, 12 Sep 2023 06:23:24 +0200 >>>> >>>>> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote: >>>>>> >>>>>> Although there is a kfree_skb_reason() helper function that can be used >>>>>> to find the reason for dropped packets, but most callers didn't increase >>>>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. >>>> >>>> [...] >>>> >>>>>> EXPORT_SYMBOL(netdev_stats_to_stats64); >>>>>> >>>>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) >>>>>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) >>>>>> { >>>>>> struct net_device_core_stats __percpu *p; >>>>>> >>>>>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device >>>>>> /* This READ_ONCE() pairs with the cmpxchg() above */ >>>>>> return READ_ONCE(dev->core_stats); >>>>>> } >>>>>> -EXPORT_SYMBOL(netdev_core_stats_alloc); >>>>>> + >>>>>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) >>>>> >>>>> Please remove this inline attritbute. Consider using __cold instead. >>>> >>>> __cold? O_o I thought the author's inlining it as it's a couple >>>> locs/intstructions, while the compilers would most likely keep it >>>> non-inlined as it's referenced 4 times. __cold will for sure keep it >>>> standalone and place it in .text.cold, i.e. far away from the call sites. 
>>>> I realize dev_core_stats_*() aren't called frequently, but why making >>>> only one small helper cold rather than all of them then? >>>> >>> >>> This helper is used at least one time per netdevice lifetime. >>> This is definitely cold. >> >> But then each dev_stats_*_inc() (not cold) has to call it from a >> completely different piece of .text far from their. I either don't >> understand the idea or dunno. Why not make them cold as well then? >> > > The __cold attribute is only applied to the helper _allocating_ the > memory, once. Then it should be applied to netdev_core_stats_alloc(), not dev_core_stats(). The latter only dereferences the already existing pointer or calls the former, which actually does the allocation. That's why I don't get why make one if/else non-inline or even cold. > > Not on the functions actually incrementing the stats. > > There are situations where they can be called thousands/millions of > times per second (incast flood). > If this situation happens, the _allocation_ still happens once. Correct, but dev_core_stats() will be called the same millions of times per second, see above. It's called unconditionally each increment. So seems like I got the idea of .cold correctly, but you were referring to the wrong function. > > > >>> Forcing an inline makes no sense, this would duplicate the code four times, >>> for absolutely no gain. >> >> I'd love to see bloat-o-meter numbers, I suspect we're talking about >> 20-30 bytes. >> >>> >>>>> >>>>>> +{ >>>>>> + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ >>>>>> + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); >>>>>> + >>>>>> + if (likely(p)) >>>>>> + return p; >>>>>> + >>>>>> + return netdev_core_stats_alloc(dev); >>>>>> +} >>>> >>>> [...] >>>> >>>> Thanks, >>>> Olek >> >> Thanks, >> Olek Thanks, Olek
On Tue, Sep 12, 2023 at 7:44 PM Alexander Lobakin <aleksander.lobakin@intel.com> wrote: > > From: Eric Dumazet <edumazet@google.com> > Date: Tue, 12 Sep 2023 19:28:50 +0200 > > > On Tue, Sep 12, 2023 at 7:16 PM Alexander Lobakin > > <aleksander.lobakin@intel.com> wrote: > >> > >> From: Eric Dumazet <edumazet@google.com> > >> Date: Tue, 12 Sep 2023 18:04:44 +0200 > >> > >>> On Tue, Sep 12, 2023 at 5:58 PM Alexander Lobakin > >>> <aleksander.lobakin@intel.com> wrote: > >>>> > >>>> From: Eric Dumazet <edumazet@google.com> > >>>> Date: Tue, 12 Sep 2023 06:23:24 +0200 > >>>> > >>>>> On Mon, Sep 11, 2023 at 10:20 AM Yajun Deng <yajun.deng@linux.dev> wrote: > >>>>>> > >>>>>> Although there is a kfree_skb_reason() helper function that can be used > >>>>>> to find the reason for dropped packets, but most callers didn't increase > >>>>>> one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. > >>>> > >>>> [...] > >>>> > >>>>>> EXPORT_SYMBOL(netdev_stats_to_stats64); > >>>>>> > >>>>>> -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) > >>>>>> +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) > >>>>>> { > >>>>>> struct net_device_core_stats __percpu *p; > >>>>>> > >>>>>> @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device > >>>>>> /* This READ_ONCE() pairs with the cmpxchg() above */ > >>>>>> return READ_ONCE(dev->core_stats); > >>>>>> } > >>>>>> -EXPORT_SYMBOL(netdev_core_stats_alloc); > >>>>>> + > >>>>>> +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) > >>>>> > >>>>> Please remove this inline attritbute. Consider using __cold instead. > >>>> > >>>> __cold? O_o I thought the author's inlining it as it's a couple > >>>> locs/intstructions, while the compilers would most likely keep it > >>>> non-inlined as it's referenced 4 times. 
__cold will for sure keep it > >>>> standalone and place it in .text.cold, i.e. far away from the call sites. > >>>> I realize dev_core_stats_*() aren't called frequently, but why making > >>>> only one small helper cold rather than all of them then? > >>>> > >>> > >>> This helper is used at least one time per netdevice lifetime. > >>> This is definitely cold. > >> > >> But then each dev_stats_*_inc() (not cold) has to call it from a > >> completely different piece of .text far from their. I either don't > >> understand the idea or dunno. Why not make them cold as well then? > >> > > > > The __cold attribute is only applied to the helper _allocating_ the > > memory, once. > > Then it should be applied to netdev_core_stats_alloc(), not > dev_core_stats(). The latter only dereferences the already existing > pointer or calls the former, which actually does the allocation. > That's why I don't get why make one if/else non-inline or even cold. Sure, this was what was suggested (perhaps not _very_ precisely, but the general idea was pretty clear). v2 seems ok, right ? It seems we are all on the same page. +static __cold struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) +{ + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); + + if (likely(p)) + return p; + + return netdev_core_stats_alloc(dev); +} + +#define DEV_CORE_STATS_INC(FIELD) \ +void dev_core_stats_##FIELD##_inc(struct net_device *dev) \ +{ \ + struct net_device_core_stats __percpu *p; \ + \ + p = dev_core_stats(dev); \ + if (p) \ + this_cpu_inc(p->FIELD); \ +} \ +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)
On Tue, Sep 12, 2023 at 8:03 PM Eric Dumazet <edumazet@google.com> wrote:

> Sure, this was what was suggested (perhaps not _very_ precisely, but
> the general idea was pretty clear).
> v2 seems ok, right ?
>
> It seems we are all on the same page.
>
> +static __cold struct net_device_core_stats __percpu
> *dev_core_stats(struct net_device *dev)
> +{
> +	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
> +	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
> +
> +	if (likely(p))
> +		return p;
> +
> +	return netdev_core_stats_alloc(dev);
> +}
> +
> +#define DEV_CORE_STATS_INC(FIELD)				\
> +void dev_core_stats_##FIELD##_inc(struct net_device *dev)	\
> +{								\
> +	struct net_device_core_stats __percpu *p;		\
> +								\
> +	p = dev_core_stats(dev);				\
> +	if (p)							\
> +		this_cpu_inc(p->FIELD);				\
> +}								\
> +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)

Oh well, I just read the patch, and it seems wrong indeed.

netdev_core_stats_alloc() is the one that can be cold.
On 2023/9/13 02:05, Eric Dumazet wrote:
> On Tue, Sep 12, 2023 at 8:03 PM Eric Dumazet <edumazet@google.com> wrote:
>
>> Sure, this was what was suggested (perhaps not _very_ precisely, but
>> the general idea was pretty clear).
>> v2 seems ok, right ?
>>
>> It seems we are all on the same page.
>>
>> +static __cold struct net_device_core_stats __percpu
>> *dev_core_stats(struct net_device *dev)
>> +{
>> +	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
>> +	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
>> +
>> +	if (likely(p))
>> +		return p;
>> +
>> +	return netdev_core_stats_alloc(dev);
>> +}
>> +
>> +#define DEV_CORE_STATS_INC(FIELD)				\
>> +void dev_core_stats_##FIELD##_inc(struct net_device *dev)	\
>> +{								\
>> +	struct net_device_core_stats __percpu *p;		\
>> +								\
>> +	p = dev_core_stats(dev);				\
>> +	if (p)							\
>> +		this_cpu_inc(p->FIELD);				\
>> +}								\
>> +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc)
> Oh well, I just read the patch, and it seems wrong indeed.
>
> netdev_core_stats_alloc() is the one that can be cold.

Okay, I will add __cold to netdev_core_stats_alloc() in v3.

Olek suggested defining a new dev_core_stats_inc() function.

I hope to see the suggestion in another reply.
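[Editorial sketch, not part of the thread] The hot/cold split the reviewers converge on can be illustrated in userspace C. This is only a sketch under stated assumptions: `struct device`, `stats_alloc()`, `stats_get()` and `stats_rx_dropped_inc()` are hypothetical names, C11 atomics stand in for READ_ONCE()/cmpxchg(), a plain heap object replaces the per-CPU allocation, and `__attribute__((cold, noinline))` and `__builtin_expect()` assume GCC or Clang. The point is that only the allocator, which runs about once per device, is marked cold; the pointer check stays on the fast path because it runs on every increment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct net_device_core_stats. */
struct core_stats {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
};

/* Hypothetical stand-in for struct net_device. */
struct device {
	_Atomic(struct core_stats *) stats;	/* NULL until first use */
};

/* Slow path: runs roughly once per device, so it is the one marked cold. */
__attribute__((cold, noinline))
static struct core_stats *stats_alloc(struct device *dev)
{
	struct core_stats *p = calloc(1, sizeof(*p));
	struct core_stats *expected = NULL;

	/* Install our allocation unless another thread already installed one. */
	if (p && !atomic_compare_exchange_strong(&dev->stats, &expected, p))
		free(p);

	/* Return whichever pointer won (NULL if every allocation failed). */
	return atomic_load(&dev->stats);
}

/* Fast path: executed on every increment, so it is neither cold nor out of line. */
static inline struct core_stats *stats_get(struct device *dev)
{
	struct core_stats *p;

	assert(dev);
	p = atomic_load_explicit(&dev->stats, memory_order_acquire);
	if (__builtin_expect(p != NULL, 1))
		return p;

	return stats_alloc(dev);
}

void stats_rx_dropped_inc(struct device *dev)
{
	struct core_stats *p = stats_get(dev);

	/* Simplification: the kernel increments a per-CPU counter here. */
	if (p)
		p->rx_dropped++;
}
```

Marking `stats_get()` cold instead would move code that runs on every dropped packet away from its callers, which is the objection raised earlier in the thread.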
From: Yajun Deng <yajun.deng@linux.dev> Date: Wed, 13 Sep 2023 10:08:08 +0800 > > On 2023/9/13 00:22, Alexander Lobakin wrote: >> From: Yajun Deng <yajun.deng@linux.dev> >> Date: Mon, 11 Sep 2023 16:20:16 +0800 [...] >> EXPORT_SYMBOL_GPL(dev_core_stats_inc); // Why not GPL BTW? > > This may be a better option. > > Just because EXPORT_SYMBOL(netdev_core_stats_alloc) before, but I think > > EXPORT_SYMBOL_GPL is better. Ah I see. BTW, if you will still define increment functions as externals, there will be no reason to export netdev_core_stats_alloc() or even make it non-static at all. > > >> And then build inlines: >> >> #define DEV_CORE_STATS_INC(FIELD) \ >> static inline void \ >> dev_core_stats_##FIELD##_inc(struct net_device *dev) \ >> { \ >> dev_core_stats_inc(dev, \ >> offsetof(struct net_device_core_stats, FIELD)); \ >> } >> >> DEV_CORE_STATS_INC(rx_dropped); >> ... >> >> OR even just make them macros >> >> #define __DEV_CORE_STATS_INC(dev, field) \ >> dev_core_stats_inc(dev, \ >> offsetof(struct net_device_core_stats, field)) >> >> #define dev_core_stats_rx_dropped_inc(dev) \ >> __DEV_CORE_STATS_INC(dev, rx_dropped) >> ... > > I would like the former. Keep it the same as before. By "the former" you mean to build static inlines or externals? Seems like the first one, but I got confused by your "the same as before" :D > > >> Just don't copy that awful Thunderbird's line wrap and don't assume this >> code builds and works and that is something finished/polished. >> >> You'll be able to trace functions and you'll be able to understand which >> counter has been incremented by checking the second argument, i.e. the >> field offset (IIRC tracing shows you arguments). >> And that way you wouldn't geometrically increase the number of symbol >> exports and deal with its consequences. > I agree that. Ok, after this one I guess you meant "I'd like to use your approach with static inlines". 
>>> >>> /** >>> * dev_get_stats - get network device statistics >> Thanks, >> Olek Thanks, Olek
On 2023/9/13 17:58, Alexander Lobakin wrote: > From: Yajun Deng <yajun.deng@linux.dev> > Date: Wed, 13 Sep 2023 10:08:08 +0800 > >> On 2023/9/13 00:22, Alexander Lobakin wrote: >>> From: Yajun Deng <yajun.deng@linux.dev> >>> Date: Mon, 11 Sep 2023 16:20:16 +0800 > [...] > >>> EXPORT_SYMBOL_GPL(dev_core_stats_inc); // Why not GPL BTW? >> This may be a better option. >> >> Just because EXPORT_SYMBOL(netdev_core_stats_alloc) before, but I think >> >> EXPORT_SYMBOL_GPL is better. > Ah I see. BTW, if you will still define increment functions as > externals, there will be no reason to export netdev_core_stats_alloc() > or even make it non-static at all. > >> >>> And then build inlines: >>> >>> #define DEV_CORE_STATS_INC(FIELD) \ >>> static inline void \ >>> dev_core_stats_##FIELD##_inc(struct net_device *dev) \ >>> { \ >>> dev_core_stats_inc(dev, \ >>> offsetof(struct net_device_core_stats, FIELD)); \ >>> } >>> >>> DEV_CORE_STATS_INC(rx_dropped); >>> ... >>> >>> OR even just make them macros >>> >>> #define __DEV_CORE_STATS_INC(dev, field) \ >>> dev_core_stats_inc(dev, \ >>> offsetof(struct net_device_core_stats, field)) >>> >>> #define dev_core_stats_rx_dropped_inc(dev) \ >>> __DEV_CORE_STATS_INC(dev, rx_dropped) >>> ... >> I would like the former. Keep it the same as before. > By "the former" you mean to build static inlines or externals? Seems > like the first one, but I got confused by your "the same as before" :D > >> >>> Just don't copy that awful Thunderbird's line wrap and don't assume this >>> code builds and works and that is something finished/polished. >>> >>> You'll be able to trace functions and you'll be able to understand which >>> counter has been incremented by checking the second argument, i.e. the >>> field offset (IIRC tracing shows you arguments). >>> And that way you wouldn't geometrically increase the number of symbol >>> exports and deal with its consequences. >> I agree that. 
> Ok, after this one I guess you meant "I'd like to use your approach with > static inlines". In the end, I'm giving up on this approach. The new function dev_core_stats_inc() isn't called by external modules directly, so EXPORT_SYMBOL_GPL(dev_core_stats_inc) could be removed by anyone. >>>> >>>> /** >>>> * dev_get_stats - get network device statistics >>> Thanks, >>> Olek > Thanks, > Olek
From: Yajun Deng <yajun.deng@linux.dev> Date: Thu, 14 Sep 2023 10:44:14 +0800 > > On 2023/9/13 17:58, Alexander Lobakin wrote: >> From: Yajun Deng <yajun.deng@linux.dev> >> Date: Wed, 13 Sep 2023 10:08:08 +0800 >> >>> On 2023/9/13 00:22, Alexander Lobakin wrote: >>>> From: Yajun Deng <yajun.deng@linux.dev> >>>> Date: Mon, 11 Sep 2023 16:20:16 +0800 >> [...] >> >>>> EXPORT_SYMBOL_GPL(dev_core_stats_inc); // Why not GPL BTW? >>> This may be a better option. >>> >>> Just because EXPORT_SYMBOL(netdev_core_stats_alloc) before, but I think >>> >>> EXPORT_SYMBOL_GPL is better. >> Ah I see. BTW, if you will still define increment functions as >> externals, there will be no reason to export netdev_core_stats_alloc() >> or even make it non-static at all. >> >>> >>>> And then build inlines: >>>> >>>> #define DEV_CORE_STATS_INC(FIELD) \ >>>> static inline void \ >>>> dev_core_stats_##FIELD##_inc(struct net_device *dev) \ >>>> { \ >>>> dev_core_stats_inc(dev, \ >>>> offsetof(struct net_device_core_stats, FIELD)); \ >>>> } >>>> >>>> DEV_CORE_STATS_INC(rx_dropped); >>>> ... >>>> >>>> OR even just make them macros >>>> >>>> #define __DEV_CORE_STATS_INC(dev, field) \ >>>> dev_core_stats_inc(dev, \ >>>> offsetof(struct net_device_core_stats, field)) >>>> >>>> #define dev_core_stats_rx_dropped_inc(dev) \ >>>> __DEV_CORE_STATS_INC(dev, rx_dropped) >>>> ... >>> I would like the former. Keep it the same as before. >> By "the former" you mean to build static inlines or externals? Seems >> like the first one, but I got confused by your "the same as before" :D >> >>> >>>> Just don't copy that awful Thunderbird's line wrap and don't assume >>>> this >>>> code builds and works and that is something finished/polished. >>>> >>>> You'll be able to trace functions and you'll be able to understand >>>> which >>>> counter has been incremented by checking the second argument, i.e. the >>>> field offset (IIRC tracing shows you arguments). 
>>>> And that way you wouldn't geometrically increase the number of symbol >>>> exports and deal with its consequences. >>> I agree that. >> Ok, after this one I guess you meant "I'd like to use your approach with >> static inlines". > > Finally, I give up this approach. > > The new function dev_core_stats_inc() didn't called by external modules > directly. If it's called via an inline or macro or whatever, it still needs to be exported. Double-check that modpost doesn't complain on allmodconfig build. > > So EXPORT_SYMBOL_GPL(dev_core_stats_inc) can be removed by anyone. That doesn't mean it won't be needed tomorrow. And I don't feel like it's a good excuse to define 1 external function per counter instead of 1 external + static inlines for the rest. It's not only about the exports. Esp. given that I wrote almost the whole code needed for it to work in one of my previous replies. If you don't want to do that, I could take it over xD > > >>>>> /** >>>>> * dev_get_stats - get network device statistics >>>> Thanks, >>>> Olek >> Thanks, >> Olek Thanks, Olek
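For readers following the thread, the "1 external + static inlines" layout discussed above can be sketched in plain user-space C. This is only an illustration under stated assumptions, not the kernel code: struct core_stats, struct fake_dev and core_stats_inc() are hypothetical stand-ins for struct net_device_core_stats, struct net_device and dev_core_stats_inc(), and ordinary unsigned long counters replace the per-CPU counters and the lazy allocation done by netdev_core_stats_alloc().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device_core_stats.
 * Assumption: plain counters, no per-CPU storage. */
struct core_stats {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
	unsigned long rx_nohandler;
	unsigned long rx_otherhost_dropped;
};

/* Hypothetical stand-in for struct net_device. */
struct fake_dev {
	struct core_stats stats;
};

/* The single out-of-line function. In the layout proposed in the thread,
 * this would be the only symbol that needs exporting, and a tracer could
 * tell the counters apart by the offset argument. */
void core_stats_inc(struct fake_dev *dev, size_t offset)
{
	unsigned long *field;

	/* Reject offsets that do not fit inside the stats structure. */
	assert(offset + sizeof(unsigned long) <= sizeof(struct core_stats));

	field = (unsigned long *)((char *)&dev->stats + offset);
	(*field)++;
}

/* One thin static inline wrapper per counter, generated by a macro. */
#define CORE_STATS_INC(FIELD)						\
static inline void core_stats_##FIELD##_inc(struct fake_dev *dev)	\
{									\
	core_stats_inc(dev, offsetof(struct core_stats, FIELD));	\
}

CORE_STATS_INC(rx_dropped)
CORE_STATS_INC(tx_dropped)
CORE_STATS_INC(rx_nohandler)
CORE_STATS_INC(rx_otherhost_dropped)
```

A caller would use core_stats_rx_dropped_inc(&dev) exactly as before, while only core_stats_inc() exists as a real, traceable function. In the kernel the wrappers live in a header used by modules, which is why the out-of-line function still has to be exported, as noted in the reply above.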
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 0896aaa91dd7..879b01c85ba4 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3954,6 +3954,11 @@ int dev_forward_skb_nomtu(struct net_device *dev, struct sk_buff *skb); bool is_skb_forwardable(const struct net_device *dev, const struct sk_buff *skb); +void dev_core_stats_rx_dropped_inc(struct net_device *dev); +void dev_core_stats_tx_dropped_inc(struct net_device *dev); +void dev_core_stats_rx_nohandler_inc(struct net_device *dev); +void dev_core_stats_rx_otherhost_dropped_inc(struct net_device *dev); + static __always_inline bool __is_skb_forwardable(const struct net_device *dev, const struct sk_buff *skb, const bool check_mtu) @@ -3980,33 +3985,6 @@ static __always_inline bool __is_skb_forwardable(const struct net_device *dev, return false; } -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev); - -static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) -{ - /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ - struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); - - if (likely(p)) - return p; - - return netdev_core_stats_alloc(dev); -} - -#define DEV_CORE_STATS_INC(FIELD) \ -static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev) \ -{ \ - struct net_device_core_stats __percpu *p; \ - \ - p = dev_core_stats(dev); \ - if (p) \ - this_cpu_inc(p->FIELD); \ -} -DEV_CORE_STATS_INC(rx_dropped) -DEV_CORE_STATS_INC(tx_dropped) -DEV_CORE_STATS_INC(rx_nohandler) -DEV_CORE_STATS_INC(rx_otherhost_dropped) - static __always_inline int ____dev_forward_skb(struct net_device *dev, struct sk_buff *skb, const bool check_mtu) diff --git a/net/core/dev.c b/net/core/dev.c index ccff2b6ef958..32ba730405b4 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -10475,7 +10475,7 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64, } 
EXPORT_SYMBOL(netdev_stats_to_stats64); -struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) +static struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev) { struct net_device_core_stats __percpu *p; @@ -10488,7 +10488,33 @@ struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device /* This READ_ONCE() pairs with the cmpxchg() above */ return READ_ONCE(dev->core_stats); } -EXPORT_SYMBOL(netdev_core_stats_alloc); + +static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev) +{ + /* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */ + struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats); + + if (likely(p)) + return p; + + return netdev_core_stats_alloc(dev); +} + +#define DEV_CORE_STATS_INC(FIELD) \ +void dev_core_stats_##FIELD##_inc(struct net_device *dev) \ +{ \ + struct net_device_core_stats __percpu *p; \ + \ + p = dev_core_stats(dev); \ + if (p) \ + this_cpu_inc(p->FIELD); \ +} \ +EXPORT_SYMBOL(dev_core_stats_##FIELD##_inc) + +DEV_CORE_STATS_INC(rx_dropped); +DEV_CORE_STATS_INC(tx_dropped); +DEV_CORE_STATS_INC(rx_nohandler); +DEV_CORE_STATS_INC(rx_otherhost_dropped); /** * dev_get_stats - get network device statistics
Although there is a kfree_skb_reason() helper function that can be used to find the reason for dropped packets, most callers don't increase one of rx_dropped, tx_dropped, rx_nohandler and rx_otherhost_dropped. Users are more concerned about why the dropped counter shown by ifconfig is increasing. So export the dev_core_stats_rx_dropped_inc set of functions, which users can trace to find out why rx_dropped is increasing. Export dev_core_stats_{rx_dropped, tx_dropped, rx_nohandler, rx_otherhost_dropped}_inc for tracing. Also, move dev_core_stats() and netdev_core_stats_alloc() into dev.c, because they are not called externally. Signed-off-by: Yajun Deng <yajun.deng@linux.dev> --- include/linux/netdevice.h | 32 +++++--------------------------- net/core/dev.c | 30 ++++++++++++++++++++++++++++-- 2 files changed, 33 insertions(+), 29 deletions(-)