Message ID | 20240509163216.108665-2-danielj@nvidia.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Add TX stop/wake counters | expand |
On Thu, May 09, 2024 at 11:32:15AM -0500, Daniel Jurgens wrote: > TX queue stop and wake are counted by some drivers. > Support reporting these via netdev-genl queue stats. > > Signed-off-by: Daniel Jurgens <danielj@nvidia.com> > Reviewed-by: Jiri Pirko <jiri@nvidia.com> > --- > Documentation/netlink/specs/netdev.yaml | 10 ++++++++++ > include/net/netdev_queues.h | 3 +++ > include/uapi/linux/netdev.h | 2 ++ > net/core/netdev-genl.c | 4 +++- > tools/include/uapi/linux/netdev.h | 3 ++- > 5 files changed, 20 insertions(+), 2 deletions(-) > > diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml > index 2be4b3714d17..c8b976d03330 100644 > --- a/Documentation/netlink/specs/netdev.yaml > +++ b/Documentation/netlink/specs/netdev.yaml > @@ -439,6 +439,16 @@ attribute-sets: > Number of the packets dropped by the device due to the transmit > packets bitrate exceeding the device rate limit. > type: uint > + - > + name: tx-stop > + doc: | > + Number of times the tx queue was stopped. > + type: uint > + - > + name: tx-wake > + doc: | > + Number of times the tx queue was restarted. > + type: uint I'm curious where these names came from. The opposite of stop would be start. The opposite of wake would be sleep. Are these meant to be opposites of each other? If they are opposites, why would they differ by more than 1? And if they can only differ by 1, why do we need both? Andrew
> From: Andrew Lunn <andrew@lunn.ch> > Sent: Thursday, May 9, 2024 3:47 PM > To: Dan Jurgens <danielj@nvidia.com> > Subject: Re: [PATCH net-next 1/2] netdev: Add queue stats for TX stop and > wake > > On Thu, May 09, 2024 at 11:32:15AM -0500, Daniel Jurgens wrote: > > TX queue stop and wake are counted by some drivers. > > Support reporting these via netdev-genl queue stats. > > > > + name: tx-wake > > + doc: | > > + Number of times the tx queue was restarted. > > + type: uint > > I'm curious where these names came from. The opposite of stop would be > start. The opposite of wake would be sleep. Are these meant to be > opposites of each other? If they are opposites, why would they differ by > more than 1? And if they can only differ by 1, why do we need both? The names come from the API. netif_tx_stop_queue, netif_tx_wake_queue. It's true that they can only ever differ by 1, but when they do that's interesting. Though eventually a TX timeout will occur if it's due to something like a lost interrupt. The most useful thing is knowing if queues are being stopped frequently, so if there's objection to the wake side it can be dropped. > > Andrew
On Thu, 9 May 2024 11:32:15 -0500 Daniel Jurgens wrote: > diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h > index cf24f1d9adf8..ccf6976b1693 100644 > --- a/tools/include/uapi/linux/netdev.h > +++ b/tools/include/uapi/linux/netdev.h > @@ -164,7 +164,8 @@ enum { > NETDEV_A_QSTATS_TX_HW_GSO_BYTES, > NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, > NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, > - NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, Looks like an accidental removal? > + NETDEV_A_QSTATS_TX_STOP, > + NETDEV_A_QSTATS_TX_WAKE, Since you'll have to respin let me nit pick on the docs, as I'm hoping that those will be comprehensible to users not only devs. > + name: tx-stop > + doc: | > + Number of times the tx queue was stopped. How about: Number of times driver paused accepting new tx packets from the stack to this queue, because the queue was full. Note that if BQL is supported and enabled on the device the networking stack will avoid queuing a lot of data at once. > + name: tx-wake > + doc: | > + Number of times the tx queue was restarted. Number of times driver re-started accepting send requests to this queue from the stack.
> From: Jakub Kicinski <kuba@kernel.org> > Sent: Thursday, May 9, 2024 8:31 PM > To: Dan Jurgens <danielj@nvidia.com> > Cc: netdev@vger.kernel.org; mst@redhat.com; jasowang@redhat.com; > xuanzhuo@linux.alibaba.com; virtualization@lists.linux.dev; > davem@davemloft.net; edumazet@google.com; pabeni@redhat.com; Jiri > Pirko <jiri@nvidia.com> > Subject: Re: [PATCH net-next 1/2] netdev: Add queue stats for TX stop and > wake > > On Thu, 9 May 2024 11:32:15 -0500 Daniel Jurgens wrote: > > diff --git a/tools/include/uapi/linux/netdev.h > > b/tools/include/uapi/linux/netdev.h > > index cf24f1d9adf8..ccf6976b1693 100644 > > --- a/tools/include/uapi/linux/netdev.h > > +++ b/tools/include/uapi/linux/netdev.h > > @@ -164,7 +164,8 @@ enum { > > NETDEV_A_QSTATS_TX_HW_GSO_BYTES, > > NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, > > NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, > > - NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, > > Looks like an accidental removal? Yes, thanks. > > > + NETDEV_A_QSTATS_TX_STOP, > > + NETDEV_A_QSTATS_TX_WAKE, > > Since you'll have to respin let me nit pick on the docs, as I'm hoping that > those will be comprehensible to users not only devs. > > > + name: tx-stop > > + doc: | > > + Number of times the tx queue was stopped. > > How about: > > Number of times driver paused accepting new tx packets > from the stack to this queue, because the queue was full. > Note that if BQL is supported and enabled on the device > the networking stack will avoid queuing a lot of data at once. > > > + name: tx-wake > > + doc: | > > + Number of times the tx queue was restarted. > > Number of times driver re-started accepting send > requests to this queue from the stack. Will update it. Thanks for the text!
On Thu, May 09, 2024 at 09:19:52PM +0000, Dan Jurgens wrote: > > From: Andrew Lunn <andrew@lunn.ch> > > Sent: Thursday, May 9, 2024 3:47 PM > > To: Dan Jurgens <danielj@nvidia.com> > > Subject: Re: [PATCH net-next 1/2] netdev: Add queue stats for TX stop and > > wake > > > > On Thu, May 09, 2024 at 11:32:15AM -0500, Daniel Jurgens wrote: > > > TX queue stop and wake are counted by some drivers. > > > Support reporting these via netdev-genl queue stats. > > > > > > + name: tx-wake > > > + doc: | > > > + Number of times the tx queue was restarted. > > > + type: uint > > > > I'm curious where these names came from. The opposite of stop would be > > start. The opposite of wake would be sleep. Are these meant to be > > opposites of each other? If they are opposites, why would they differ by > > more than 1? And if they can only differ by 1, why do we need both? > > The names come from the API. netif_tx_stop_queue, netif_tx_wake_queue. O.K. So in that context, these names make sense. Maybe extend the doc: to mention these function names? You say there are a few drivers with these counters? Does it make sense to actually push the increment into netif_tx_stop_queue(), netif_tx_wake_queue() so that they become available for all drivers? I've no idea what that implies... Andrew
> From: Andrew Lunn <andrew@lunn.ch> > Sent: Friday, May 10, 2024 7:59 AM > To: Dan Jurgens <danielj@nvidia.com> > Cc: netdev@vger.kernel.org; mst@redhat.com; jasowang@redhat.com; > xuanzhuo@linux.alibaba.com; virtualization@lists.linux.dev; > davem@davemloft.net; edumazet@google.com; kuba@kernel.org; > pabeni@redhat.com; Jiri Pirko <jiri@nvidia.com> > Subject: Re: [PATCH net-next 1/2] netdev: Add queue stats for TX stop and > wake > > On Thu, May 09, 2024 at 09:19:52PM +0000, Dan Jurgens wrote: > > > From: Andrew Lunn <andrew@lunn.ch> > > > Sent: Thursday, May 9, 2024 3:47 PM > > > To: Dan Jurgens <danielj@nvidia.com> > > > Subject: Re: [PATCH net-next 1/2] netdev: Add queue stats for TX > > > stop and wake > > > > > > On Thu, May 09, 2024 at 11:32:15AM -0500, Daniel Jurgens wrote: > > > > TX queue stop and wake are counted by some drivers. > > > > Support reporting these via netdev-genl queue stats. > > > > > > > > + name: tx-wake > > > > + doc: | > > > > + Number of times the tx queue was restarted. > > > > + type: uint > > > > > > I'm curious where these names came from. The opposite of stop would > > > be start. The opposite of wake would be sleep. Are these meant to be > > > opposites of each other? If they are opposites, why would they > > > differ by more than 1? And if they can only differ by 1, why do we need > both? > > > > The names come from the API. netif_tx_stop_queue, > netif_tx_wake_queue. > > O.K. So in that context, these names make sense. Maybe extend the doc: > to mention these function names? > > You say there are a few drivers with these counters? Does it make sense to > actually push the increment into netif_tx_stop_queue(), > netif_tx_wake_queue() so that they become available for all drivers? > I've no idea what that implies... It wouldn't be trivial. The stats are queried from the driver. > > Andrew
> It wouldn't be trivial. The stats are queried from the driver.
So are page pool stats, with the increments happening in the page pool
code, not the driver.
Andrew
diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index 2be4b3714d17..c8b976d03330 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -439,6 +439,16 @@ attribute-sets: Number of the packets dropped by the device due to the transmit packets bitrate exceeding the device rate limit. type: uint + - + name: tx-stop + doc: | + Number of times the tx queue was stopped. + type: uint + - + name: tx-wake + doc: | + Number of times the tx queue was restarted. + type: uint operations: list: diff --git a/include/net/netdev_queues.h b/include/net/netdev_queues.h index e7b84f018cee..a8a7e48dfa6c 100644 --- a/include/net/netdev_queues.h +++ b/include/net/netdev_queues.h @@ -41,6 +41,9 @@ struct netdev_queue_stats_tx { u64 hw_gso_wire_bytes; u64 hw_drop_ratelimits; + + u64 stop; + u64 wake; }; /** diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index cf24f1d9adf8..a8188202413e 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -165,6 +165,8 @@ enum { NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, + NETDEV_A_QSTATS_TX_STOP, + NETDEV_A_QSTATS_TX_WAKE, __NETDEV_A_QSTATS_MAX, NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1) diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 4b5054087309..1f6ae6379e0f 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -517,7 +517,9 @@ netdev_nl_stats_write_tx(struct sk_buff *rsp, struct netdev_queue_stats_tx *tx) netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_BYTES, tx->hw_gso_bytes) || netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, tx->hw_gso_wire_packets) || netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, tx->hw_gso_wire_bytes) || - netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, tx->hw_drop_ratelimits)) + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, tx->hw_drop_ratelimits) || + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_STOP, tx->stop) || + netdev_stat_put(rsp, NETDEV_A_QSTATS_TX_WAKE, tx->wake)) return -EMSGSIZE; return 0; } diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index cf24f1d9adf8..ccf6976b1693 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -164,7 +164,8 @@ enum { NETDEV_A_QSTATS_TX_HW_GSO_BYTES, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_PACKETS, NETDEV_A_QSTATS_TX_HW_GSO_WIRE_BYTES, - NETDEV_A_QSTATS_TX_HW_DROP_RATELIMITS, + NETDEV_A_QSTATS_TX_STOP, + NETDEV_A_QSTATS_TX_WAKE, __NETDEV_A_QSTATS_MAX, NETDEV_A_QSTATS_MAX = (__NETDEV_A_QSTATS_MAX - 1)