mbox series

[net-next,v3,00/10] eth: bnxt: maintain basic pkt/byte counters in SW

Message ID 20250305225215.1567043-1-kuba@kernel.org (mailing list archive)
Headers show
Series eth: bnxt: maintain basic pkt/byte counters in SW | expand

Message

Jakub Kicinski March 5, 2025, 10:52 p.m. UTC
Some workloads want to be able to track bandwidth utilization on
the scale of 10s of msecs. bnxt uses HW stats and async stats
updates, with update frequency controlled via ethtool -C.
Updating all HW stats more often than 100 msec is both hard for
the device and consumes PCIe bandwidth. Switch to maintaining
basic Rx / Tx packet and byte counters in SW.

Tested with drivers/net/stats.py:
  # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0

Manually tested by comparing the ethtool -S stats (which continues
to show HW stats) with qstats, and total interface stats.
With and without HW-GRO, and with XDP on / off.
Stopping and starting the interface also doesn't corrupt the values.

v3:
 - try to include vlan tag and padding length in the stats
v2: https://lore.kernel.org/20250228012534.3460918-1-kuba@kernel.org
 - fix skipping XDP vs the XDP Tx ring handling (Michael)
 - rename the defines as well as the structs (Przemek)
 - fix counding frag'ed packets in XDP Tx
v1: https://lore.kernel.org/20250226211003.2790916-1-kuba@kernel.org

Jakub Kicinski (10):
  eth: bnxt: use napi_consume_skb()
  eth: bnxt: don't run xdp programs on fallback traffic
  eth: bnxt: rename ring_err_stats -> ring_drv_stats
  eth: bnxt: snapshot driver stats
  eth: bnxt: don't use ifdef to check for CONFIG_INET in GRO
  eth: bnxt: consolidate the GRO-but-not-really paths in bnxt_gro_skb()
  eth: bnxt: extract VLAN info early on
  eth: bnxt: maintain rx pkt/byte stats in SW
  eth: bnxt: maintain tx pkt/byte stats in SW
  eth: bnxt: count xdp xmit packets

 drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  49 +++-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |   5 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 272 +++++++++++-------
 .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  20 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |  47 ++-
 5 files changed, 264 insertions(+), 129 deletions(-)

Comments

Taehee Yoo March 6, 2025, 10:54 a.m. UTC | #1
On Thu, Mar 6, 2025 at 7:52 AM Jakub Kicinski <kuba@kernel.org> wrote:
>

Hi Jakub,
Thanks a lot for this work!

> Some workloads want to be able to track bandwidth utilization on
> the scale of 10s of msecs. bnxt uses HW stats and async stats
> updates, with update frequency controlled via ethtool -C.
> Updating all HW stats more often than 100 msec is both hard for
> the device and consumes PCIe bandwidth. Switch to maintaining
> basic Rx / Tx packet and byte counters in SW.
>
> Tested with drivers/net/stats.py:
>   # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0

I found kernel panic while testing stats.py, it occurs when the
interface is down. If the interface is down, cp_ring is not allocated
but bnxt_get_queue_stats_rx() accesses it without null check.

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 163990067 P4D 163990067 PUD 144363067 PMD 0
Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 0 UID: 0 PID: 1654 Comm: python3 Not tainted 6.14.0-rc1+ #6
da0f9ad0522edf8bf0c96e8453594913017a5fc9
Hardware name: ASUS System Product Name/PRIME Z690-P D4, BIOS 0603 11/01/2021
RIP: 0010:bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en]
Code: c6 87 b5 18 00 00 02 eb a2 66 90 90 90 90 90 90 90 90 90 90 90
90 90 90 90 90 90 0f 1f 44 00 00 48 8b 87 48 0b 00 00 48 63 f6 <48> 8b
1
RSP: 0018:ffffa95ac3c2b7e0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: ffffffffc0650710 RCX: 0000000000000000
RDX: ffffa95ac3c2b858 RSI: 0000000000000000 RDI: ffffa25a1c1e8000
RBP: ffffa259e3947100 R08: 0000000000000004 R09: ffffa259e4a6601c
R10: 0000000000000015 R11: ffffa259e4a66000 R12: 0000000000000000
R13: ffffa95ac3c2b8c0 R14: ffffa25a1c1e8000 R15: 0000000000000000
FS:  00007f0b3ccbf080(0000) GS:ffffa260df600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 0000000162d5a000 CR4: 00000000007506f0
PKRU: 55555554
Call Trace:
 <TASK>
 ? __die+0x20/0x70
 ? page_fault_oops+0x15a/0x460
 ? exc_page_fault+0x6e/0x180
 ? asm_exc_page_fault+0x22/0x30
 ? bnxt_get_queue_stats_rx+0xf/0x70 [bnxt_en
3bf73dc1ebebb3ca46ef8948d1fc1a94acbeeba1]
 netdev_nl_stats_by_netdev+0x2b1/0x4e0
 ? xas_load+0x9/0xb0
 ? xas_find+0x183/0x1d0
 ? xa_find+0x8b/0xe0
 netdev_nl_qstats_get_dumpit+0xbf/0x1e0
 genl_dumpit+0x31/0x90
 netlink_dump+0x1a8/0x360

This is not a bug of this series, we can reproduce top of net/net-next
without this series.
Reproduce:
 ip link set $interface down
 ./cli.py --spec netdev.yaml --dump qstats-get
OR
 python ./stats.py

It seems that the driver is supposed to return qstats even if interface
is down. So, I think bnxt driver needs to store the sw_stats when the
interface is down. that may be very similar to the bnxt_get_ring_stats()
and bnxt_get_ring_drv_stats().
What do you think about it?

Thanks a lot!
Taehee Yoo

>
> Manually tested by comparing the ethtool -S stats (which continues
> to show HW stats) with qstats, and total interface stats.
> With and without HW-GRO, and with XDP on / off.
> Stopping and starting the interface also doesn't corrupt the values.
>
> v3:
>  - try to include vlan tag and padding length in the stats
> v2: https://lore.kernel.org/20250228012534.3460918-1-kuba@kernel.org
>  - fix skipping XDP vs the XDP Tx ring handling (Michael)
>  - rename the defines as well as the structs (Przemek)
>  - fix counding frag'ed packets in XDP Tx
> v1: https://lore.kernel.org/20250226211003.2790916-1-kuba@kernel.org
>
> Jakub Kicinski (10):
>   eth: bnxt: use napi_consume_skb()
>   eth: bnxt: don't run xdp programs on fallback traffic
>   eth: bnxt: rename ring_err_stats -> ring_drv_stats
>   eth: bnxt: snapshot driver stats
>   eth: bnxt: don't use ifdef to check for CONFIG_INET in GRO
>   eth: bnxt: consolidate the GRO-but-not-really paths in bnxt_gro_skb()
>   eth: bnxt: extract VLAN info early on
>   eth: bnxt: maintain rx pkt/byte stats in SW
>   eth: bnxt: maintain tx pkt/byte stats in SW
>   eth: bnxt: count xdp xmit packets
>
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h     |  49 +++-
>  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.h |   5 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 272 +++++++++++-------
>  .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  20 +-
>  drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |  47 ++-
>  5 files changed, 264 insertions(+), 129 deletions(-)
>
> --
> 2.48.1
>
>