diff mbox series

[net] bnxt: fix crashes when reducing ring count with active RSS contexts

Message ID 20240705020005.681746-1-kuba@kernel.org (mailing list archive)
State Accepted
Commit 0d1b7d6c927431126ece585246e23aa877243360
Delegated to: Netdev Maintainers
Headers show
Series [net] bnxt: fix crashes when reducing ring count with active RSS contexts | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 833 this patch: 833
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/build_clang success Errors and warnings before: 835 this patch: 835
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 839 this patch: 839
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 40 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 2 this patch: 2
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-07-05--06-00 (tests: 695)

Commit Message

Jakub Kicinski July 5, 2024, 2 a.m. UTC
bnxt doesn't check if a ring is used by RSS contexts when reducing
ring count. Core performs a similar check for the drivers for
the main context, but core doesn't know about additional contexts,
so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring
id to index bp->rx_ring[], which without the check may end up
being out of bounds.

  BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
  Read of size 2 at addr ffff8881c5809618 by task ethtool/31525
  Call Trace:
  __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
   bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460
   __bnxt_setup_vnic_p5+0x12e/0x270
   __bnxt_open_nic+0x2262/0x2f30
   bnxt_open_nic+0x5d/0xf0
   ethnl_set_channels+0x5d4/0xb30
   ethnl_default_set_doit+0x2f1/0x620

Core does track the additional contexts in net-next, so we can
move this validation out of the driver as a follow up there.

Fixes: b3d0083caf9a ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
CC: michael.chan@broadcom.com
CC: pavan.chebbi@broadcom.com
CC: kalesh-anakkur.purayil@broadcom.com
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 15 +++++++++++++++
 drivers/net/ethernet/broadcom/bnxt/bnxt.h         |  1 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  6 ++++++
 3 files changed, 22 insertions(+)

Comments

Pavan Chebbi July 5, 2024, 10:15 a.m. UTC | #1
On Fri, Jul 5, 2024 at 7:30 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> bnxt doesn't check if a ring is used by RSS contexts when reducing
> ring count. Core performs a similar check for the drivers for
> the main context, but core doesn't know about additional contexts,
> so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring
> id to index bp->rx_ring[], which without the check may end up
> being out of bounds.
>
>   BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
>   Read of size 2 at addr ffff8881c5809618 by task ethtool/31525
>   Call Trace:
>   __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
>    bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460
>    __bnxt_setup_vnic_p5+0x12e/0x270
>    __bnxt_open_nic+0x2262/0x2f30
>    bnxt_open_nic+0x5d/0xf0
>    ethnl_set_channels+0x5d4/0xb30
>    ethnl_default_set_doit+0x2f1/0x620
>
> Core does track the additional contexts in net-next, so we can
> move this validation out of the driver as a follow up there.
>
> Fixes: b3d0083caf9a ("bnxt_en: Support RSS contexts in ethtool .{get|set}_rxfh()")
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> ---
> CC: michael.chan@broadcom.com
> CC: pavan.chebbi@broadcom.com
> CC: kalesh-anakkur.purayil@broadcom.com
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt.c         | 15 +++++++++++++++
>  drivers/net/ethernet/broadcom/bnxt/bnxt.h         |  1 +
>  drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c |  6 ++++++
>  3 files changed, 22 insertions(+)
>
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> index 220d05e2f6fa..80fce0aaad66 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
> @@ -6282,6 +6282,21 @@ static u16 bnxt_get_max_rss_ring(struct bnxt *bp)
>         return max_ring;
>  }
>
> +u16 bnxt_get_max_rss_ctx_ring(struct bnxt *bp)
> +{
> +       u16 i, tbl_size, max_ring = 0;
> +       struct bnxt_rss_ctx *rss_ctx;
> +
> +       tbl_size = bnxt_get_rxfh_indir_size(bp->dev);
> +
> +       list_for_each_entry(rss_ctx, &bp->rss_ctx_list, list) {
> +               for (i = 0; i < tbl_size; i++)
> +                       max_ring = max(max_ring, rss_ctx->rss_indir_tbl[i]);
> +       }
> +
> +       return max_ring;
> +}
> +
>  int bnxt_get_nr_rss_ctxs(struct bnxt *bp, int rx_rings)
>  {
>         if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) {
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> index e46bd11e52b0..3c8826875ceb 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
> @@ -2814,6 +2814,7 @@ int bnxt_hwrm_vnic_set_tpa(struct bnxt *bp, struct bnxt_vnic_info *vnic,
>  void bnxt_fill_ipv6_mask(__be32 mask[4]);
>  int bnxt_alloc_rss_indir_tbl(struct bnxt *bp, struct bnxt_rss_ctx *rss_ctx);
>  void bnxt_set_dflt_rss_indir_tbl(struct bnxt *bp, struct bnxt_rss_ctx *rss_ctx);
> +u16 bnxt_get_max_rss_ctx_ring(struct bnxt *bp);
>  int bnxt_get_nr_rss_ctxs(struct bnxt *bp, int rx_rings);
>  int bnxt_hwrm_vnic_cfg(struct bnxt *bp, struct bnxt_vnic_info *vnic);
>  int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic,
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> index bf157f6cc042..4d53ec7adc61 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
> @@ -961,6 +961,12 @@ static int bnxt_set_channels(struct net_device *dev,
>                 return rc;
>         }
>
> +       if (req_rx_rings < bp->rx_nr_rings &&
> +           req_rx_rings <= bnxt_get_max_rss_ctx_ring(bp)) {
> +               netdev_warn(dev, "Can't deactivate rings used by RSS contexts\n");
> +               return -EINVAL;
> +       }
> +
>         if (bnxt_get_nr_rss_ctxs(bp, req_rx_rings) !=
>             bnxt_get_nr_rss_ctxs(bp, bp->rx_nr_rings) &&
>             netif_is_rxfh_configured(dev)) {
> --
> 2.45.2
>
Thanks Jakub for the patch. This is much better than my earlier
thought of prioritizing ring change by destroying all the RSS ctxs.
Patch LGTM.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
patchwork-bot+netdevbpf@kernel.org July 9, 2024, 10:40 a.m. UTC | #2
Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Thu,  4 Jul 2024 19:00:05 -0700 you wrote:
> bnxt doesn't check if a ring is used by RSS contexts when reducing
> ring count. Core performs a similar check for the drivers for
> the main context, but core doesn't know about additional contexts,
> so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring
> id to index bp->rx_ring[], which without the check may end up
> being out of bounds.
> 
> [...]

Here is the summary with links:
  - [net] bnxt: fix crashes when reducing ring count with active RSS contexts
    https://git.kernel.org/netdev/net/c/0d1b7d6c9274

You are awesome, thank you!
Breno Leitao Aug. 5, 2024, 11:42 a.m. UTC | #3
Hello,

On Thu, Jul 04, 2024 at 07:00:05PM -0700, Jakub Kicinski wrote:
> bnxt doesn't check if a ring is used by RSS contexts when reducing
> ring count. Core performs a similar check for the drivers for
> the main context, but core doesn't know about additional contexts,
> so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring
> id to index bp->rx_ring, which without the check may end up
> being out of bounds.
> 
>   BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
>   Read of size 2 at addr ffff8881c5809618 by task ethtool/31525
>   Call Trace:
>   __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
>    bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460
>    __bnxt_setup_vnic_p5+0x12e/0x270
>    __bnxt_open_nic+0x2262/0x2f30
>    bnxt_open_nic+0x5d/0xf0
>    ethnl_set_channels+0x5d4/0xb30
>    ethnl_default_set_doit+0x2f1/0x620

I have this patch applied to my tree, and I am still finding a very
similar KASAN report in the last net-next/main tree - commit
3608d6aca5e793958462e6e01a8cdb6c6e8088d0 ("Merge branch 'dsa-en7581'
into main")

Skimmer over the code, In bnxt_fill_hw_rss_tbl(), bp->rss_indir_tbl[i]
returns 8, but, vnic->fw_grp_id size is 8, thus, it tries to access over
the last element (7).

Somehow bp->rss_indir_tbl[i] goes beynd rx_nr_rings.

--breno


	 ==================================================================
	 BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss (drivers/net/ethernet/broadcom/bnxt/bnxt.c:6307 drivers/net/ethernet/broadcom/bnxt/bnxt.c:6347)
	 Read of size 2 at addr ffff88812c518f90 by task (udev-worker)/794

	 Call Trace:
	  <TASK>
	 dump_stack_lvl (lib/dump_stack.c:122)
	 print_report (mm/kasan/report.c:378 mm/kasan/report.c:488)
	 ? __virt_addr_valid (./arch/x86/include/asm/preempt.h:103 ./include/linux/rcupdate.h:953 ./include/linux/mmzone.h:2034 arch/x86/mm/physaddr.c:65)
	 ? __bnxt_hwrm_vnic_set_rss (drivers/net/ethernet/broadcom/bnxt/bnxt.c:6307 drivers/net/ethernet/broadcom/bnxt/bnxt.c:6347)
	 kasan_report (mm/kasan/report.c:603)
	 ? __bnxt_hwrm_vnic_set_rss (drivers/net/ethernet/broadcom/bnxt/bnxt.c:6307 drivers/net/ethernet/broadcom/bnxt/bnxt.c:6347)
	 __bnxt_hwrm_vnic_set_rss (drivers/net/ethernet/broadcom/bnxt/bnxt.c:6307 drivers/net/ethernet/broadcom/bnxt/bnxt.c:6347)
	 ? _raw_spin_unlock_irqrestore (./include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
	 bnxt_hwrm_vnic_set_rss.part.0 (drivers/net/ethernet/broadcom/bnxt/bnxt.c:6379)
	 ? __bnxt_hwrm_vnic_set_rss (drivers/net/ethernet/broadcom/bnxt/bnxt.c:6364)
	 ? __bnxt_setup_vnic (drivers/net/ethernet/broadcom/bnxt/bnxt.c:6624)
	 __bnxt_setup_vnic (drivers/net/ethernet/broadcom/bnxt/bnxt.c:10073)
	 bnxt_init_nic (drivers/net/ethernet/broadcom/bnxt/bnxt.c:10144 drivers/net/ethernet/broadcom/bnxt/bnxt.c:10336 drivers/net/ethernet/broadcom/bnxt/bnxt.c:10432)
	 ? bnxt_alloc_and_setup_vnic (drivers/net/ethernet/broadcom/bnxt/bnxt.c:10425)
	 ? __irq_apply_affinity_hint (kernel/irq/manage.c:471 kernel/irq/manage.c:516)
	 ? irq_set_affinity_locked (kernel/irq/manage.c:507)
	 ? alloc_cpumask_var_node (lib/cpumask.c:62)
	 __bnxt_open_nic (drivers/net/ethernet/broadcom/bnxt/bnxt.c:12103)
	 ? __netdev_update_features (net/core/dev.c:10116)
	 ? bnxt_init_one (drivers/net/ethernet/broadcom/bnxt/bnxt.c:12064)
	 ? __bnxt_close_nic.constprop.0 (drivers/net/ethernet/broadcom/bnxt/bnxt.c:10918 drivers/net/ethernet/broadcom/bnxt/bnxt.c:12323)
	 ? bnxt_set_channels (./arch/x86/include/asm/bitops.h:206 ./arch/x86/include/asm/bitops.h:238 ./include/asm-generic/bitops/instrumented-non-atomic.h:142 ./include/linux/netdevice.h:3588 drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:1003)
	 bnxt_open_nic (drivers/net/ethernet/broadcom/bnxt/bnxt.c:12179)
	 ethtool_set_channels (net/ethtool/ioctl.c:2117)
	 ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4995)
	 ? ethtool_set_settings (net/ethtool/ioctl.c:2065)
	 ? security_capable (security/security.c:1036 (discriminator 13))
	 __dev_ethtool (net/ethtool/ioctl.c:3275)
	 ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:673)
	 ? arch_stack_walk (arch/x86/kernel/stacktrace.c:24)
	 ? ethtool_get_module_info_call (net/ethtool/ioctl.c:3044)
	 ? __lock_acquire (./arch/x86/include/asm/bitops.h:227 ./arch/x86/include/asm/bitops.h:239 ./include/asm-generic/bitops/instrumented-non-atomic.h:142 kernel/locking/lockdep.c:227 kernel/locking/lockdep.c:3780 kernel/locking/lockdep.c:3836 kernel/locking/lockdep.c:5142)
	 ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4995)
	 ? stack_trace_save (kernel/stacktrace.c:123)
	 ? lock_acquire (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5761 kernel/locking/lockdep.c:5724)
	 ? lock_sync (kernel/locking/lockdep.c:5727)
	 ? __kasan_kmalloc (mm/kasan/common.c:391)
	 ? dev_ethtool (net/ethtool/ioctl.c:3351)
	 ? dev_ioctl (net/core/dev_ioctl.c:721)
	 ? sock_ioctl (net/socket.c:1344)
	 ? rcu_is_watching (./include/linux/context_tracking.h:122 kernel/rcu/tree.c:726)
	 ? trace_contention_end (./include/trace/events/lock.h:122 (discriminator 52))
	 ? __mutex_lock (./arch/x86/include/asm/preempt.h:103 kernel/locking/mutex.c:618 kernel/locking/mutex.c:752)
	 ? lock_downgrade (kernel/locking/lockdep.c:5767)
	 ? dev_ethtool (net/ethtool/ioctl.c:3365)
	 ? sock_do_ioctl (net/socket.c:1237)
	 ? mutex_lock_io_nested (kernel/locking/mutex.c:751)
	 ? _raw_spin_unlock_irqrestore (./include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194)
	 ? _raw_spin_unlock_irqrestore (./arch/x86/include/asm/preempt.h:103 ./include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
	 ? rcu_is_watching (./include/linux/context_tracking.h:122 kernel/rcu/tree.c:726)
	 ? trace_kmalloc (./include/trace/events/kmem.h:54 (discriminator 52))
	 ? __kmalloc_cache_noprof (./include/linux/kasan.h:211 mm/slub.c:4189)
	 dev_ethtool (net/ethtool/ioctl.c:3365)
	 ? __dev_ethtool (net/ethtool/ioctl.c:3342)
	 dev_ioctl (net/core/dev_ioctl.c:721)
	 sock_do_ioctl (net/socket.c:1237)
	 ? put_user_ifreq (net/socket.c:1214)
	 ? find_held_lock (kernel/locking/lockdep.c:5249)
	 sock_ioctl (net/socket.c:1344)
	 ? br_ioctl_call (net/socket.c:1250)
	 ? seccomp_notify_ioctl (kernel/seccomp.c:1218)
	 ? ktime_get_coarse_real_ts64 (./include/linux/seqlock.h:74 kernel/time/timekeeping.c:2390)
	 ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4420)
	 __x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:907 fs/ioctl.c:893 fs/ioctl.c:893)
	 do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
	 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
	 RIP: 0033:0x7fab3150357b
	 Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 75 68 0f 00 f7 d8 64 89 01 48
	All code
	========
	   0:   ff                      (bad)
	   1:   ff                      (bad)
	   2:   ff 85 c0 79 9b 49       incl   0x499b79c0(%rbp)
	   8:   c7 c4 ff ff ff ff       mov    $0xffffffff,%esp
	   e:   5b                      pop    %rbx
	   f:   5d                      pop    %rbp
	  10:   4c 89 e0                mov    %r12,%rax
	  13:   41 5c                   pop    %r12
	  15:   c3                      ret
	  16:   66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
	  1d:   00 00
	  1f:   f3 0f 1e fa             endbr64
	  23:   b8 10 00 00 00          mov    $0x10,%eax
	  28:   0f 05                   syscall
	  2a:*  48 3d 01 f0 ff ff       cmp    $0xfffffffffffff001,%rax         <-- trapping instruction
	  30:   73 01                   jae    0x33
	  32:   c3                      ret
	  33:   48 8b 0d 75 68 0f 00    mov    0xf6875(%rip),%rcx        # 0xf68af
	  3a:   f7 d8                   neg    %eax
	  3c:   64 89 01                mov    %eax,%fs:(%rcx)
	  3f:   48                      rex.W

	Code starting with the faulting instruction
	===========================================
	   0:   48 3d 01 f0 ff ff       cmp    $0xfffffffffffff001,%rax
	   6:   73 01                   jae    0x9
	   8:   c3                      ret
	   9:   48 8b 0d 75 68 0f 00    mov    0xf6875(%rip),%rcx        # 0xf6885
	  10:   f7 d8                   neg    %eax
	  12:   64 89 01                mov    %eax,%fs:(%rcx)
	  15:   48                      rex.W
	 RSP: 002b:00007ffe53677a28 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
	 RAX: ffffffffffffffda RBX: 000055b5a1868cd8 RCX: 00007fab3150357b
	 RDX: 00007ffe53677a60 RSI: 0000000000008946 RDI: 000000000000001f
	 RBP: 00007ffe53677ab0 R08: 0000000000000000 R09: 0000000000000000
	 R10: 000055b5a18c9110 R11: 0000000000000246 R12: 000055b5a18c0ca0
	 R13: 000055b5a1866d18 R14: 00007ffe53677a60 R15: 000055b5a18becb0
	  </TASK>

	 Allocated by task 794:
	 kasan_save_stack (mm/kasan/common.c:48)
	 kasan_save_track (./arch/x86/include/asm/current.h:49 mm/kasan/common.c:60 mm/kasan/common.c:69)
	 __kasan_kmalloc (mm/kasan/common.c:391)
	 __kmalloc_noprof (mm/slub.c:4159 mm/slub.c:4170)
	 bnxt_alloc_mem (drivers/net/ethernet/broadcom/bnxt/bnxt.c:4696 drivers/net/ethernet/broadcom/bnxt/bnxt.c:5323)
	 __bnxt_open_nic (drivers/net/ethernet/broadcom/bnxt/bnxt.c:12088)
	 bnxt_open_nic (drivers/net/ethernet/broadcom/bnxt/bnxt.c:12179)
	 ethtool_set_channels (net/ethtool/ioctl.c:2117)
	 __dev_ethtool (net/ethtool/ioctl.c:3275)
	 dev_ethtool (net/ethtool/ioctl.c:3365)
	 dev_ioctl (net/core/dev_ioctl.c:721)
	 sock_do_ioctl (net/socket.c:1237)
	 sock_ioctl (net/socket.c:1344)
	 __x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:907 fs/ioctl.c:893 fs/ioctl.c:893)
	 do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
	 entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

	 The buggy address belongs to the object at ffff88812c518f80
	which belongs to the cache kmalloc-16 of size 16
	 The buggy address is located 0 bytes to the right of
	allocated 16-byte region [ffff88812c518f80, ffff88812c518f90)

	 The buggy address belongs to the physical page:
	 page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x12c518
	 anon flags: 0x5ffff0000000000(node=0|zone=2|lastcpupid=0x1ffff)
	 page_type: 0xfdffffff(slab)
	 raw: 05ffff0000000000 ffff88810004c640 0000000000000000 0000000000000001
	 raw: 0000000000000000 0000000080800080 00000001fdffffff 0000000000000000
	 page dumped because: kasan: bad access detected

	 Memory state around the buggy address:
	  ffff88812c518e80: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
	  ffff88812c518f00: fa fb fc fc fa fb fc fc fa fb fc fc fa fb fc fc
	 >ffff88812c518f80: 00 00 fc fc 00 00 fc fc 00 01 fc fc fa fb fc fc
				  ^
	  ffff88812c519000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	  ffff88812c519080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
	 ==================================================================
Breno Leitao Aug. 5, 2024, 6:04 p.m. UTC | #4
On Mon, Aug 05, 2024 at 04:42:11AM -0700, Breno Leitao wrote:
> Hello,
> 
> On Thu, Jul 04, 2024 at 07:00:05PM -0700, Jakub Kicinski wrote:
> > bnxt doesn't check if a ring is used by RSS contexts when reducing
> > ring count. Core performs a similar check for the drivers for
> > the main context, but core doesn't know about additional contexts,
> > so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring
> > id to index bp->rx_ring, which without the check may end up
> > being out of bounds.
> > 
> >   BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
> >   Read of size 2 at addr ffff8881c5809618 by task ethtool/31525
> >   Call Trace:
> >   __bnxt_hwrm_vnic_set_rss+0xb79/0xe40
> >    bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460
> >    __bnxt_setup_vnic_p5+0x12e/0x270
> >    __bnxt_open_nic+0x2262/0x2f30
> >    bnxt_open_nic+0x5d/0xf0
> >    ethnl_set_channels+0x5d4/0xb30
> >    ethnl_default_set_doit+0x2f1/0x620
> 
> I have this patch applied to my tree, and I am still finding a very
> similar KASAN report in the last net-next/main tree - commit
> 3608d6aca5e793958462e6e01a8cdb6c6e8088d0 ("Merge branch 'dsa-en7581'
> into main")
> 
> Skimmer over the code, In bnxt_fill_hw_rss_tbl(), bp->rss_indir_tbl[i]
> returns 8, but, vnic->fw_grp_id size is 8, thus, it tries to access over
> the last element (7).
> 
> Somehow bp->rss_indir_tbl[i] goes beynd rx_nr_rings.

I was able to debug this further, and the culprit is

98ba1d931f611e8f8f519c040 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()")
Michael Chan Aug. 5, 2024, 6:23 p.m. UTC | #5
On Mon, Aug 5, 2024 at 11:04 AM Breno Leitao <leitao@debian.org> wrote:
>
> On Mon, Aug 05, 2024 at 04:42:11AM -0700, Breno Leitao wrote:
> > Somehow bp->rss_indir_tbl[i] goes beynd rx_nr_rings.
>
> I was able to debug this further, and the culprit is
>
> 98ba1d931f611e8f8f519c040 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()")

Thanks, we'll look into it.  Do you have the steps to reproduce it?
Michael Chan Aug. 5, 2024, 6:36 p.m. UTC | #6
On Mon, Aug 5, 2024 at 11:23 AM Michael Chan <michael.chan@broadcom.com> wrote:
>
> On Mon, Aug 5, 2024 at 11:04 AM Breno Leitao <leitao@debian.org> wrote:
> >
> > On Mon, Aug 05, 2024 at 04:42:11AM -0700, Breno Leitao wrote:
> > > Somehow bp->rss_indir_tbl[i] goes beynd rx_nr_rings.
> >
> > I was able to debug this further, and the culprit is
> >
> > 98ba1d931f611e8f8f519c040 ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()")
>
> Thanks, we'll look into it.  Do you have the steps to reproduce it?

Maybe the FW on your NIC is older and BNXT_NEW_RM() is false.  That
code path may not be correct.  Anyway, we'll look into it.  Thanks
again.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 220d05e2f6fa..80fce0aaad66 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -6282,6 +6282,21 @@  static u16 bnxt_get_max_rss_ring(struct bnxt *bp)
 	return max_ring;
 }
 
+u16 bnxt_get_max_rss_ctx_ring(struct bnxt *bp)
+{
+	u16 i, tbl_size, max_ring = 0;
+	struct bnxt_rss_ctx *rss_ctx;
+
+	tbl_size = bnxt_get_rxfh_indir_size(bp->dev);
+
+	list_for_each_entry(rss_ctx, &bp->rss_ctx_list, list) {
+		for (i = 0; i < tbl_size; i++)
+			max_ring = max(max_ring, rss_ctx->rss_indir_tbl[i]);
+	}
+
+	return max_ring;
+}
+
 int bnxt_get_nr_rss_ctxs(struct bnxt *bp, int rx_rings)
 {
 	if (bp->flags & BNXT_FLAG_CHIP_P5_PLUS) {
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index e46bd11e52b0..3c8826875ceb 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -2814,6 +2814,7 @@  int bnxt_hwrm_vnic_set_tpa(struct bnxt *bp, struct bnxt_vnic_info *vnic,
 void bnxt_fill_ipv6_mask(__be32 mask[4]);
 int bnxt_alloc_rss_indir_tbl(struct bnxt *bp, struct bnxt_rss_ctx *rss_ctx);
 void bnxt_set_dflt_rss_indir_tbl(struct bnxt *bp, struct bnxt_rss_ctx *rss_ctx);
+u16 bnxt_get_max_rss_ctx_ring(struct bnxt *bp);
 int bnxt_get_nr_rss_ctxs(struct bnxt *bp, int rx_rings);
 int bnxt_hwrm_vnic_cfg(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic,
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index bf157f6cc042..4d53ec7adc61 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -961,6 +961,12 @@  static int bnxt_set_channels(struct net_device *dev,
 		return rc;
 	}
 
+	if (req_rx_rings < bp->rx_nr_rings &&
+	    req_rx_rings <= bnxt_get_max_rss_ctx_ring(bp)) {
+		netdev_warn(dev, "Can't deactivate rings used by RSS contexts\n");
+		return -EINVAL;
+	}
+
 	if (bnxt_get_nr_rss_ctxs(bp, req_rx_rings) !=
 	    bnxt_get_nr_rss_ctxs(bp, bp->rx_nr_rings) &&
 	    netif_is_rxfh_configured(dev)) {