[net-next,v10,00/14] net: Hold netdev instance lock during ndo operations

Message ID 20250302000901.2729164-1-sdf@fomichev.me (mailing list archive)

Message

Stanislav Fomichev March 2, 2025, 12:08 a.m. UTC
As the gradual purging of rtnl continues, start grabbing the netdev
instance lock in more places so we can get to the state where
most paths work without rtnl. Start by requiring the drivers
that use the shaper API (and later the queue mgmt API) to work
with both rtnl and the netdev instance lock. Eventually we might
attempt to drop rtnl. This mostly affects iavf, gve, bnxt and
netdevsim (as the drivers that implement shaper/queue mgmt),
so those drivers are converted in the process.
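
To make the intended nesting concrete, here is a rough userspace model of
"rtnl first, then the per-device instance lock, but only for drivers that
need it". It is a sketch, not kernel code: every toy_* name, the struct
layout and the three opt-in flags are made up for illustration, and the
"lock" only tracks ownership instead of blocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a mutex: tracks ownership only, never blocks. */
struct toy_lock {
	bool held;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	assert(!l->held);	/* a real mutex would self-deadlock here */
	l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

struct toy_netdev;

struct toy_netdev_ops {
	int (*ndo_open)(struct toy_netdev *dev);
};

/* Hypothetical device: the three flags model the opt-in conditions. */
struct toy_netdev {
	const struct toy_netdev_ops *ops;
	struct toy_lock instance_lock;
	bool has_shaper_ops;
	bool has_queue_mgmt_ops;
	bool request_ops_lock;
	bool up;
};

/* Stand-in for the global rtnl lock. */
static struct toy_lock toy_rtnl;

/* Only devices that opted in get the instance lock around their ops. */
static bool toy_need_ops_lock(const struct toy_netdev *dev)
{
	return dev->has_shaper_ops || dev->has_queue_mgmt_ops ||
	       dev->request_ops_lock;
}

static void toy_lock_ops(struct toy_netdev *dev)
{
	if (toy_need_ops_lock(dev))
		toy_lock_acquire(&dev->instance_lock);
}

static void toy_unlock_ops(struct toy_netdev *dev)
{
	if (toy_need_ops_lock(dev))
		toy_lock_release(&dev->instance_lock);
}

/* Core-side wrapper: take rtnl, then the instance lock, then call out. */
static int toy_dev_open(struct toy_netdev *dev)
{
	int err;

	toy_lock_acquire(&toy_rtnl);
	toy_lock_ops(dev);
	err = dev->ops->ndo_open(dev);
	toy_unlock_ops(dev);
	toy_lock_release(&toy_rtnl);
	return err;
}

/* Driver callback that checks the locking it was promised. */
static int toy_driver_open(struct toy_netdev *dev)
{
	if (!toy_rtnl.held)
		return -1;
	if (toy_need_ops_lock(dev) && !dev->instance_lock.held)
		return -2;
	dev->up = true;
	return 0;
}

static const struct toy_netdev_ops toy_ops = {
	.ndo_open = toy_driver_open,
};
```

Calling toy_dev_open() on a device with any of the three flags set runs
the callback with both locks held; on a device with none set, only the
rtnl stand-in is held. Both locks are released on return.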

call_netdevice_notifiers locking is very inconsistent and might need
a separate follow-up. Some notified events are covered by the
instance lock and some are not, which might complicate driver
expectations.

Changes since v9:
- rework ndo_setup_tc locking (Saeed)
  - net: hold netdev instance lock during ndo_setup_tc
    - keep only nft parts (hopefully ok to keep Eric's RB)
  - 2 new patches to grab the lock at sch_api netlink level
    - net: sched: wrap doit/dumpit methods
      - general refactoring to make it easier to grab instance lock
    - net: hold netdev instance lock during qdisc ndo_setup_tc
  - net: ethtool: try to protect all callback with netdev instance lock
    - remove the lock around get_ts_info

Changes since v8:
- rebase on top of net-next

Changes since v7:
- fix AA deadlock detection in netdev_lock_cmp_fn (Jakub)
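
For readers unfamiliar with the AA case above: the idea is that a lock
comparison callback can tell the checker "any two different instance locks
are fine to nest" while still letting "the same lock taken twice" be
flagged. The sketch below is a userspace illustration only. The toy_*
names are hypothetical, and the convention that a negative return means
"acceptable ordering" is an assumption of this model rather than a
statement about the exact lockdep contract.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handle identifying one lock instance. */
struct toy_lock_map {
	const char *name;
};

/*
 * Assumed convention for this model: a negative result means the
 * ordering is acceptable, anything else gets reported.
 */
static int toy_netdev_lock_cmp(const struct toy_lock_map *a,
			       const struct toy_lock_map *b)
{
	if (a == b)
		return 0;	/* same instance twice: AA, must be flagged */
	return -1;		/* distinct instances: no ordering enforced */
}

/* Returns 1 if taking 'next' while holding 'held' would be reported. */
static int toy_would_report(const struct toy_lock_map *held,
			    const struct toy_lock_map *next)
{
	return toy_netdev_lock_cmp(held, next) >= 0;
}
```

A comparison function that returned -1 unconditionally would silence the
same-instance case as well; the pointer equality check keeps it visible.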

Changes since v6:
- rebase on top of net-next

Changes since v5:
- fix comment in bnxt_lock_sp (Michael)
- add netdev_lock/unlock around GVE suspend/resume (Sabrina)
- grab netdev lock around ethtool_ops->reset in cmis_fw_update_reset (Sabrina)

Changes since v4:
- reword documentation about rtnl_lock and instance lock relation
  (Jakub)
- do s/RTNL/rtnl_lock/ in the documentation (Jakub)
- mention dev_xxx/netif_xxx distinction (Paolo)
- add new patch to add request_ops_lock opt-in (Jakub)
- drop patch that adds shaper API to dummy (Jakub)
- drop () around dev in netdev_need_ops_lock

Changes since v3:
- add instance lock to netdev_lockdep_set_classes,
  move lock_set_cmp_fn to happen after set_class (NIPA)

Changes since v2:
- new patch to replace dev_addr_sem with instance lock (forwarding tests)
- CONFIG_LOCKDEP around netdev_lock_cmp_fn (Jakub)
- remove netif_device_present check from dev_setup_tc (bpf_offload.py)
- reorder bpf_devs_locks and instance lock ordering in bpf map
  offload (bpf_offload.py)

Changes since v1:
- fix netdev_set_mtu_ext_locked in the wrong place (lkp@intel.com)
- add missing depend on CONFIG_NET_SHAPER for dummy device
  (lkp@intel.com)
  - not sure we need to apply dummy device patch..
- need_netdev_ops_lock -> netdev_need_ops_lock (Jakub)
- remove netdev_assert_locked near napi_xxx_locked calls (Jakub)
- fix netdev_lock_cmp_fn comment and line length (Jakub)
- fix kdoc style of dev_api.c routines (Jakub)
- reflow dev_setup_tc to avoid indent (Jakub)
- keep tc_can_offload checks outside of dev_setup_tc (Jakub)

Changes since RFC:
- other control paths are protected
- bnxt has been converted to mostly depend on netdev instance lock

Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeed@kernel.org>
Cc: David Wei <dw@davidwei.uk>

Jakub Kicinski (1):
  net: ethtool: try to protect all callback with netdev instance lock

Stanislav Fomichev (13):
  net: hold netdev instance lock during ndo_open/ndo_stop
  net: hold netdev instance lock during nft ndo_setup_tc
  net: sched: wrap doit/dumpit methods
  net: hold netdev instance lock during qdisc ndo_setup_tc
  net: hold netdev instance lock during queue operations
  net: hold netdev instance lock during rtnetlink operations
  net: hold netdev instance lock during ioctl operations
  net: hold netdev instance lock during sysfs operations
  net: hold netdev instance lock during ndo_bpf
  net: replace dev_addr_sem with netdev instance lock
  net: add option to request netdev instance lock
  docs: net: document new locking reality
  eth: bnxt: remove most dependencies on RTNL

 Documentation/networking/netdevices.rst       |  65 +++-
 drivers/net/bonding/bond_main.c               |  16 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 133 ++++----
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |   9 +
 .../net/ethernet/broadcom/bnxt/bnxt_sriov.c   |   6 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c |  16 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c |  18 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |   3 +-
 drivers/net/ethernet/google/gve/gve_main.c    |  12 +-
 drivers/net/ethernet/google/gve/gve_utils.c   |   6 +-
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  16 +-
 drivers/net/netdevsim/ethtool.c               |   2 -
 drivers/net/netdevsim/netdev.c                |  39 ++-
 drivers/net/tap.c                             |   2 +-
 drivers/net/tun.c                             |   2 +-
 include/linux/netdevice.h                     |  90 ++++-
 kernel/bpf/offload.c                          |   6 +-
 net/8021q/vlan_dev.c                          |   4 +-
 net/core/Makefile                             |   2 +-
 net/core/dev.c                                | 284 ++++++----------
 net/core/dev.h                                |  22 +-
 net/core/dev_api.c                            | 318 ++++++++++++++++++
 net/core/dev_ioctl.c                          |  69 ++--
 net/core/net-sysfs.c                          |   9 +-
 net/core/netdev_rx_queue.c                    |   5 +
 net/core/rtnetlink.c                          |  50 ++-
 net/dsa/conduit.c                             |  16 +-
 net/ethtool/cabletest.c                       |  20 +-
 net/ethtool/cmis_fw_update.c                  |   7 +-
 net/ethtool/features.c                        |   6 +-
 net/ethtool/ioctl.c                           |   6 +
 net/ethtool/module.c                          |   8 +-
 net/ethtool/netlink.c                         |  12 +
 net/ethtool/phy.c                             |  20 +-
 net/ethtool/rss.c                             |   2 +
 net/ethtool/tsinfo.c                          |   9 +-
 net/netfilter/nf_flow_table_offload.c         |   2 +-
 net/netfilter/nf_tables_offload.c             |   2 +-
 net/sched/sch_api.c                           | 214 ++++++++----
 net/xdp/xsk.c                                 |   3 +
 net/xdp/xsk_buff_pool.c                       |   2 +
 41 files changed, 1045 insertions(+), 488 deletions(-)
 create mode 100644 net/core/dev_api.c

Comments

Lei Yang March 3, 2025, 4:22 a.m. UTC | #1
QE tested this series of patches with virtio-net regression tests,
everything works fine.

Tested-by: Lei Yang <leiyang@redhat.com>
