Message ID | 20250302000901.2729164-1-sdf@fomichev.me (mailing list archive) |
---|---|
Series | net: Hold netdev instance lock during ndo operations |
QE tested this series of patches with virtio-net regression tests,
everything works fine.

Tested-by: Lei Yang <leiyang@redhat.com>

On Sun, Mar 2, 2025 at 8:09 AM Stanislav Fomichev <sdf@fomichev.me> wrote:
>
> As the gradual purging of rtnl continues, start grabbing netdev
> instance lock in more places so we can get to the state where
> most paths are working without rtnl. Start with requiring the
> drivers that use shaper api (and later queue mgmt api) to work
> with both rtnl and netdev instance lock. Eventually we might
> attempt to drop rtnl. This mostly affects iavf, gve, bnxt and
> netdev sim (as the drivers that implement shaper/queue mgmt)
> so those drivers are converted in the process.
>
> [...]

As the gradual purging of rtnl continues, start grabbing netdev
instance lock in more places so we can get to the state where
most paths are working without rtnl. Start with requiring the
drivers that use shaper api (and later queue mgmt api) to work
with both rtnl and netdev instance lock. Eventually we might
attempt to drop rtnl. This mostly affects iavf, gve, bnxt and
netdevsim (as the drivers that implement shaper/queue mgmt)
so those drivers are converted in the process.

call_netdevice_notifiers locking is very inconsistent and might need
a separate follow-up. Some notified events are covered by the
instance lock, some are not, which might complicate the driver
expectations.

Changes since v9:
- rework ndo_setup_tc locking (Saeed)
  - net: hold netdev instance lock during ndo_setup_tc
    - keep only nft parts (hopefully ok to keep Eric's RB)
  - 2 new patches to grab the lock at sch_api netlink level
    - net: sched: wrap doit/dumpit methods
      - general refactoring to make it easier to grab instance lock
    - net: hold netdev instance lock during qdisc ndo_setup_tc
- net: ethtool: try to protect all callback with netdev instance lock
  - remove the lock around get_ts_info

Changes since v8:
- rebase on top of net-next

Changes since v7:
- fix AA deadlock detection in netdev_lock_cmp_fn (Jakub)

Changes since v6:
- rebase on top of net-next

Changes since v5:
- fix comment in bnxt_lock_sp (Michael)
- add netdev_lock/unlock around GVE suspend/resume (Sabrina)
- grab netdev lock around ethtool_ops->reset in cmis_fw_update_reset (Sabrina)

Changes since v4:
- reword documentation about rtnl_lock and instance lock relation
  (Jakub)
- do s/RTNL/rtnl_lock/ in the documentation (Jakub)
- mention dev_xxx/netif_xxx distinction (Paolo)
- add new patch to add request_ops_lock opt-in (Jakub)
- drop patch that adds shaper API to dummy (Jakub)
- drop () around dev in netdev_need_ops_lock

Changes since v3:
- add instance lock to netdev_lockdep_set_classes,
  move lock_set_cmp_fn to happen after set_class (NIPA)

Changes since v2:
- new patch to replace dev_addr_sem with instance lock (forwarding tests)
- CONFIG_LOCKDEP around netdev_lock_cmp_fn (Jakub)
- remove netif_device_present check from dev_setup_tc (bpf_offload.py)
- reorder bpf_devs_locks and instance lock ordering in bpf map
  offload (bpf_offload.py)

Changes since v1:
- fix netdev_set_mtu_ext_locked in the wrong place (lkp@intel.com)
- add missing depend on CONFIG_NET_SHAPER for dummy device
  (lkp@intel.com)
  - not sure we need to apply dummy device patch..
- need_netdev_ops_lock -> netdev_need_ops_lock (Jakub)
- remove netdev_assert_locked near napi_xxx_locked calls (Jakub)
- fix netdev_lock_cmp_fn comment and line length (Jakub)
- fix kdoc style of dev_api.c routines (Jakub)
- reflow dev_setup_tc to avoid indent (Jakub)
- keep tc_can_offload checks outside of dev_setup_tc (Jakub)

Changes since RFC:
- other control paths are protected
- bnxt has been converted to mostly depend on netdev instance lock

Reviewed-by: Eric Dumazet <edumazet@google.com>
Cc: Saeed Mahameed <saeed@kernel.org>
Cc: David Wei <dw@davidwei.uk>

Jakub Kicinski (1):
  net: ethtool: try to protect all callback with netdev instance lock

Stanislav Fomichev (13):
  net: hold netdev instance lock during ndo_open/ndo_stop
  net: hold netdev instance lock during nft ndo_setup_tc
  net: sched: wrap doit/dumpit methods
  net: hold netdev instance lock during qdisc ndo_setup_tc
  net: hold netdev instance lock during queue operations
  net: hold netdev instance lock during rtnetlink operations
  net: hold netdev instance lock during ioctl operations
  net: hold netdev instance lock during sysfs operations
  net: hold netdev instance lock during ndo_bpf
  net: replace dev_addr_sem with netdev instance lock
  net: add option to request netdev instance lock
  docs: net: document new locking reality
  eth: bnxt: remove most dependencies on RTNL

 Documentation/networking/netdevices.rst       |  65 +++-
 drivers/net/bonding/bond_main.c               |  16 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt.c     | 133 ++++----
 .../net/ethernet/broadcom/bnxt/bnxt_devlink.c |   9 +
 .../net/ethernet/broadcom/bnxt/bnxt_sriov.c   |   6 +
 drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c |  16 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_vfr.c |  18 +-
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c |   3 +-
 drivers/net/ethernet/google/gve/gve_main.c    |  12 +-
 drivers/net/ethernet/google/gve/gve_utils.c   |   6 +-
 drivers/net/ethernet/intel/iavf/iavf_main.c   |  16 +-
 drivers/net/netdevsim/ethtool.c               |   2 -
 drivers/net/netdevsim/netdev.c                |  39 ++-
 drivers/net/tap.c                             |   2 +-
 drivers/net/tun.c                             |   2 +-
 include/linux/netdevice.h                     |  90 ++++-
 kernel/bpf/offload.c                          |   6 +-
 net/8021q/vlan_dev.c                          |   4 +-
 net/core/Makefile                             |   2 +-
 net/core/dev.c                                | 284 ++++++----------
 net/core/dev.h                                |  22 +-
 net/core/dev_api.c                            | 318 ++++++++++++++++++
 net/core/dev_ioctl.c                          |  69 ++--
 net/core/net-sysfs.c                          |   9 +-
 net/core/netdev_rx_queue.c                    |   5 +
 net/core/rtnetlink.c                          |  50 ++-
 net/dsa/conduit.c                             |  16 +-
 net/ethtool/cabletest.c                       |  20 +-
 net/ethtool/cmis_fw_update.c                  |   7 +-
 net/ethtool/features.c                        |   6 +-
 net/ethtool/ioctl.c                           |   6 +
 net/ethtool/module.c                          |   8 +-
 net/ethtool/netlink.c                         |  12 +
 net/ethtool/phy.c                             |  20 +-
 net/ethtool/rss.c                             |   2 +
 net/ethtool/tsinfo.c                          |   9 +-
 net/netfilter/nf_flow_table_offload.c         |   2 +-
 net/netfilter/nf_tables_offload.c             |   2 +-
 net/sched/sch_api.c                           | 214 ++++++++----
 net/xdp/xsk.c                                 |   3 +
 net/xdp/xsk_buff_pool.c                       |   2 +
 41 files changed, 1045 insertions(+), 488 deletions(-)
 create mode 100644 net/core/dev_api.c
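
For readers who want a feel for the scheme the cover letter describes
(ndo callbacks running under rtnl and, for drivers that opt in, also
under the per-device instance lock), here is a rough userspace sketch.
It is not kernel code: every model_* and demo_* name is hypothetical,
pthread mutexes stand in for rtnl and the instance lock, the lock_held
bookkeeping exists only so the callback can observe what is held, and
the opt-in check is reduced to a single flag named after the series'
request_ops_lock option. Taking rtnl first and the instance lock second
is an assumption of this sketch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace model only; none of these are the kernel definitions. */
struct model_netdev;

struct model_netdev_ops {
	int (*ndo_open)(struct model_netdev *dev);
};

struct model_netdev {
	pthread_mutex_t lock;       /* stands in for the instance lock */
	bool lock_held;             /* bookkeeping for the demo callback */
	bool request_ops_lock;      /* driver opted in to the instance lock */
	bool up;
	bool saw_rtnl_in_open;      /* recorded by the demo callback */
	bool saw_lock_in_open;
	const struct model_netdev_ops *ops;
};

/* Stands in for rtnl; one global lock for all devices. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;
static bool model_rtnl_held;

/* Simplified opt-in check: only the request flag is modelled here. */
static bool model_need_ops_lock(const struct model_netdev *dev)
{
	return dev->request_ops_lock;
}

/* Take the instance lock only for devices that asked for it. */
static void model_lock_ops(struct model_netdev *dev)
{
	if (model_need_ops_lock(dev)) {
		pthread_mutex_lock(&dev->lock);
		dev->lock_held = true;
	}
}

static void model_unlock_ops(struct model_netdev *dev)
{
	if (model_need_ops_lock(dev)) {
		dev->lock_held = false;
		pthread_mutex_unlock(&dev->lock);
	}
}

/* Core-side wrapper: rtnl, then instance lock, then the driver callback. */
static int model_dev_open(struct model_netdev *dev)
{
	int err;

	pthread_mutex_lock(&model_rtnl);
	model_rtnl_held = true;
	model_lock_ops(dev);

	err = dev->ops->ndo_open(dev);
	if (!err)
		dev->up = true;

	model_unlock_ops(dev);
	model_rtnl_held = false;
	pthread_mutex_unlock(&model_rtnl);
	return err;
}

/* Demo driver callback: records which locks were held when it ran. */
static int demo_ndo_open(struct model_netdev *dev)
{
	dev->saw_rtnl_in_open = model_rtnl_held;
	dev->saw_lock_in_open = dev->lock_held;
	return 0;
}

static const struct model_netdev_ops demo_ops = {
	.ndo_open = demo_ndo_open,
};

static void model_netdev_init(struct model_netdev *dev, bool request_ops_lock)
{
	pthread_mutex_init(&dev->lock, NULL);
	dev->lock_held = false;
	dev->request_ops_lock = request_ops_lock;
	dev->up = false;
	dev->saw_rtnl_in_open = false;
	dev->saw_lock_in_open = false;
	dev->ops = &demo_ops;
}
```

Calling model_dev_open() on a device initialised with
request_ops_lock = true runs the callback with both locks held; a
device that did not opt in sees only the rtnl stand-in. Keeping the
conditional locking in the core-side wrapper rather than in the driver
callback mirrors the idea that a converted driver can rely on the lock
already being held when its callback runs.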