diff mbox series

[net-next,v4,11/12] docs: net: document new locking reality

Message ID 20250218020948.160643-12-sdf@fomichev.me (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series net: Hold netdev instance lock during ndo operations | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next, async
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 40 this patch: 40
netdev/build_tools success Errors and warnings before: 26 (+1) this patch: 26 (+1)
netdev/cc_maintainers warning 4 maintainers not CCed: linux-doc@vger.kernel.org andrew+netdev@lunn.ch horms@kernel.org corbet@lwn.net
netdev/build_clang success Errors and warnings before: 541 this patch: 541
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 4116 this patch: 4116
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 115 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 16 this patch: 16
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2025-02-18--12-00 (tests: 891)

Commit Message

Stanislav Fomichev Feb. 18, 2025, 2:09 a.m. UTC
Also clarify ndo_get_stats (that reads and write paths can run
concurrently) and mention only RCU.

Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
---
 Documentation/networking/netdevices.rst | 57 +++++++++++++++++++------
 include/linux/netdevice.h               |  4 ++
 2 files changed, 48 insertions(+), 13 deletions(-)

Comments

Jakub Kicinski Feb. 19, 2025, 2:53 a.m. UTC | #1
On Mon, 17 Feb 2025 18:09:47 -0800 Stanislav Fomichev wrote:
> +RTNL and netdev instance lock
> +=============================
> +
> +Historically, all networking control operations were protected by a single
> +global lock known as RTNL. There is an ongoing effort to replace this global

I think RTNL stands for RouTeNetLink. RTNL -> rtnl_lock here?

> +lock with separate locks for each network namespace. The netdev instance lock
> +represents another step towards making the locking mechanism more granular.

Reads a bit like the per-netns and instance locks are related.
Maybe rephrase as:

  lock with separate locks for each network namespace. Additionally, properties
  of individual netdev are increasingly protected by per-netdev locks.

> +For device drivers that implement shaping or queue management APIs, all control
> +operations will be performed under the netdev instance lock. Currently, this
> +instance lock is acquired within the context of RTNL. In the future, there will
> +be an option for individual drivers to opt out of using RTNL and instead
> +perform their control operations directly under the netdev instance lock.
> +
> +Devices drivers are encouraged to rely on the instance lock where possible.
Stanislav Fomichev Feb. 19, 2025, 4:52 a.m. UTC | #2
On 02/18, Jakub Kicinski wrote:
> On Mon, 17 Feb 2025 18:09:47 -0800 Stanislav Fomichev wrote:
> > +RTNL and netdev instance lock
> > +=============================
> > +
> > +Historically, all networking control operations were protected by a single
> > +global lock known as RTNL. There is an ongoing effort to replace this global
> 
> I think RTNL stands for RouTeNetLink. RTNL -> rtnl_lock here?

SG. Will do s/RTNL/rtnl_lock/ in a bunch of other (new) places.

> > +lock with separate locks for each network namespace. The netdev instance lock
> > +represents another step towards making the locking mechanism more granular.
> 
> Reads a bit like the per-netns and instance locks are related.
> Maybe rephrase as:
> 
>   lock with separate locks for each network namespace. Additionally, properties
>   of individual netdev are increasingly protected by per-netdev locks.

Sure.
 
> > +For device drivers that implement shaping or queue management APIs, all control
> > +operations will be performed under the netdev instance lock. Currently, this
> > +instance lock is acquired within the context of RTNL. In the future, there will
> > +be an option for individual drivers to opt out of using RTNL and instead
> > +perform their control operations directly under the netdev instance lock.
> > +
> > +Devices drivers are encouraged to rely on the instance lock where possible.
Paolo Abeni Feb. 19, 2025, 8:37 a.m. UTC | #3
On 2/18/25 3:09 AM, Stanislav Fomichev wrote:
[...]
> +RTNL and netdev instance lock
> +=============================
> +
> +Historically, all networking control operations were protected by a single
> +global lock known as RTNL. There is an ongoing effort to replace this global
> +lock with separate locks for each network namespace. The netdev instance lock
> +represents another step towards making the locking mechanism more granular.
> +
> +For device drivers that implement shaping or queue management APIs, all control
> +operations will be performed under the netdev instance lock. Currently, this
> +instance lock is acquired within the context of RTNL. In the future, there will
> +be an option for individual drivers to opt out of using RTNL and instead
> +perform their control operations directly under the netdev instance lock.
> +
> +Devices drivers are encouraged to rely on the instance lock where possible.

Possibly worth mentioning explicitly the netif_* <> dev_* helpers
relationship?

/P
Stanislav Fomichev Feb. 19, 2025, 3:43 p.m. UTC | #4
On 02/19, Paolo Abeni wrote:
> On 2/18/25 3:09 AM, Stanislav Fomichev wrote:
> [...]
> > +RTNL and netdev instance lock
> > +=============================
> > +
> > +Historically, all networking control operations were protected by a single
> > +global lock known as RTNL. There is an ongoing effort to replace this global
> > +lock with separate locks for each network namespace. The netdev instance lock
> > +represents another step towards making the locking mechanism more granular.
> > +
> > +For device drivers that implement shaping or queue management APIs, all control
> > +operations will be performed under the netdev instance lock. Currently, this
> > +instance lock is acquired within the context of RTNL. In the future, there will
> > +be an option for individual drivers to opt out of using RTNL and instead
> > +perform their control operations directly under the netdev instance lock.
> > +
> > +Devices drivers are encouraged to rely on the instance lock where possible.
> 
> Possibly worth mentioning explicitly the netif_* <> dev_* helpers
> relationship?

Sure, let me try to add a sentence about that as well..
diff mbox series

Patch

diff --git a/Documentation/networking/netdevices.rst b/Documentation/networking/netdevices.rst
index 1d37038e9fbe..0d2ab558f86a 100644
--- a/Documentation/networking/netdevices.rst
+++ b/Documentation/networking/netdevices.rst
@@ -210,49 +210,55 @@  packets is preferred.
 struct net_device synchronization rules
 =======================================
 ndo_open:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 
 ndo_stop:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 	Note: netif_running() is guaranteed false
 
 ndo_do_ioctl:
 	Synchronization: rtnl_lock() semaphore.
-	Context: process
 
-        This is only called by network subsystems internally,
-        not by user space calling ioctl as it was in before
-        linux-5.14.
+	This is only called by network subsystems internally,
+	not by user space calling ioctl as it was in before
+	linux-5.14.
 
 ndo_siocbond:
-        Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
         Context: process
 
-        Used by the bonding driver for the SIOCBOND family of
-        ioctl commands.
+	Used by the bonding driver for the SIOCBOND family of
+	ioctl commands.
 
 ndo_siocwandev:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 
 	Used by the drivers/net/wan framework to handle
 	the SIOCWANDEV ioctl with the if_settings structure.
 
 ndo_siocdevprivate:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 
 	This is used to implement SIOCDEVPRIVATE ioctl helpers.
 	These should not be added to new drivers, so don't use.
 
 ndo_eth_ioctl:
-	Synchronization: rtnl_lock() semaphore.
+	Synchronization: rtnl_lock() semaphore. In addition, netdev instance
+	lock if the driver implements queue management or shaper API.
 	Context: process
 
 ndo_get_stats:
-	Synchronization: rtnl_lock() semaphore, or RCU.
+	Synchronization: RCU (can be called concurrently with the stats
+	update path).
 	Context: atomic (can't sleep under RCU)
 
 ndo_start_xmit:
@@ -284,6 +290,10 @@  struct net_device synchronization rules
 	Synchronization: netif_addr_lock spinlock.
 	Context: BHs disabled
 
+Most ndo callbacks not specified in the list above are running
+under RTNL. In addition, netdev instance lock is taken as well if
+the driver implements queue management or shaper API.
+
 struct napi_struct synchronization rules
 ========================================
 napi->poll:
@@ -298,6 +308,27 @@  struct napi_struct synchronization rules
 		 softirq
 		 will be called with interrupts disabled by netconsole.
 
+struct netdev_queue_mgmt_ops synchronization rules
+==================================================
+
+All queue management ndo callbacks are holding netdev instance lock.
+
+RTNL and netdev instance lock
+=============================
+
+Historically, all networking control operations were protected by a single
+global lock known as RTNL. There is an ongoing effort to replace this global
+lock with separate locks for each network namespace. The netdev instance lock
+represents another step towards making the locking mechanism more granular.
+
+For device drivers that implement shaping or queue management APIs, all control
+operations will be performed under the netdev instance lock. Currently, this
+instance lock is acquired within the context of RTNL. In the future, there will
+be an option for individual drivers to opt out of using RTNL and instead
+perform their control operations directly under the netdev instance lock.
+
+Devices drivers are encouraged to rely on the instance lock where possible.
+
 NETDEV_INTERNAL symbol namespace
 ================================
 
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index f3b1b6df775f..983c8e9e767f 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2470,6 +2470,10 @@  struct net_device {
 	 *
 	 * Also protects some fields in struct napi_struct.
 	 *
+	 * For the drivers that implement shaper or queue API, the scope
+	 * of this lock is expanded to cover most ndo/queue/ethtool/sysfs
+	 * operations.
+	 *
 	 * Ordering: take after rtnl_lock.
 	 */
 	struct mutex		lock;