From patchwork Thu Dec 12 11:56:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 13905098 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B165212F84 for ; Thu, 12 Dec 2024 11:56:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004577; cv=none; b=sseGvOj0DVclsi+tmYb4Rqps1FMZIYtqmrDpIAIMldrsz6eZsTy44ypFGhAeRHSbVFYC0AvOU2uJfBdyDGLC4IBojyMpQcXaCfymGv0sFw2/M7I3HETf2vGmya+iqV5hOclLH3GfkyC4q5EH5eSetQeruqscp7nvypf8H0Wy+lk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004577; c=relaxed/simple; bh=fU88zLcq6ecPFqxhMBQ+2G9JGuR1hEfsB9CdeCQqDmY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Ln5DLGKKcU21lb0AYYbliqxksoFSjx5sijhbtOhdDlRvP/Htq4ZxhjSS7DUKuwu0mr8p7U/Mu3XsQUXeB5NT67C3YB8dMmj4VyoY6oVdol7um5LY3BISPLAg59gKgk1+v21rjqEe1rqyKQeu3EDdvTR6AD71F+QMWF8i7o20hgg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=l4KiaQzm; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="l4KiaQzm" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 36E01C4CED0; Thu, 12 Dec 2024 11:56:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734004576; bh=fU88zLcq6ecPFqxhMBQ+2G9JGuR1hEfsB9CdeCQqDmY=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=l4KiaQzmKUsYHc4h/FleA3LUDUmBIDPrenQAy14MNEc2KqxQt/mwArQCi1ztB0OJW 
XQkM/DO+fvAX8VjIqR3VhZk33sj0P436sU1UZPm0CPdZB+Y/u9fAsoSxeIJxor0YEp Zbm1phv6crPceXcpd030KZJK/5wW/hkWi9jdQaUM8Glg8vpH/zlLE6Hy3J7Uz86D9i ADd+agxpAWnAV8/fnazHJ/LbXRb4JBdfZ/RLYyt1DEZqam85eM0Wfe1i+DED+fGrE9 GZLNC5b2VInweNgpXWOTqGFJo5a5uv7zsB9l/7i3xfTlseVLB7zg8LMw6hyjWGzK7A ae3ex28RmfSvg== From: Christian Brauner Date: Thu, 12 Dec 2024 12:56:00 +0100 Subject: [PATCH v2 1/8] mount: remove include/nospec.h include Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241212-work-mount-rbtree-lockless-v2-1-4fe6cef02534@kernel.org> References: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> In-Reply-To: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> To: Josef Bacik , Jeff Layton Cc: "Paul E. McKenney" , Peter Zijlstra , linux-fsdevel@vger.kernel.org, Christian Brauner X-Mailer: b4 0.15-dev-355e8 X-Developer-Signature: v=1; a=openpgp-sha256; l=545; i=brauner@kernel.org; h=from:subject:message-id; bh=fU88zLcq6ecPFqxhMBQ+2G9JGuR1hEfsB9CdeCQqDmY=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaRHnY+Zfe+Zw754Px5NRuY8lT0Ht6/U9/8x+0dWW7DR9 3lz+K04OkpZGMS4GGTFFFkc2k3C5ZbzVGw2ytSAmcPKBDKEgYtTACbi/JLhN8s/pZMWV/7EfvXf U/VIMqZA/fyX5I319zbLqZuKPcs7cJ6R4SXvo8tTHs3ayGiar6dppHrZ3La3SdVpuYtzpd1zo+v /eQA= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 The include is not needed, so remove it. 
Signed-off-by: Christian Brauner --- fs/namespace.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/namespace.c b/fs/namespace.c index 23e81c2a1e3fee7d97df2a84a69438a677933654..c3dbe6a7ab6b1c77c2693cc75941da89fa921048 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -32,7 +32,6 @@ #include #include #include -#include #include "pnode.h" #include "internal.h" From patchwork Thu Dec 12 11:56:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 13905099 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E405214A92 for ; Thu, 12 Dec 2024 11:56:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004580; cv=none; b=ogwnTxAHkwDs/Euc874d/lsl/qzzjhP2xnJ5h8Qszw/SR8rzfmelurzaba6Q6Npv+wlgk0fuPXflYA35u00HXrIsxQChvPz84eyascuYFcZxehm4FijQDCFZ4wcPeLQMzVw55ZVW1hRQL62D6hFcHXVJtmosqGQp8tTa9d4Cj/w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004580; c=relaxed/simple; bh=jkMKbLRTLoaucwliP/7pN/v0DVUIV8X0yAaVu+aF01E=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=jP2OrBr2gC8BvSoskCkuYYejIx5m0YD0PhGNPY7r820m7c9VBNUJHHXI5RKgvhhJ1szJbaJtwS5hY9uKcx20yQep0l0bzqmDMuhPhr3HMAmaYCDEEGQ2dE6x3Po10GpTILYJCz/RBUqh/c54GMZJcHeXb+DIF27OOBRLRTBVeJ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=plsGg2KJ; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="plsGg2KJ" Received: by smtp.kernel.org 
(Postfix) with ESMTPSA id 4A5FBC4CED7; Thu, 12 Dec 2024 11:56:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734004578; bh=jkMKbLRTLoaucwliP/7pN/v0DVUIV8X0yAaVu+aF01E=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=plsGg2KJn0FgNeyWFJBFMBsTJjyaEJzczZkA5MTV+nV7UL63HYSVygNUUpx3L7Mko M63fSK3PlmzXij8mVE7eZtL0k7MavY4wFoBmBzo87WRfih7hAnQSJLQwwy7t6TKd6A Iih7dzHBnV3AlfNvwSLTfyryEWaXoOOoyFO9mJFpqbA03gnHCY8OBCsb8ePO24caXM vRnbKC7Q1/4WvEtR0lTC9TnCSpjcXyWrtzGYi/4StUwVzPN9v/+p125FlfzQ1pBuWp LjieTgHV4xUhmewuC70IfyajHI4ITeJdL/QRwO7zwypnKuIynELX8TGCjomKoc9UYp +tDFVjP+//sWw== From: Christian Brauner Date: Thu, 12 Dec 2024 12:56:01 +0100 Subject: [PATCH v2 2/8] fs: add mount namespace to rbtree late Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241212-work-mount-rbtree-lockless-v2-2-4fe6cef02534@kernel.org> References: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> In-Reply-To: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> To: Josef Bacik , Jeff Layton Cc: "Paul E. McKenney" , Peter Zijlstra , linux-fsdevel@vger.kernel.org, Christian Brauner X-Mailer: b4 0.15-dev-355e8 X-Developer-Signature: v=1; a=openpgp-sha256; l=964; i=brauner@kernel.org; h=from:subject:message-id; bh=jkMKbLRTLoaucwliP/7pN/v0DVUIV8X0yAaVu+aF01E=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaRHnY/ZJ+KzmHHpXuMlNq/Dz7QcTY5Wm+SVwim1ck5M5 czHJxZadJSyMIhxMciKKbI4tJuEyy3nqdhslKkBM4eVCWQIAxenAExE6jXDX/Hueb+Y1gpfr8ia ZcJ+J8H92aG+7IR/7G+rOERCVnmwKTL84ZcSPCD+653Fcge5Ixrn4+KLJr/7dFPR51pI6N+WRZd OMQIA X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 There's no point in adding the mount namespace to the rbtree while holding the namespace semaphore. Doing so only gives the false impression that the semaphore protects the mount namespace rbtree, which it doesn't. 
Signed-off-by: Christian Brauner --- fs/namespace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/namespace.c b/fs/namespace.c index c3dbe6a7ab6b1c77c2693cc75941da89fa921048..10fa18dd66018fadfdc9d18c59a851eed7bd55ad 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -3983,7 +3983,6 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns, while (p->mnt.mnt_root != q->mnt.mnt_root) p = next_mnt(skip_mnt_tree(p), old); } - mnt_ns_tree_add(new_ns); namespace_unlock(); if (rootmnt) @@ -3991,6 +3990,7 @@ struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns, if (pwdmnt) mntput(pwdmnt); + mnt_ns_tree_add(new_ns); return new_ns; } From patchwork Thu Dec 12 11:56:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 13905100 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F11F921505C for ; Thu, 12 Dec 2024 11:56:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004581; cv=none; b=HO+bYdg+O+xq7dGu9xF0AY7IcGsCtxnKFR8tEQBZQlkgcAErXFsK6cCtLDjnt1wKK/eoCNS6V+liKsr4RIfGL9jv5EBDNQG0kamcOXj6H8ZBsZ403WFZ3ph0JLRsRi9O7ulx25cZwSAjmbSlS2CextFYbJXmQoaGwL+C/SpUir4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004581; c=relaxed/simple; bh=9uZqIQyo8NI+Xi741dduNxbfHpc/apSEbGpd2RtJsJI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=R+UjCq1kTD0l/pPT3EJJXauHUbY5MsIZdKeR9eYEy8G91EBZ9B1Su4Bm3sidErPV82cyLpVgyflYWcHleMDzeQenZTrYxDV21dVF8Qq4HgdSlzM2u9EA7F0EnP643Nz1LaX+FhZD6cRbyJ8RKOfmVLSSugTDQbUI1JjE9tZY1UE= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=pWRH0tDV; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="pWRH0tDV" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4B1D9C4CED0; Thu, 12 Dec 2024 11:56:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734004580; bh=9uZqIQyo8NI+Xi741dduNxbfHpc/apSEbGpd2RtJsJI=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=pWRH0tDVJjNtwkwmZ2ZQxaGAR4kd9ZisQa90QP/GN01thzzLrReQQ1IkS3hIR20e+ 6Tw8NoowdpPDgbxXo2HchiUZKcNtNEtoha07qWfB5JPDVaYhGmgniWuMyWTLqBUnrw Rh3QGP8vij6hS7e/oNCNEdzUINmyy4KgnZYjynCi0IVgFOMnmWlieQnBZtYFyL0lfP BncAeP39A4vt41NjcFYG5JDGtD4tXTanvwuO58GcZcqpQvuZz+OWY+5qSfpHn66IRa PLV1Q28UCVanymNJBcdbWZmNMJkKaAFVl21hiH6WJgpNmRU/kVi2ffNTiYJq+eTZlw lZ2YBhmwUAkyQ== From: Christian Brauner Date: Thu, 12 Dec 2024 12:56:02 +0100 Subject: [PATCH v2 3/8] fs: lockless mntns rbtree lookup Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241212-work-mount-rbtree-lockless-v2-3-4fe6cef02534@kernel.org> References: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> In-Reply-To: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> To: Josef Bacik , Jeff Layton Cc: "Paul E. 
McKenney" , Peter Zijlstra , linux-fsdevel@vger.kernel.org, Christian Brauner X-Mailer: b4 0.15-dev-355e8 X-Developer-Signature: v=1; a=openpgp-sha256; l=6771; i=brauner@kernel.org; h=from:subject:message-id; bh=9uZqIQyo8NI+Xi741dduNxbfHpc/apSEbGpd2RtJsJI=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaRHnY85x+uo+SzkhCzjquzUHw+PeAt9L8t8a9L8bvmTV 5MSxBp9O0pZGMS4GGTFFFkc2k3C5ZbzVGw2ytSAmcPKBDKEgYtTACZif4OR4VHDp8/inroOc2Xk n3IveddZt/KeTeG5+A+fT0T/11/ZeZSR4fCEhy5+cv/fvrtfxRi+6nNL8PUN7bITc/WMY/ffmTa fmRUA X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 Currently we use a read-write lock, but for the simple search case we can make this lockless. Creating a new mount namespace is a rather rare event compared with querying mounts in a foreign mount namespace. Once this is picked up by, e.g., systemd to list mounts in another mount namespace for its isolated services or in containers, this will be used a lot, so this seems worthwhile doing. Signed-off-by: Christian Brauner --- fs/mount.h | 5 ++- fs/namespace.c | 119 +++++++++++++++++++++++++++++++++++---------------------- 2 files changed, 77 insertions(+), 47 deletions(-) diff --git a/fs/mount.h b/fs/mount.h index 185fc56afc13338f8185fe818051444d540cbd5b..3c3763d8ae821d6a117c528808dbc94d0251f964 100644 --- a/fs/mount.h +++ b/fs/mount.h @@ -16,7 +16,10 @@ struct mnt_namespace { u64 event; unsigned int nr_mounts; /* # of mounts in the namespace */ unsigned int pending_mounts; - struct rb_node mnt_ns_tree_node; /* node in the mnt_ns_tree */ + union { + struct rb_node mnt_ns_tree_node; /* node in the mnt_ns_tree */ + struct rcu_head mnt_ns_rcu; + }; refcount_t passive; /* number references not pinning @mounts */ } __randomize_layout; diff --git a/fs/namespace.c b/fs/namespace.c index 10fa18dd66018fadfdc9d18c59a851eed7bd55ad..9463b9ab95f0a5db32cfe5fc5564d7f25ce3e06f 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -79,6 +79,8 @@ static DECLARE_RWSEM(namespace_sem); static HLIST_HEAD(unmounted); /* 
protected by namespace_sem */ static LIST_HEAD(ex_mountpoints); /* protected by namespace_sem */ static DEFINE_RWLOCK(mnt_ns_tree_lock); +static seqcount_rwlock_t mnt_ns_tree_seqcount = SEQCNT_RWLOCK_ZERO(mnt_ns_tree_seqcount, &mnt_ns_tree_lock); + static struct rb_root mnt_ns_tree = RB_ROOT; /* protected by mnt_ns_tree_lock */ struct mount_kattr { @@ -105,17 +107,6 @@ EXPORT_SYMBOL_GPL(fs_kobj); */ __cacheline_aligned_in_smp DEFINE_SEQLOCK(mount_lock); -static int mnt_ns_cmp(u64 seq, const struct mnt_namespace *ns) -{ - u64 seq_b = ns->seq; - - if (seq < seq_b) - return -1; - if (seq > seq_b) - return 1; - return 0; -} - static inline struct mnt_namespace *node_to_mnt_ns(const struct rb_node *node) { if (!node) @@ -123,19 +114,41 @@ static inline struct mnt_namespace *node_to_mnt_ns(const struct rb_node *node) return rb_entry(node, struct mnt_namespace, mnt_ns_tree_node); } -static bool mnt_ns_less(struct rb_node *a, const struct rb_node *b) +static int mnt_ns_cmp(struct rb_node *a, const struct rb_node *b) { struct mnt_namespace *ns_a = node_to_mnt_ns(a); struct mnt_namespace *ns_b = node_to_mnt_ns(b); u64 seq_a = ns_a->seq; + u64 seq_b = ns_b->seq; + + if (seq_a < seq_b) + return -1; + if (seq_a > seq_b) + return 1; + return 0; +} - return mnt_ns_cmp(seq_a, ns_b) < 0; +static inline void mnt_ns_tree_write_lock(void) +{ + write_lock(&mnt_ns_tree_lock); + write_seqcount_begin(&mnt_ns_tree_seqcount); +} + +static inline void mnt_ns_tree_write_unlock(void) +{ + write_seqcount_end(&mnt_ns_tree_seqcount); + write_unlock(&mnt_ns_tree_lock); } static void mnt_ns_tree_add(struct mnt_namespace *ns) { - guard(write_lock)(&mnt_ns_tree_lock); - rb_add(&ns->mnt_ns_tree_node, &mnt_ns_tree, mnt_ns_less); + struct rb_node *node; + + mnt_ns_tree_write_lock(); + node = rb_find_add_rcu(&ns->mnt_ns_tree_node, &mnt_ns_tree, mnt_ns_cmp); + mnt_ns_tree_write_unlock(); + + WARN_ON_ONCE(node); } static void mnt_ns_release(struct mnt_namespace *ns) @@ -150,41 +163,36 @@ static void 
mnt_ns_release(struct mnt_namespace *ns) } DEFINE_FREE(mnt_ns_release, struct mnt_namespace *, if (_T) mnt_ns_release(_T)) +static void mnt_ns_release_rcu(struct rcu_head *rcu) +{ + struct mnt_namespace *mnt_ns; + + mnt_ns = container_of(rcu, struct mnt_namespace, mnt_ns_rcu); + mnt_ns_release(mnt_ns); +} + static void mnt_ns_tree_remove(struct mnt_namespace *ns) { /* remove from global mount namespace list */ if (!is_anon_ns(ns)) { - guard(write_lock)(&mnt_ns_tree_lock); + mnt_ns_tree_write_lock(); rb_erase(&ns->mnt_ns_tree_node, &mnt_ns_tree); + mnt_ns_tree_write_unlock(); } - mnt_ns_release(ns); + call_rcu(&ns->mnt_ns_rcu, mnt_ns_release_rcu); } -/* - * Returns the mount namespace which either has the specified id, or has the - * next smallest id afer the specified one. - */ -static struct mnt_namespace *mnt_ns_find_id_at(u64 mnt_ns_id) +static int mnt_ns_find(const void *key, const struct rb_node *node) { - struct rb_node *node = mnt_ns_tree.rb_node; - struct mnt_namespace *ret = NULL; - - lockdep_assert_held(&mnt_ns_tree_lock); - - while (node) { - struct mnt_namespace *n = node_to_mnt_ns(node); + const u64 mnt_ns_id = *(u64 *)key; + const struct mnt_namespace *ns = node_to_mnt_ns(node); - if (mnt_ns_id <= n->seq) { - ret = node_to_mnt_ns(node); - if (mnt_ns_id == n->seq) - break; - node = node->rb_left; - } else { - node = node->rb_right; - } - } - return ret; + if (mnt_ns_id < ns->seq) + return -1; + if (mnt_ns_id > ns->seq) + return 1; + return 0; } /* @@ -194,18 +202,37 @@ static struct mnt_namespace *mnt_ns_find_id_at(u64 mnt_ns_id) * namespace the @namespace_sem must first be acquired. If the namespace has * already shut down before acquiring @namespace_sem, {list,stat}mount() will * see that the mount rbtree of the namespace is empty. + * + * Note the lookup is lockless protected by a sequence counter. We only + * need to guard against false negatives as false positives aren't + * possible. 
So if we didn't find a mount namespace and the sequence + * counter has changed we need to retry. If the sequence counter is + * still the same we know the search actually failed. */ static struct mnt_namespace *lookup_mnt_ns(u64 mnt_ns_id) { - struct mnt_namespace *ns; + struct mnt_namespace *ns; + struct rb_node *node; + unsigned int seq; + + guard(rcu)(); + do { + seq = read_seqcount_begin(&mnt_ns_tree_seqcount); + node = rb_find_rcu(&mnt_ns_id, &mnt_ns_tree, mnt_ns_find); + if (node) + break; + } while (read_seqcount_retry(&mnt_ns_tree_seqcount, seq)); - guard(read_lock)(&mnt_ns_tree_lock); - ns = mnt_ns_find_id_at(mnt_ns_id); - if (!ns || ns->seq != mnt_ns_id) - return NULL; + if (!node) + return NULL; - refcount_inc(&ns->passive); - return ns; + /* + * The last reference count is put after an RCU delay so we + * don't need to use refcount_inc_not_zero(). + */ + ns = node_to_mnt_ns(node); + refcount_inc(&ns->passive); + return ns; } static inline void lock_mount_hash(void) From patchwork Thu Dec 12 11:56:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 13905101 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7F47D2153D5 for ; Thu, 12 Dec 2024 11:56:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004583; cv=none; b=TKkDbDBJKnrKqoymv2SYLCe9hyVGQ1sJaSZa6IDKBUPsjSb/p6Cg4EsDbznrMYMzH8uaMjZnjU56whO+IX5bQepmLLwWJP97ipcWKX+Frj3b0dZzw7JM17xUpxmTe6v/KstlpF/flkNqTcsHj0u8BhubdDEievjCTTLtOlQkPEA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004583; c=relaxed/simple; 
bh=cmLS3GlKdZz8MW8KJcFOeOz0qdAgcy+OM7/l7RvuS3g=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Vy+NtxbIZj3nhkJ+y1bhQTAlEif2/srqhtBP6G02e0UFWVoa7JaJT+Tq9gqO7UZ/mwbhKgsWWbmMpoM/k7Q+eGoseO/gDWugy4Q2KZE2qvqjCupA0Ke4PXAIuMF8nq10Uh0sRwziShPPlYGX9hFAubJq28mZLpvAC63es8xDplA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=XF2zBCVj; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="XF2zBCVj" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4A92AC4CEDD; Thu, 12 Dec 2024 11:56:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734004582; bh=cmLS3GlKdZz8MW8KJcFOeOz0qdAgcy+OM7/l7RvuS3g=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=XF2zBCVj5lchKRgDwUmwOvJTwp59nX8C07qD9DEGXjBtDc5L+EzivCTvSI6sGNwkp c3aG5SadNLRwI/Iy/BwFs8QiHs5gra7G5QQWLeUC1TT6LyoasX9Qdc7cyA+T45eWM8 6xkzPt1f0gQqEDreiqM9cXKoNBQPnD241eGLpyjpGLbbiJkYBM0HT3cxEJ6DlAE1qb ahkJigWBXxtd1VZy4UuN7t1cNRKuq+iqR9vBcZkuTtwmlLIMRhRdvzz53H2MoF5jDA KZ+xisLABZGrK+V/JkI/X+1bFGPJKY6qsg5dl4++FZIGpi9nYki94aSAACCKX+Wvzh g8qBX6Uaj/+NQ== From: Christian Brauner Date: Thu, 12 Dec 2024 12:56:03 +0100 Subject: [PATCH v2 4/8] rculist: add list_bidir_{del,prev}_rcu() Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241212-work-mount-rbtree-lockless-v2-4-4fe6cef02534@kernel.org> References: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> In-Reply-To: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> To: Josef Bacik , Jeff Layton Cc: "Paul E. 
McKenney" , Peter Zijlstra , linux-fsdevel@vger.kernel.org, Christian Brauner X-Mailer: b4 0.15-dev-355e8 X-Developer-Signature: v=1; a=openpgp-sha256; l=3022; i=brauner@kernel.org; h=from:subject:message-id; bh=cmLS3GlKdZz8MW8KJcFOeOz0qdAgcy+OM7/l7RvuS3g=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaRHnY/x3nPl0klftqOnFhkdSlz7fqLy08dl7UKMnh367 yQ/1BuxdpSyMIhxMciKKbI4tJuEyy3nqdhslKkBM4eVCWQIAxenAExEnp3hf8CiL7c1fD6LaN1O nsdhLJukHXtuyvGjQqvT+lYu1QhzkGRk2PsuLN30omBFSVe/x4kg54kP7/89uzVRJsvea+37sFB +PgA= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 Currently there is no primitive for retrieving the previous list member. To do this we need a new deletion primitive that doesn't poison the prev pointer and a corresponding retrieval helper. Note that it is not valid to use both list_del_rcu() and list_bidir_del_rcu() on the same list. Suggested-by: "Paul E. McKenney" Signed-off-by: Christian Brauner --- include/linux/rculist.h | 43 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/include/linux/rculist.h b/include/linux/rculist.h index 14dfa6008467e803d57f98cfa0275569f1c6a181..c81f9e5a789928ae6825c89325396d638b3e48c5 100644 --- a/include/linux/rculist.h +++ b/include/linux/rculist.h @@ -30,6 +30,14 @@ static inline void INIT_LIST_HEAD_RCU(struct list_head *list) * way, we must not access it directly */ #define list_next_rcu(list) (*((struct list_head __rcu **)(&(list)->next))) +/* + * Return the ->prev pointer of a list_head in an rcu safe way. Don't + * access it directly. + * + * In order to use list_bidir_prev_rcu() deletions must only be done via + * list_bidir_del_rcu() to avoid poisoning the ->prev pointer. 
+ */ +#define list_bidir_prev_rcu(list) (*((struct list_head __rcu **)(&(list)->prev))) /** * list_tail_rcu - returns the prev pointer of the head of the list @@ -158,6 +166,41 @@ static inline void list_del_rcu(struct list_head *entry) entry->prev = LIST_POISON2; } +/** + * list_bidir_del_rcu - deletes entry from list without re-initialization + * @entry: the element to delete from the list. + * + * In contrast to list_del_rcu(), this doesn't poison the previous pointer, thus + * allowing backwards traversal via list_bidir_prev_rcu(). + * + * Note: list_empty() on entry does not return true after this, + * the entry is in an undefined state. It is useful for RCU based + * lockfree traversal. + * + * In particular, it means that we can not poison the forward + * pointers that may still be used for walking the list. + * + * The caller must take whatever precautions are necessary + * (such as holding appropriate locks) to avoid racing + * with another list-mutation primitive, such as list_bidir_del_rcu() + * or list_add_rcu(), running on this same list. + * However, it is perfectly legal to run concurrently with + * the _rcu list-traversal primitives, such as + * list_for_each_entry_rcu(). + * + * Note that it is not allowed to use list_del_rcu() and + * list_bidir_del_rcu() on the same list. + * + * Note that the caller is not permitted to immediately free + * the newly deleted entry. Instead, either synchronize_rcu() + * or call_rcu() must be used to defer freeing until an RCU + * grace period has elapsed. + */ +static inline void list_bidir_del_rcu(struct list_head *entry) +{ + __list_del_entry(entry); +} + /** * hlist_del_init_rcu - deletes entry from hash list with re-initialization * @n: the element to delete from the hash list. 
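The difference between the two deletion primitives in the patch above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel implementation: the demo_* helpers and DEMO_POISON2 are hypothetical stand-ins for list_add_tail_rcu(), list_del_rcu(), list_bidir_del_rcu() and LIST_POISON2, and the sketch is single-threaded, so it omits the RCU publication and memory-ordering parts entirely. It only shows which pointers of a removed entry stay usable for a reader that is still positioned on it.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for the kernel's LIST_POISON2 marker value. */
#define DEMO_POISON2 ((struct list_head *)0x122)

static void demo_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

static void demo_list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Unlink the entry from its neighbours; its own pointers stay untouched. */
static void demo_unlink(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Modelled on list_del_rcu(): unlink, then poison only ->prev. */
static void demo_list_del(struct list_head *entry)
{
	demo_unlink(entry);
	entry->prev = DEMO_POISON2;
}

/* Modelled on list_bidir_del_rcu(): unlink and keep both pointers intact. */
static void demo_list_bidir_del(struct list_head *entry)
{
	demo_unlink(entry);
}

/*
 * Build head <-> a <-> b <-> c, remove b, and report whether a reader
 * still positioned on b could step backwards to a.
 */
static int demo_can_walk_back_after_delete(int bidir)
{
	struct list_head head, a, b, c;

	demo_list_init(&head);
	demo_list_add_tail(&a, &head);
	demo_list_add_tail(&b, &head);
	demo_list_add_tail(&c, &head);

	if (bidir)
		demo_list_bidir_del(&b);
	else
		demo_list_del(&b);

	/* Either way the remaining list is head <-> a <-> c. */
	assert(a.next == &c && c.prev == &a);
	/* Both variants keep the forward pointer of the removed entry. */
	assert(b.next == &c);

	return b.prev == &a;
}
```

Calling demo_can_walk_back_after_delete(1) exercises the bidirectional variant and demo_can_walk_back_after_delete(0) the poisoning one. Mixing the two on one list would leave some removed entries with a poisoned ->prev, which matches the restriction stated in the commit message.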
From patchwork Thu Dec 12 11:56:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 13905102 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4147A212B09 for ; Thu, 12 Dec 2024 11:56:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004585; cv=none; b=o9wgz638CwEKKvoNNUmj2KjbB4Y9OElhTd9ENngjgHIvhPwSUDbtdWxRORVgY1KqupFxKT6S3TA5r28t3SLJ4ThRgwh1zNlSJlF+KM6ghVHKL9aGMchQo8jBq8cp6Oa0LXJEAKDz61i35fzEiFK74KRDl/KRnKqwrZHYt2/z1QY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004585; c=relaxed/simple; bh=xmuZyV5Vglw81sFyX91aDi/KlVW2DZtC+gI8wbR8FEg=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=maMWDSsTVFIZNBtQmh1w0ZwJB9aSRG/NwTOGb/Qa4uNM9f5k1anq3ulp5Whq0H9OqBDjv3T2IPJ1CZCgN5QHFvAQnVTd/rctiioIVB79Xi62nzD/DiE8mc9sodSNHwy30YjlQOTSD7QgHMvKEBykEWW3ltl1Lgn1aPSOmtvQqI4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=BUU1RRCs; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="BUU1RRCs" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7E3A2C4CED0; Thu, 12 Dec 2024 11:56:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734004584; bh=xmuZyV5Vglw81sFyX91aDi/KlVW2DZtC+gI8wbR8FEg=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=BUU1RRCsP5YvXICxDmKkLFs68YMwy9ulTI9xK1y+TR51tvjBP0IPl9usTua8uyNkx 
bv1Qu7UqT2/UEPSBdJZ4F3u1c5vi8EobnbNYESNBENb5PE29PCPjFNdHZUQ3nK1fzL 9AThSl/d19E9d6Q8ili0Bk0zl24ASgdNQIuHtyFKGmVzx7TyUgjOuC8BkEvQBCEyYh LZDDl3k/SSZGlO+cGopFgIenI6RXS+wnB8t5yfe7DygJcNQjKcn5HD6H/dm55IT8JQ 3p4yABWf36NQiQjBOYfwf5qMa3KZLnt/hrvRyGUmeXrGuhjp1nZxv3CDWlsNxM3MJE gnmOtkGiWMxYA== From: Christian Brauner Date: Thu, 12 Dec 2024 12:56:04 +0100 Subject: [PATCH v2 5/8] fs: lockless mntns lookup for nsfs Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241212-work-mount-rbtree-lockless-v2-5-4fe6cef02534@kernel.org> References: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> In-Reply-To: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> To: Josef Bacik , Jeff Layton Cc: "Paul E. McKenney" , Peter Zijlstra , linux-fsdevel@vger.kernel.org, Christian Brauner X-Mailer: b4 0.15-dev-355e8 X-Developer-Signature: v=1; a=openpgp-sha256; l=5823; i=brauner@kernel.org; h=from:subject:message-id; bh=xmuZyV5Vglw81sFyX91aDi/KlVW2DZtC+gI8wbR8FEg=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaRHnY+5XnV4s3BM4q6fc7Sn8nxhfzjhcEVcZXSAwQ2fl zrGnf8fdZSyMIhxMciKKbI4tJuEyy3nqdhslKkBM4eVCWQIAxenAEyEXY3hv88M1luWl9sS5oTZ Np5cy6zM/O1q9gurc8xvMgQvhU1XE2Bk6LzyedrFi/eLdl2zPPhkGcfzs9dLTn3nuP5n7pTtEwR ZO5kA X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 We already made the rbtree lookup lockless for the simple lookup case. However, walking the list of mount namespaces via nsfs still takes the read lock, pointlessly blocking concurrent additions of new mount namespaces. Such additions are rare anyway, so allow lockless lookup of the previous and next mount namespace by keeping a separate list. This also makes some of the code simpler. 
Suggested-by: Peter Zijlstra Signed-off-by: Christian Brauner --- fs/mount.h | 17 +++++++---------- fs/namespace.c | 34 +++++++++++++++++++++------------- fs/nsfs.c | 5 +---- 3 files changed, 29 insertions(+), 27 deletions(-) diff --git a/fs/mount.h b/fs/mount.h index 3c3763d8ae821d6a117c528808dbc94d0251f964..b7edb4034c2131b758f953cefbf47d060e27e03a 100644 --- a/fs/mount.h +++ b/fs/mount.h @@ -17,7 +17,10 @@ struct mnt_namespace { unsigned int nr_mounts; /* # of mounts in the namespace */ unsigned int pending_mounts; union { - struct rb_node mnt_ns_tree_node; /* node in the mnt_ns_tree */ + struct { + struct list_head mnt_ns_list; + struct rb_node mnt_ns_tree_node; /* node in the mnt_ns_tree */ + }; struct rcu_head mnt_ns_rcu; }; refcount_t passive; /* number references not pinning @mounts */ @@ -157,15 +160,9 @@ static inline void move_from_ns(struct mount *mnt, struct list_head *dt_list) } bool has_locked_children(struct mount *mnt, struct dentry *dentry); -struct mnt_namespace *__lookup_next_mnt_ns(struct mnt_namespace *mnt_ns, bool previous); -static inline struct mnt_namespace *lookup_next_mnt_ns(struct mnt_namespace *mntns) -{ - return __lookup_next_mnt_ns(mntns, false); -} -static inline struct mnt_namespace *lookup_prev_mnt_ns(struct mnt_namespace *mntns) -{ - return __lookup_next_mnt_ns(mntns, true); -} +struct mnt_namespace *get_sequential_mnt_ns(struct mnt_namespace *mnt_ns, + bool previous); + static inline struct mnt_namespace *to_mnt_ns(struct ns_common *ns) { return container_of(ns, struct mnt_namespace, ns); diff --git a/fs/namespace.c b/fs/namespace.c index 9463b9ab95f0a5db32cfe5fc5564d7f25ce3e06f..a5e1b166be9430d47c295159292cb9028b2e2339 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -82,6 +82,7 @@ static DEFINE_RWLOCK(mnt_ns_tree_lock); static seqcount_rwlock_t mnt_ns_tree_seqcount = SEQCNT_RWLOCK_ZERO(mnt_ns_tree_seqcount, &mnt_ns_tree_lock); static struct rb_root mnt_ns_tree = RB_ROOT; /* protected by mnt_ns_tree_lock */ +static 
LIST_HEAD(mnt_ns_list); /* protected by mnt_ns_tree_lock */ struct mount_kattr { unsigned int attr_set; @@ -146,6 +147,7 @@ static void mnt_ns_tree_add(struct mnt_namespace *ns) mnt_ns_tree_write_lock(); node = rb_find_add_rcu(&ns->mnt_ns_tree_node, &mnt_ns_tree, mnt_ns_cmp); + list_add_tail_rcu(&ns->mnt_ns_list, &mnt_ns_list); mnt_ns_tree_write_unlock(); WARN_ON_ONCE(node); @@ -177,6 +179,7 @@ static void mnt_ns_tree_remove(struct mnt_namespace *ns) if (!is_anon_ns(ns)) { mnt_ns_tree_write_lock(); rb_erase(&ns->mnt_ns_tree_node, &mnt_ns_tree); + list_bidir_del_rcu(&ns->mnt_ns_list); mnt_ns_tree_write_unlock(); } @@ -2091,30 +2094,34 @@ struct ns_common *from_mnt_ns(struct mnt_namespace *mnt) return &mnt->ns; } -struct mnt_namespace *__lookup_next_mnt_ns(struct mnt_namespace *mntns, bool previous) +struct mnt_namespace *get_sequential_mnt_ns(struct mnt_namespace *mntns, bool previous) { - guard(read_lock)(&mnt_ns_tree_lock); - for (;;) { - struct rb_node *node; + struct list_head *list; + + guard(rcu)(); + for (;;) { if (previous) - node = rb_prev(&mntns->mnt_ns_tree_node); + list = rcu_dereference(list_bidir_prev_rcu(&mntns->mnt_ns_list)); else - node = rb_next(&mntns->mnt_ns_tree_node); - if (!node) + list = rcu_dereference(list_next_rcu(&mntns->mnt_ns_list)); + if (list_is_head(list, &mnt_ns_list)) return ERR_PTR(-ENOENT); - mntns = node_to_mnt_ns(node); - node = &mntns->mnt_ns_tree_node; + mntns = list_entry_rcu(list, struct mnt_namespace, mnt_ns_list); + /* + * The last passive reference count is put with RCU + * delay so accessing the mount namespace is not just + * safe, its members are all still valid. + */ if (!ns_capable_noaudit(mntns->user_ns, CAP_SYS_ADMIN)) continue; /* - * Holding mnt_ns_tree_lock prevents the mount namespace from - * being freed but it may well be on it's deathbed. We want an - * active reference, not just a passive one here as we're - * persisting the mount namespace. 
+ * We need an active reference count as we're persisting + * the mount namespace and it might already be on its + * deathbed. */ if (!refcount_inc_not_zero(&mntns->ns.count)) continue; @@ -3931,6 +3938,7 @@ static struct mnt_namespace *alloc_mnt_ns(struct user_namespace *user_ns, bool a refcount_set(&new_ns->ns.count, 1); refcount_set(&new_ns->passive, 1); new_ns->mounts = RB_ROOT; + INIT_LIST_HEAD(&new_ns->mnt_ns_list); RB_CLEAR_NODE(&new_ns->mnt_ns_tree_node); init_waitqueue_head(&new_ns->poll); new_ns->user_ns = get_user_ns(user_ns); diff --git a/fs/nsfs.c b/fs/nsfs.c index c675fc40ce2dc674f0dafce5c4924b910a73a23f..663f8656158d52d391ba80ef1d320197d3d654e0 100644 --- a/fs/nsfs.c +++ b/fs/nsfs.c @@ -274,10 +274,7 @@ static long ns_ioctl(struct file *filp, unsigned int ioctl, if (usize < MNT_NS_INFO_SIZE_VER0) return -EINVAL; - if (previous) - mnt_ns = lookup_prev_mnt_ns(to_mnt_ns(ns)); - else - mnt_ns = lookup_next_mnt_ns(to_mnt_ns(ns)); + mnt_ns = get_sequential_mnt_ns(to_mnt_ns(ns), previous); if (IS_ERR(mnt_ns)) return PTR_ERR(mnt_ns); From patchwork Thu Dec 12 11:56:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 13905103 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4DEB32153DA for ; Thu, 12 Dec 2024 11:56:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004588; cv=none; b=f5XnfUjohz5aYgzs9lKDTKQL8SL3MpAQwOB6ncUNVgT8zxkOmqOTG0bhLqkWbevPJJWRr/SOY0tuCFfQz8wGmEtySNYjGvpu3OatcvApSnWFH6x/Gq9bs9h7ybikDb6U8ZZNw7+07ZLXaoVfa6kQqmeqP5GB2d58Q7pN1VUHUGo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1734004588; c=relaxed/simple; bh=qV0Sv+xlrb/01s5FZpgdBuA+ON4p0+xW/V1y6x3Bcvc=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=de+xuqV+jtNjTgwc8m2fSl2LyykXRbHDW31XXYVruUTdKVlUlTkAFjJ2+Oy0SdSnuOhY6paO9FhBX9HsxCtj/BjTAhhyPTdpMjB7EX70M4eN3LuevQ/qZ1GIAPB9rsoCllkJH/OhfJoraQGMN1Mf3cdO1Td8MvErX5IWREN05pY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=THknun5g; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="THknun5g" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 59F86C4CECE; Thu, 12 Dec 2024 11:56:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734004586; bh=qV0Sv+xlrb/01s5FZpgdBuA+ON4p0+xW/V1y6x3Bcvc=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=THknun5gYnQ7mfLbBcdQAplRuuXRCnG+3JTHZfvrslEpehCT3U9iDdGkgQ2BCBOIu lAISlSf1Pe0+FpVs4F/w5lLsVWMcjtlbixdDC0nO08mIz1MJdb4gXuLDWNRAcJm/rx 5a+OdWGVMoSEFgxbEYrILenKFHImlSb+clVFPT7SCri9VTtwescIuR6N/YO1oUna2O KDFVvfqJAr5TlZ4I/Opq0RZqlOrFSvSK2+mbOygAa88M+lpHfVE0LECCLx48j17ZDk 9WfGnBP4t8TfPzQ+5jo45Q9y0u223Ht5/oF8FnD4/sfSS2ZWC87YbSdb2dk8+psytL Ut2t2P7feXOXA== From: Christian Brauner Date: Thu, 12 Dec 2024 12:56:05 +0100 Subject: [PATCH v2 6/8] fs: simplify rwlock to spinlock Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241212-work-mount-rbtree-lockless-v2-6-4fe6cef02534@kernel.org> References: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> In-Reply-To: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> To: Josef Bacik , Jeff Layton Cc: "Paul E. 
McKenney" , Peter Ziljstra , linux-fsdevel@vger.kernel.org, Christian Brauner X-Mailer: b4 0.15-dev-355e8 X-Developer-Signature: v=1; a=openpgp-sha256; l=1665; i=brauner@kernel.org; h=from:subject:message-id; bh=qV0Sv+xlrb/01s5FZpgdBuA+ON4p0+xW/V1y6x3Bcvc=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaRHnY/h3zTt/frU82bfOh/vszwm8K9OTCHl+mLNr/9qW Kao/xfd21HKwiDGxSArpsji0G4SLrecp2KzUaYGzBxWJpAhDFycAjCR22sYGW7kJeW3bdiflnF0 X/nz7Pduvfv4V3ytKZE7bCVseW3dk0CG//58H4UZRdUS82M9ancpBHYtmb+l+MR3fqFfyuvtfpw y5QQA X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 We're not taking the read_lock() anymore now that all lookup is lockless. Just use a simple spinlock. Signed-off-by: Christian Brauner --- fs/namespace.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/fs/namespace.c b/fs/namespace.c index a5e1b166be9430d47c295159292cb9028b2e2339..984ec416ca7618260a38b86a16b66dcdba6e62ed 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -78,8 +78,8 @@ static struct kmem_cache *mnt_cache __ro_after_init; static DECLARE_RWSEM(namespace_sem); static HLIST_HEAD(unmounted); /* protected by namespace_sem */ static LIST_HEAD(ex_mountpoints); /* protected by namespace_sem */ -static DEFINE_RWLOCK(mnt_ns_tree_lock); -static seqcount_rwlock_t mnt_ns_tree_seqcount = SEQCNT_RWLOCK_ZERO(mnt_ns_tree_seqcount, &mnt_ns_tree_lock); +static DEFINE_SPINLOCK(mnt_ns_tree_lock); +static seqcount_spinlock_t mnt_ns_tree_seqcount = SEQCNT_SPINLOCK_ZERO(mnt_ns_tree_seqcount, &mnt_ns_tree_lock); static struct rb_root mnt_ns_tree = RB_ROOT; /* protected by mnt_ns_tree_lock */ static LIST_HEAD(mnt_ns_list); /* protected by mnt_ns_tree_lock */ @@ -131,14 +131,14 @@ static int mnt_ns_cmp(struct rb_node *a, const struct rb_node *b) static inline void mnt_ns_tree_write_lock(void) { - write_lock(&mnt_ns_tree_lock); + spin_lock(&mnt_ns_tree_lock); write_seqcount_begin(&mnt_ns_tree_seqcount); } static inline void mnt_ns_tree_write_unlock(void) { 
write_seqcount_end(&mnt_ns_tree_seqcount); - write_unlock(&mnt_ns_tree_lock); + spin_unlock(&mnt_ns_tree_lock); } static void mnt_ns_tree_add(struct mnt_namespace *ns) From patchwork Thu Dec 12 11:56:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 13905104 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DDADE2153E3 for ; Thu, 12 Dec 2024 11:56:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004588; cv=none; b=eCMt6khjnrbaou/XSdi6s8Xd68Q0AvHteTXbB+3hKELKk6RCXN95JM6RunKY1fjMeVGRp9jtyTmQFgYc/GYwz5e4mPyCen/cYSPjJ8Oxs7inTPemWQshjsWiycIvqoStpT6OfTM7mIwS2ZVa1ojcConAkL0OQQE4QmPNsM+5J5I= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004588; c=relaxed/simple; bh=7v4bOxKWz5L9zkkJvMX4FPAPpL80DWtTdu+GzAsXCsI=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=OgPaNp9w/FQmyBkymKEKY02EO3NjgZ2r6RU48Z3/cTyjzfdNYAeyYliV0VzuzmBiwJccQER6hnUFD+5MrjLsVIrN26cdMa/la8c21fEg7Td6lej7BjI9ekbB7LepVdwvGotlAi/qzMm99o4tDcYUe76IF/FYzUv77SpCOUMGKJE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Gbh90pF/; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Gbh90pF/" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4FDD7C4CED0; Thu, 12 Dec 2024 11:56:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734004588; 
bh=7v4bOxKWz5L9zkkJvMX4FPAPpL80DWtTdu+GzAsXCsI=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=Gbh90pF/5WL3sVoSkydIiaWU83nAn/6xqFJvtj/E62JvnryTIEYgEaRqTD+RRo3hV Bcuc5/z+i0Na0qxH0w5Pl1VSHFpdnlVj14Va4bNtVm9xsJT8Qb9RrzcHn3X3HTrqrB 3cc7PjSSDJAI2HBlpb6/ZEreBRaxYDkBcVGQUKTd52HM+K1A92+1q720TUpHc5V0Kw 7xcgIfbYAs0FB5tnBMJxDmae83ikmSp5ldxAuLldxgWwGDHqwvXo0Z2TBkltzaOkIi FPDSAuEbNQACWqjRK7kCp2RMY5ngdqYZChVMT45brTXmaz+1ahoAWCdacmESK0EGOJ VupwS6DTeld9w== From: Christian Brauner Date: Thu, 12 Dec 2024 12:56:06 +0100 Subject: [PATCH v2 7/8] selftests: remove unneeded include Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241212-work-mount-rbtree-lockless-v2-7-4fe6cef02534@kernel.org> References: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> In-Reply-To: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> To: Josef Bacik , Jeff Layton Cc: "Paul E. McKenney" , Peter Ziljstra , linux-fsdevel@vger.kernel.org, Christian Brauner X-Mailer: b4 0.15-dev-355e8 X-Developer-Signature: v=1; a=openpgp-sha256; l=731; i=brauner@kernel.org; h=from:subject:message-id; bh=7v4bOxKWz5L9zkkJvMX4FPAPpL80DWtTdu+GzAsXCsI=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaRHnY+Zuze8XLJG75Teeo+7uoX+Uu/d5OXnz5Xe55O98 fi9Ew4HOkpZGMS4GGTFFFkc2k3C5ZbzVGw2ytSAmcPKBDKEgYtTACbi08nIsF8iR0ZWZJbDhqh0 PhdVZ82T/o+e/N25Mz4l5s3JswzPXBn+WV9b5cBe1V7qcO3JqZspS6e5tFp77Dx5X84w2FVT72Q XAwA= X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 The pidfd header will be included in a sample program and this pulls in all the mount definitions that would be causing problems. 
Signed-off-by: Christian Brauner --- tools/testing/selftests/pidfd/pidfd.h | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/pidfd/pidfd.h b/tools/testing/selftests/pidfd/pidfd.h index 88d6830ee004df3c7a9d3ebcdab89d5775e9ab9b..3a96053e52e7bbf5f7f85908c2093e9023b1d3d6 100644 --- a/tools/testing/selftests/pidfd/pidfd.h +++ b/tools/testing/selftests/pidfd/pidfd.h @@ -12,7 +12,6 @@ #include #include #include -#include #include #include From patchwork Thu Dec 12 11:56:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Brauner X-Patchwork-Id: 13905105 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 09DDB212F86 for ; Thu, 12 Dec 2024 11:56:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004592; cv=none; b=tuSHN5TKLlZ4xYhe0X/CE1WA2oeig9URn4Yd55Xnm2/amykz+3GK6aPRYJleH8dUxZ3059l21gMbmGeU8z7xuva0msEDdbsVDr4XtoWgX5YPtAX1PHYKVqFimQqjp4nFgiPH1d4ELDbh1f3RZ/OXtjqe253jVfi0V5rfN8D27WI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734004592; c=relaxed/simple; bh=sml9215PPcFAlh8pcjjb9M8oaHdJe+E09N8aBMLleno=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Z3CQSyBGtDrS4kqKSLVrUl7CzmToOO23wzIjX4X4d1X1WoCJrGhmgKc1Y+s4emMCFAxeP+l0VIyOCEYPH+GGHemEB0yGYeFdc8O9wsrZQi9ZBJzE2m+NZIIn0dZ9ko03pY4wuO8g1GJ8/uPr2pZvIYN/IpMEOrZW+XBiZMIbzBY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=i/QmgRuL; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit 
key) header.d=kernel.org header.i=@kernel.org header.b="i/QmgRuL" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4B26AC4CED0; Thu, 12 Dec 2024 11:56:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734004590; bh=sml9215PPcFAlh8pcjjb9M8oaHdJe+E09N8aBMLleno=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=i/QmgRuLtL9kDPwVH5x0FF6l1xOq9VmvLV/UUlBRN2CLf/xfAv9a4NY9VqPANb8Xm Zl2VaqbiJT7eycjT+a0w9+s02DacbD0/OXA784CdGH8b8H/yWuTDVh6HpJOEZbUG97 5zyxbSTFJghUukTjyjWatd3NoOn8VTWzJZaLH2INwCKAt4Y2gMnhFM+pMB9k3I83h9 IVR9yE7SDcGkDFqy9W7jmgLYsNL4UlkstyguZoBXqUL1TUmSmAH4Qwrs0OlOFBe94+ rWsPj/aNXZl0sis0OzT7FsxU7cvdpegOItRepbaejL1o3LrTzdPCfHGW3a5yP+r/UU RjeIcC5QNtP/Q== From: Christian Brauner Date: Thu, 12 Dec 2024 12:56:07 +0100 Subject: [PATCH v2 8/8] samples: add test-list-all-mounts Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241212-work-mount-rbtree-lockless-v2-8-4fe6cef02534@kernel.org> References: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> In-Reply-To: <20241212-work-mount-rbtree-lockless-v2-0-4fe6cef02534@kernel.org> To: Josef Bacik , Jeff Layton Cc: "Paul E. McKenney" , Peter Ziljstra , linux-fsdevel@vger.kernel.org, Christian Brauner X-Mailer: b4 0.15-dev-355e8 X-Developer-Signature: v=1; a=openpgp-sha256; l=7958; i=brauner@kernel.org; h=from:subject:message-id; bh=sml9215PPcFAlh8pcjjb9M8oaHdJe+E09N8aBMLleno=; b=owGbwMvMwCU28Zj0gdSKO4sYT6slMaRHnY9JaXmUWi9rf7Hq6IJot/ot/4o/+d9hunvi/f6is 10zQ26ZdZSyMIhxMciKKbI4tJuEyy3nqdhslKkBM4eVCWQIAxenAEwkW42R4WXnKdcfpp/EJ15L +lvzbeGbE2ezUg31L1bPesHP3q+8PIyR4cAE5hnfTVNuMHusk22arNLF+cpIeOYyXWmJe7Lc+2Z u4gEA X-Developer-Key: i=brauner@kernel.org; a=openpgp; fpr=4880B8C9BD0E5106FC070F4F7B3C391EFEA93624 Add a sample program illustrating how to list all mounts in all mount namespaces. 
Signed-off-by: Christian Brauner --- samples/vfs/.gitignore | 1 + samples/vfs/Makefile | 2 +- samples/vfs/test-list-all-mounts.c | 235 +++++++++++++++++++++++++++++++++++++ 3 files changed, 237 insertions(+), 1 deletion(-) diff --git a/samples/vfs/.gitignore b/samples/vfs/.gitignore index 79212d91285bca72b0ff85f28aaccd2e803ac092..8694dd17b318768b975ece5c7cd450c2cca67318 100644 --- a/samples/vfs/.gitignore +++ b/samples/vfs/.gitignore @@ -1,3 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only /test-fsmount +/test-list-all-mounts /test-statx diff --git a/samples/vfs/Makefile b/samples/vfs/Makefile index 6377a678134acf0d682151d751d2f5042dbf5e0a..301be72a52a0e376c7ebe235cc2058992919cc78 100644 --- a/samples/vfs/Makefile +++ b/samples/vfs/Makefile @@ -1,4 +1,4 @@ # SPDX-License-Identifier: GPL-2.0-only -userprogs-always-y += test-fsmount test-statx +userprogs-always-y += test-fsmount test-statx test-list-all-mounts userccflags += -I usr/include diff --git a/samples/vfs/test-list-all-mounts.c b/samples/vfs/test-list-all-mounts.c new file mode 100644 index 0000000000000000000000000000000000000000..f372d5aea4717fd1ab3d4b3f9af79316cd5dd3d3 --- /dev/null +++ b/samples/vfs/test-list-all-mounts.c @@ -0,0 +1,235 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +// Copyright (c) 2024 Christian Brauner + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include + +#include "../../tools/testing/selftests/pidfd/pidfd.h" + +#define die_errno(format, ...) \ + do { \ + fprintf(stderr, "%m | %s: %d: %s: " format "\n", __FILE__, \ + __LINE__, __func__, ##__VA_ARGS__); \ + exit(EXIT_FAILURE); \ + } while (0) + +/* Get the id for a mount namespace */ +#define NS_GET_MNTNS_ID _IO(0xb7, 0x5) +/* Get next mount namespace. */ + +struct mnt_ns_info { + __u32 size; + __u32 nr_mounts; + __u64 mnt_ns_id; +}; + +#define MNT_NS_INFO_SIZE_VER0 16 /* size of first published struct */ + +/* Get information about namespace. 
*/ +#define NS_MNT_GET_INFO _IOR(0xb7, 10, struct mnt_ns_info) +/* Get next namespace. */ +#define NS_MNT_GET_NEXT _IOR(0xb7, 11, struct mnt_ns_info) +/* Get previous namespace. */ +#define NS_MNT_GET_PREV _IOR(0xb7, 12, struct mnt_ns_info) + +#define PIDFD_GET_MNT_NAMESPACE _IO(0xFF, 3) + +#ifndef __NR_listmount +#define __NR_listmount 458 +#endif + +#ifndef __NR_statmount +#define __NR_statmount 457 +#endif + +/* @mask bits for statmount(2) */ +#define STATMOUNT_SB_BASIC 0x00000001U /* Want/got sb_... */ +#define STATMOUNT_MNT_BASIC 0x00000002U /* Want/got mnt_... */ +#define STATMOUNT_PROPAGATE_FROM 0x00000004U /* Want/got propagate_from */ +#define STATMOUNT_MNT_ROOT 0x00000008U /* Want/got mnt_root */ +#define STATMOUNT_MNT_POINT 0x00000010U /* Want/got mnt_point */ +#define STATMOUNT_FS_TYPE 0x00000020U /* Want/got fs_type */ +#define STATMOUNT_MNT_NS_ID 0x00000040U /* Want/got mnt_ns_id */ +#define STATMOUNT_MNT_OPTS 0x00000080U /* Want/got mnt_opts */ + +#define STATX_MNT_ID_UNIQUE 0x00004000U /* Want/got extended stx_mount_id */ + +struct statmount { + __u32 size; + __u32 mnt_opts; + __u64 mask; + __u32 sb_dev_major; + __u32 sb_dev_minor; + __u64 sb_magic; + __u32 sb_flags; + __u32 fs_type; + __u64 mnt_id; + __u64 mnt_parent_id; + __u32 mnt_id_old; + __u32 mnt_parent_id_old; + __u64 mnt_attr; + __u64 mnt_propagation; + __u64 mnt_peer_group; + __u64 mnt_master; + __u64 propagate_from; + __u32 mnt_root; + __u32 mnt_point; + __u64 mnt_ns_id; + __u64 __spare2[49]; + char str[]; +}; + +struct mnt_id_req { + __u32 size; + __u32 spare; + __u64 mnt_id; + __u64 param; + __u64 mnt_ns_id; +}; + +#define MNT_ID_REQ_SIZE_VER1 32 /* sizeof second published struct */ + +#define LSMT_ROOT 0xffffffffffffffff /* root mount */ + +static int __statmount(__u64 mnt_id, __u64 mnt_ns_id, __u64 mask, + struct statmount *stmnt, size_t bufsize, + unsigned int flags) +{ + struct mnt_id_req req = { + .size = MNT_ID_REQ_SIZE_VER1, + .mnt_id = mnt_id, + .param = mask, + .mnt_ns_id = 
mnt_ns_id, + }; + + return syscall(__NR_statmount, &req, stmnt, bufsize, flags); +} + +static struct statmount *sys_statmount(__u64 mnt_id, __u64 mnt_ns_id, + __u64 mask, unsigned int flags) +{ + size_t bufsize = 1 << 15; + struct statmount *stmnt = NULL, *tmp = NULL; + int ret; + + for (;;) { + tmp = realloc(stmnt, bufsize); + if (!tmp) + goto out; + + stmnt = tmp; + ret = __statmount(mnt_id, mnt_ns_id, mask, stmnt, bufsize, flags); + if (!ret) + return stmnt; + + if (errno != EOVERFLOW) + goto out; + + bufsize <<= 1; + if (bufsize >= UINT_MAX / 2) + goto out; + } + +out: + free(stmnt); + return NULL; +} + +static ssize_t sys_listmount(__u64 mnt_id, __u64 last_mnt_id, __u64 mnt_ns_id, + __u64 list[], size_t num, unsigned int flags) +{ + struct mnt_id_req req = { + .size = MNT_ID_REQ_SIZE_VER1, + .mnt_id = mnt_id, + .param = last_mnt_id, + .mnt_ns_id = mnt_ns_id, + }; + + return syscall(__NR_listmount, &req, list, num, flags); +} + +int main(int argc, char *argv[]) +{ +#define LISTMNT_BUFFER 10 + __u64 list[LISTMNT_BUFFER], last_mnt_id = 0; + int ret, pidfd, fd_mntns; + struct mnt_ns_info info = {}; + + pidfd = sys_pidfd_open(getpid(), 0); + if (pidfd < 0) + die_errno("pidfd_open failed"); + + fd_mntns = ioctl(pidfd, PIDFD_GET_MNT_NAMESPACE, 0); + if (fd_mntns < 0) + die_errno("ioctl(PIDFD_GET_MNT_NAMESPACE) failed"); + + ret = ioctl(fd_mntns, NS_MNT_GET_INFO, &info); + if (ret < 0) + die_errno("ioctl(NS_MNT_GET_INFO) failed"); + + printf("Listing %u mounts for mount namespace %llu\n", + info.nr_mounts, info.mnt_ns_id); + for (;;) { + ssize_t nr_mounts; +next: + nr_mounts = sys_listmount(LSMT_ROOT, last_mnt_id, + info.mnt_ns_id, list, LISTMNT_BUFFER, + 0); + if (nr_mounts <= 0) { + int fd_mntns_next; + + printf("Finished listing %u mounts for mount namespace %llu\n\n", + info.nr_mounts, info.mnt_ns_id); + fd_mntns_next = ioctl(fd_mntns, NS_MNT_GET_NEXT, &info); + if (fd_mntns_next < 0) { + if (errno == ENOENT) { + printf("Finished listing all mount namespaces\n"); 
+ exit(0); + } + die_errno("ioctl(NS_MNT_GET_NEXT) failed"); + } + close(fd_mntns); + fd_mntns = fd_mntns_next; + last_mnt_id = 0; + printf("Listing %u mounts for mount namespace %llu\n", + info.nr_mounts, info.mnt_ns_id); + goto next; + } + + for (size_t cur = 0; cur < nr_mounts; cur++) { + struct statmount *stmnt; + + last_mnt_id = list[cur]; + + stmnt = sys_statmount(last_mnt_id, info.mnt_ns_id, + STATMOUNT_SB_BASIC | + STATMOUNT_MNT_BASIC | + STATMOUNT_MNT_ROOT | + STATMOUNT_MNT_POINT | + STATMOUNT_MNT_NS_ID | + STATMOUNT_MNT_OPTS | + STATMOUNT_FS_TYPE, 0); + if (!stmnt) { + printf("Failed to statmount(%llu) in mount namespace(%llu)\n", + last_mnt_id, info.mnt_ns_id); + continue; + } + + printf("mnt_id:\t\t%llu\nmnt_parent_id:\t%llu\nfs_type:\t%s\nmnt_root:\t%s\nmnt_point:\t%s\nmnt_opts:\t%s\n\n", + stmnt->mnt_id, + stmnt->mnt_parent_id, + stmnt->str + stmnt->fs_type, + stmnt->str + stmnt->mnt_root, + stmnt->str + stmnt->mnt_point, + stmnt->str + stmnt->mnt_opts); + free(stmnt); + } + } + + exit(0); +}
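
The sample above depends on two details that are easy to get wrong when copying it: the hand-declared structs must match the published sizes (MNT_NS_INFO_SIZE_VER0 and MNT_ID_REQ_SIZE_VER1), and the statmount() retry loop doubles its buffer until a cap is reached. What follows is a minimal standalone sketch of both, not part of the series. It assumes uint32_t/uint64_t as stand-ins for __u32/__u64, and next_bufsize() is a hypothetical helper introduced here only to illustrate the doubling rule used in sys_statmount().

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace mirrors of the structs declared in the sample.
 * Assumption: uint32_t/uint64_t stand in for __u32/__u64.
 */
struct mnt_ns_info {
	uint32_t size;
	uint32_t nr_mounts;
	uint64_t mnt_ns_id;
};

struct mnt_id_req {
	uint32_t size;
	uint32_t spare;
	uint64_t mnt_id;
	uint64_t param;
	uint64_t mnt_ns_id;
};

#define MNT_NS_INFO_SIZE_VER0 16 /* size of first published struct */
#define MNT_ID_REQ_SIZE_VER1 32 /* size of second published struct */

/* Fail the build if a struct drifts away from its published size. */
_Static_assert(sizeof(struct mnt_ns_info) == MNT_NS_INFO_SIZE_VER0,
	       "mnt_ns_info does not match MNT_NS_INFO_SIZE_VER0");
_Static_assert(sizeof(struct mnt_id_req) == MNT_ID_REQ_SIZE_VER1,
	       "mnt_id_req does not match MNT_ID_REQ_SIZE_VER1");

/*
 * Hypothetical helper (not in the patch): the next buffer size for the
 * EOVERFLOW retry loop in sys_statmount(), or 0 once that loop would
 * give up because the doubled size reaches UINT_MAX / 2.
 */
static inline size_t next_bufsize(size_t bufsize)
{
	size_t doubled = bufsize << 1;

	if (doubled >= UINT_MAX / 2)
		return 0;
	return doubled;
}
```

With the 32 KiB starting buffer used in sys_statmount() and a 32-bit unsigned int, the doubling stops after a 1 GiB attempt, because the next size would reach UINT_MAX / 2.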