[09/22] fast_dput(): handle underflows gracefully

Message ID	20231109062056.3181775-9-viro@zeniv.linux.org.uk (mailing list archive)
State	New, archived
Headers
Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C2A07DDB9 for <linux-fsdevel@vger.kernel.org>; Thu, 9 Nov 2023 06:21:01 +0000 (UTC)
From: Al Viro <viro@zeniv.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel@vger.kernel.org, Christian Brauner <brauner@kernel.org>
Subject: [PATCH 09/22] fast_dput(): handle underflows gracefully
Date: Thu, 9 Nov 2023 06:20:43 +0000
Message-Id: <20231109062056.3181775-9-viro@zeniv.linux.org.uk>
In-Reply-To: <20231109062056.3181775-1-viro@zeniv.linux.org.uk>
References: <20231109061932.GA3181489@ZenIV> <20231109062056.3181775-1-viro@zeniv.linux.org.uk>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: Al Viro <viro@ftp.linux.org.uk>
Series
[01/22] struct dentry: get rid of randomize_layout idiocy
[02/22] switch nfsd_client_rmdir() to use of simple_recursive_removal()
[03/22] coda_flag_children(): cope with dentries turning negative
[04/22] dentry: switch the lists of children to hlist
[05/22] centralize killing dentry from shrink list
[06/22] get rid of __dget()
[07/22] shrink_dentry_list(): no need to check that dentry refcount is marked dead
[08/22] fast_dput(): having ->d_delete() is not reason to delay refcount decrement
[09/22] fast_dput(): handle underflows gracefully
[10/22] fast_dput(): new rules for refcount
[11/22] __dput_to_list(): do decrement of refcount in the callers
[12/22] Make retain_dentry() neutral with respect to refcounting
[13/22] __dentry_kill(): get consistent rules for victim's refcount
[14/22] dentry_kill(): don't bother with retain_dentry() on slow path
[15/22] Call retain_dentry() with refcount 0
[16/22] fold the call of retain_dentry() into fast_dput()
[17/22] don't try to cut corners in shrink_lock_dentry()
[18/22] fold dentry_kill() into dput()
[19/22] to_shrink_list(): call only if refcount is 0
[20/22] switch select_collect{,2}() to use of to_shrink_list()
[21/22] d_prune_aliases(): use a shrink list
[22/22] __dentry_kill(): new locking scheme

Commit Message

Al Viro Nov. 9, 2023, 6:20 a.m. UTC

If refcount is less than 1, we should just warn, unlock dentry and
return true, so that the caller doesn't try to do anything else.

Taking care of that leaves the rest of "lockref_put_return() has
failed" case equivalent to "decrement refcount and rejoin the
normal slow path after the point where we grab ->d_lock".
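The control flow described above can be sketched in user space. This is a minimal single-threaded model, not the kernel code: struct model_dentry, model_lock(), model_unlock() and model_put_fallback() are hypothetical names, the lock is just a flag, and the warning is printed every time rather than once as WARN_ON_ONCE() would. It covers only the branch taken when lockref_put_return() has failed, up to the count check at the point where the slow path already holds the lock.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct dentry: a lock flag and a refcount. */
struct model_dentry {
	bool locked;	/* plays the role of ->d_lock */
	long count;	/* plays the role of ->d_lockref.count */
};

static void model_lock(struct model_dentry *d)   { d->locked = true; }
static void model_unlock(struct model_dentry *d) { d->locked = false; }

/*
 * Models the "lockref_put_return() has failed" branch only.
 * Returns true if the caller has nothing left to do (lock dropped),
 * false if the count reached zero and the lock is still held.
 */
static bool model_put_fallback(struct model_dentry *d)
{
	model_lock(d);
	if (d->count <= 0) {
		/* Underflow: warn, unlock, and tell the caller to stop. */
		fprintf(stderr, "warning: refcount underflow (%ld)\n", d->count);
		model_unlock(d);
		return true;
	}
	d->count--;
	/* From here on this is the same check the slow path makes under the lock. */
	if (d->count) {
		model_unlock(d);
		return true;
	}
	return false;
}
```

What the real function does after a false result is outside what the diff shows and is not modeled here.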

NOTE: lockref_put_return() is strictly a fastpath thing - unlike
the rest of lockref primitives, it does not contain a fallback.
Caller (and it looks like fast_dput() is the only legitimate one
in the entire kernel) has to do that itself.  Reasons for
lockref_put_return() failures:
	* ->d_lock held by somebody
	* refcount <= 0
	* ... or an architecture not supporting lockref use of
	  cmpxchg - sparc, anything non-SMP, config with spinlock debugging...
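The three failure reasons listed above can be illustrated with C11 atomics. This is a rough user-space sketch under stated assumptions, not the kernel's implementation: struct model_lockref, pack(), word_count(), word_locked(), model_put_return() and the MODEL_USE_CMPXCHG switch are all hypothetical, the layout of the packed word is invented for the example, and the retry loop is unbounded for simplicity.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical build switch: 0 models a configuration without cmpxchg support. */
#ifndef MODEL_USE_CMPXCHG
#define MODEL_USE_CMPXCHG 1
#endif

/* Hypothetical packed word: low 32 bits = lock flag, high 32 bits = count. */
struct model_lockref {
	_Atomic uint64_t word;
};

static uint64_t pack(uint32_t locked, int32_t count)
{
	return ((uint64_t)(uint32_t)count << 32) | locked;
}

static int32_t word_count(uint64_t w)   { return (int32_t)(uint32_t)(w >> 32); }
static uint32_t word_locked(uint64_t w) { return (uint32_t)w; }

/*
 * Fast path only: returns the new count on success, -1 when the
 * caller has to fall back to taking the lock itself.
 */
static int model_put_return(struct model_lockref *ref)
{
#if MODEL_USE_CMPXCHG
	uint64_t old = atomic_load(&ref->word);

	for (;;) {
		if (word_locked(old))
			return -1;	/* lock held by somebody */
		if (word_count(old) <= 0)
			return -1;	/* refcount <= 0 */
		uint64_t new = pack(0, word_count(old) - 1);
		if (atomic_compare_exchange_weak(&ref->word, &old, new))
			return word_count(new);
		/* A failed compare-exchange reloaded 'old'; re-check and retry. */
	}
#else
	(void)ref;
	return -1;		/* no cmpxchg: the fast path never succeeds */
#endif
}
```

Note that the function never takes the lock: every failure is reported to the caller, which is what makes the fallback the caller's job.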

We could add a fallback, but it would be a clumsy API - we'd have
to distinguish between:
	(1) refcount > 1 - decremented, lock not held on return
	(2) refcount < 1 - left alone, probably no sense to hold the lock
	(3) refcount is 1, no cmpxchg - decremented, lock held on return
	(4) refcount is 1, cmpxchg supported - decremented, lock *NOT* held
	    on return.
We want to return with no lock held in case (4); that's the whole point of that
thing.  We very much do not want to have the fallback in case (3) return without
a lock, since the caller might have to retake it in that case.
So it wouldn't be more convenient than doing the fallback in the caller and
it would be very easy to screw up, especially since the test coverage would
suck - no way to test (3) and (4) on the same kernel build.
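To make the four cases concrete, here is a small sketch of what a fallback-carrying variant would have to report back. Everything in it is hypothetical (struct put_outcome, hypothetical_put_with_fallback(), the have_cmpxchg parameter); no such API exists, and the function only tabulates the outcomes listed above instead of touching a real lock.

```c
#include <stdbool.h>

/* Hypothetical result a caller would have to inspect after every call. */
struct put_outcome {
	bool decremented;	/* was the refcount dropped? */
	bool lock_held;		/* does the caller now own the lock? */
};

/*
 * Tabulates cases (1)-(4). 'have_cmpxchg' stands for a property that is
 * fixed at build time in a real kernel, so (3) and (4) can never both
 * occur on the same build.
 */
static struct put_outcome
hypothetical_put_with_fallback(long count, bool have_cmpxchg)
{
	struct put_outcome out = { false, false };

	if (count < 1)
		return out;		/* (2): left alone, lock not held */
	out.decremented = true;
	if (count == 1 && !have_cmpxchg)
		out.lock_held = true;	/* (3): lock held on return */
	return out;			/* (1) and (4): lock not held */
}
```

For count == 1 the lock state on return depends only on have_cmpxchg, which is the part a caller could not exercise both ways on one kernel build.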

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/dcache.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Comments

Christian Brauner Nov. 9, 2023, 2:46 p.m. UTC | #1

On Thu, Nov 09, 2023 at 06:20:43AM +0000, Al Viro wrote:
> If refcount is less than 1, we should just warn, unlock dentry and
> return true, so that the caller doesn't try to do anything else.

That's effectively to guard against bugs in filesystems, not in dcache
itself, right? Have we observed this frequently?

> 
> Taking care of that leaves the rest of "lockref_put_return() has
> failed" case equivalent to "decrement refcount and rejoin the
> normal slow path after the point where we grab ->d_lock".
> 
> NOTE: lockref_put_return() is strictly a fastpath thing - unlike
> the rest of lockref primitives, it does not contain a fallback.
> Caller (and it looks like fast_dput() is the only legitimate one
> in the entire kernel) has to do that itself.  Reasons for
> lockref_put_return() failures:
> 	* ->d_lock held by somebody
> 	* refcount <= 0
> 	* ... or an architecture not supporting lockref use of
> cmpxchg - sparc, anything non-SMP, config with spinlock debugging...
> 
> We could add a fallback, but it would be a clumsy API - we'd have
> to distinguish between:
> 	(1) refcount > 1 - decremented, lock not held on return
> 	(2) refcount < 1 - left alone, probably no sense to hold the lock
> 	(3) refcount is 1, no cmpxchg - decremented, lock held on return
> 	(4) refcount is 1, cmpxchg supported - decremented, lock *NOT* held
> 	    on return.
> We want to return with no lock held in case (4); that's the whole point of that
> thing.  We very much do not want to have the fallback in case (3) return without
> a lock, since the caller might have to retake it in that case.
> So it wouldn't be more convenient than doing the fallback in the caller and
> it would be very easy to screw up, especially since the test coverage would
> suck - no way to test (3) and (4) on the same kernel build.
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---

Looks like a good idea,
Reviewed-by: Christian Brauner <brauner@kernel.org>

Al Viro Nov. 9, 2023, 8:39 p.m. UTC | #2

On Thu, Nov 09, 2023 at 03:46:21PM +0100, Christian Brauner wrote:
> On Thu, Nov 09, 2023 at 06:20:43AM +0000, Al Viro wrote:
> > If refcount is less than 1, we should just warn, unlock dentry and
> > return true, so that the caller doesn't try to do anything else.
> 
> That's effectively to guard against bugs in filesystems, not in dcache
> itself, right? Have we observed this frequently?

Hard to tell - it doesn't happen often, but... extra dput() somewhere
is not an impossible class of bugs.  I remember running into that
while doing work in namei.c, I certainly have seen failure exits in
random places that fucked refcounting up by dropping the wrong things.

Patch

diff --git a/fs/dcache.c b/fs/dcache.c
index 0d15e8852ac1..e02b3c81bc02 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -779,12 +779,12 @@  static inline bool fast_dput(struct dentry *dentry)
 	 */
 	if (unlikely(ret < 0)) {
 		spin_lock(&dentry->d_lock);
-		if (dentry->d_lockref.count > 1) {
-			dentry->d_lockref.count--;
+		if (WARN_ON_ONCE(dentry->d_lockref.count <= 0)) {
 			spin_unlock(&dentry->d_lock);
 			return true;
 		}
-		return false;
+		dentry->d_lockref.count--;
+		goto locked;
 	}
 
 	/*
@@ -842,6 +842,7 @@  static inline bool fast_dput(struct dentry *dentry)
 	 * else could have killed it and marked it dead. Either way, we
 	 * don't need to do anything else.
 	 */
+locked:
 	if (dentry->d_lockref.count) {
 		spin_unlock(&dentry->d_lock);
 		return true;
