From patchwork Fri Jan 24 19:19:36 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949872
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever, "Liam R. Howlett", Jan Kara
Subject: [RFC PATCH v6.6 01/10] libfs: Re-arrange locking in offset_iterate_dir()
Date: Fri, 24 Jan 2025 14:19:36 -0500
Message-ID: <20250124191946.22308-2-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Chuck Lever

[ Upstream commit 3f6d810665dfde0d33785420618ceb03fba0619d ]

Liam and Matthew say that once the RCU read lock is released,
xa_state is not safe to re-use for the next xas_find() call. But the
RCU read lock must be released on each loop iteration so that
dput(), which might_sleep(), can be called safely.

Thus we are forced to walk the offset tree with fresh state for each
directory entry. xa_find() can do this for us, though it might be a
little less efficient than maintaining xa_state locally.

We believe that in the current code base, inode->i_rwsem provides
protection for the xa_state maintained in offset_iterate_dir().
However, there is no guarantee that will continue to be the case in
the future.

Since offset_iterate_dir() doesn't build xa_state locally any more,
there's no longer a strong need for offset_find_next(). Clean up by
rolling these two helpers together.

Suggested-by: Liam R. Howlett
Message-ID: <170785993027.11135.8830043889278631735.stgit@91.116.238.104.host.secureserver.net>
Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/170820142021.6328.15047865406275957018.stgit@91.116.238.104.host.secureserver.net
Reviewed-by: Jan Kara
Signed-off-by: Christian Brauner
Signed-off-by: Chuck Lever
---
 fs/libfs.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index dc0f7519045f..430f7c95336c 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -401,12 +401,13 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 	return vfs_setpos(file, offset, U32_MAX);
 }
 
-static struct dentry *offset_find_next(struct xa_state *xas)
+static struct dentry *offset_find_next(struct offset_ctx *octx, loff_t offset)
 {
 	struct dentry *child, *found = NULL;
+	XA_STATE(xas, &octx->xa, offset);
 
 	rcu_read_lock();
-	child = xas_next_entry(xas, U32_MAX);
+	child = xas_next_entry(&xas, U32_MAX);
 	if (!child)
 		goto out;
 	spin_lock(&child->d_lock);
@@ -429,12 +430,11 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 
 static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 {
-	struct offset_ctx *so_ctx = inode->i_op->get_offset_ctx(inode);
-	XA_STATE(xas, &so_ctx->xa, ctx->pos);
+	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
 	struct dentry *dentry;
 
 	while (true) {
-		dentry = offset_find_next(&xas);
+		dentry = offset_find_next(octx, ctx->pos);
 		if (!dentry)
 			return ERR_PTR(-ENOENT);
 
@@ -443,8 +443,8 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 			break;
 		}
 
+		ctx->pos = dentry2offset(dentry) + 1;
 		dput(dentry);
-		ctx->pos = xas.xa_index + 1;
 	}
 	return NULL;
 }

From patchwork Fri Jan 24 19:19:37 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949873
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever, Jan Kara
Subject: [RFC PATCH v6.6 02/10] libfs: Define a minimum directory offset
Date: Fri, 24 Jan 2025 14:19:37 -0500
Message-ID: <20250124191946.22308-3-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Chuck Lever

[ Upstream commit 7beea725a8ca412c6190090ce7c3a13b169592a1 ]

This value is used in several places, so make it a symbolic
constant.

Reviewed-by: Jan Kara
Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/170820142741.6328.12428356024575347885.stgit@91.116.238.104.host.secureserver.net
Signed-off-by: Christian Brauner
Stable-dep-of: ecba88a3b32d ("libfs: Add simple_offset_empty()")
Signed-off-by: Chuck Lever
---
 fs/libfs.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 430f7c95336c..c3dc58e776f9 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -239,6 +239,11 @@ const struct inode_operations simple_dir_inode_operations = {
 };
 EXPORT_SYMBOL(simple_dir_inode_operations);
 
+/* 0 is '.', 1 is '..', so always start with offset 2 or more */
+enum {
+	DIR_OFFSET_MIN = 2,
+};
+
 static void offset_set(struct dentry *dentry, u32 offset)
 {
 	dentry->d_fsdata = (void *)((uintptr_t)(offset));
@@ -260,9 +265,7 @@ void simple_offset_init(struct offset_ctx *octx)
 {
 	xa_init_flags(&octx->xa, XA_FLAGS_ALLOC1);
 	lockdep_set_class(&octx->xa.xa_lock, &simple_offset_xa_lock);
-
-	/* 0 is '.', 1 is '..', so always start with offset 2 */
-	octx->next_offset = 2;
+	octx->next_offset = DIR_OFFSET_MIN;
 }
 
 /**
@@ -275,7 +278,7 @@ void simple_offset_init(struct offset_ctx *octx)
  */
 int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
 {
-	static const struct xa_limit limit = XA_LIMIT(2, U32_MAX);
+	static const struct xa_limit limit = XA_LIMIT(DIR_OFFSET_MIN, U32_MAX);
 	u32 offset;
 	int ret;
 
@@ -480,7 +483,7 @@ static int offset_readdir(struct file *file, struct dir_context *ctx)
 		return 0;
 
 	/* In this case, ->private_data is protected by f_pos_lock */
-	if (ctx->pos == 2)
+	if (ctx->pos == DIR_OFFSET_MIN)
 		file->private_data = NULL;
 	else if (file->private_data == ERR_PTR(-ENOENT))
 		return 0;

From patchwork Fri Jan 24 19:19:38 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949874
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever, Jan Kara
Subject: [RFC PATCH v6.6 03/10] libfs: Add simple_offset_empty()
Date: Fri, 24 Jan 2025 14:19:38 -0500
Message-ID: <20250124191946.22308-4-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Chuck Lever

[ Upstream commit ecba88a3b32d733d41e27973e25b2bc580f64281 ]

For simple filesystems that use directory offset mapping, rely
strictly on the directory offset map to tell when a directory has
no children.

After this patch is applied, the emptiness test holds only the RCU
read lock when the directory being tested has no children.

In addition, this adds another layer of confirmation that
simple_offset_add/remove() are working as expected.

Reviewed-by: Jan Kara
Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/170820143463.6328.7872919188371286951.stgit@91.116.238.104.host.secureserver.net
Signed-off-by: Christian Brauner
Stable-dep-of: 5a1a25be995e ("libfs: Add simple_offset_rename() API")
Signed-off-by: Chuck Lever
---
 fs/libfs.c         | 32 ++++++++++++++++++++++++++++++++
 include/linux/fs.h |  1 +
 mm/shmem.c         |  4 ++--
 3 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index c3dc58e776f9..d7b901cb9af4 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -312,6 +312,38 @@ void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry)
 	offset_set(dentry, 0);
 }
 
+/**
+ * simple_offset_empty - Check if a dentry can be unlinked
+ * @dentry: dentry to be tested
+ *
+ * Returns 0 if @dentry is a non-empty directory; otherwise returns 1.
+ */
+int simple_offset_empty(struct dentry *dentry)
+{
+	struct inode *inode = d_inode(dentry);
+	struct offset_ctx *octx;
+	struct dentry *child;
+	unsigned long index;
+	int ret = 1;
+
+	if (!inode || !S_ISDIR(inode->i_mode))
+		return ret;
+
+	index = DIR_OFFSET_MIN;
+	octx = inode->i_op->get_offset_ctx(inode);
+	xa_for_each(&octx->xa, index, child) {
+		spin_lock(&child->d_lock);
+		if (simple_positive(child)) {
+			spin_unlock(&child->d_lock);
+			ret = 0;
+			break;
+		}
+		spin_unlock(&child->d_lock);
+	}
+
+	return ret;
+}
+
 /**
  * simple_offset_rename_exchange - exchange rename with directory offsets
  * @old_dir: parent of dentry being moved
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6c3d86532e3f..5104405ce3e6 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3197,6 +3197,7 @@ struct offset_ctx {
 void simple_offset_init(struct offset_ctx *octx);
 int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry);
 void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry);
+int simple_offset_empty(struct dentry *dentry);
 int simple_offset_rename_exchange(struct inode *old_dir,
 				  struct dentry *old_dentry,
 				  struct inode *new_dir,
diff --git a/mm/shmem.c b/mm/shmem.c
index db7dd45c9181..aaf679976f3b 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3368,7 +3368,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
 
 static int shmem_rmdir(struct inode *dir, struct dentry *dentry)
 {
-	if (!simple_empty(dentry))
+	if (!simple_offset_empty(dentry))
 		return -ENOTEMPTY;
 
 	drop_nlink(d_inode(dentry));
@@ -3425,7 +3425,7 @@ static int shmem_rename2(struct mnt_idmap *idmap,
 		return simple_offset_rename_exchange(old_dir, old_dentry,
 						     new_dir, new_dentry);
 
-	if (!simple_empty(new_dentry))
+	if (!simple_offset_empty(new_dentry))
 		return -ENOTEMPTY;
 
 	if (flags & RENAME_WHITEOUT) {

From patchwork Fri Jan 24 19:19:39 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949875
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever
Subject: [RFC PATCH v6.6 04/10] libfs: Fix simple_offset_rename_exchange()
Date: Fri, 24 Jan 2025 14:19:39 -0500
Message-ID: <20250124191946.22308-5-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Chuck Lever

[ Upstream commit 23cdd0eed3f1fff3af323092b0b88945a7950d8e ]

User space expects the replacement (old) directory entry to have
the same directory offset after the rename.

Suggested-by: Christian Brauner
Fixes: a2e459555c5f ("shmem: stable directory offsets")
Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/20240415152057.4605-2-cel@kernel.org
Signed-off-by: Christian Brauner
[ cel: adjusted to apply to origin/linux-6.6.y ]
Signed-off-by: Chuck Lever
---
 fs/libfs.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index d7b901cb9af4..2029cb6a0e15 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -294,6 +294,18 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
 	return 0;
 }
 
+static int simple_offset_replace(struct offset_ctx *octx, struct dentry *dentry,
+				 long offset)
+{
+	void *ret;
+
+	ret = xa_store(&octx->xa, offset, dentry, GFP_KERNEL);
+	if (xa_is_err(ret))
+		return xa_err(ret);
+	offset_set(dentry, offset);
+	return 0;
+}
+
 /**
  * simple_offset_remove - Remove an entry to a directory's offset map
  * @octx: directory offset ctx to be updated
@@ -351,6 +363,9 @@ int simple_offset_empty(struct dentry *dentry)
  * @new_dir: destination parent
  * @new_dentry: destination dentry
  *
+ * This API preserves the directory offset values. Caller provides
+ * appropriate serialization.
+ *
  * Returns zero on success. Otherwise a negative errno is returned and the
  * rename is rolled back.
  */
@@ -368,11 +383,11 @@ int simple_offset_rename_exchange(struct inode *old_dir,
 	simple_offset_remove(old_ctx, old_dentry);
 	simple_offset_remove(new_ctx, new_dentry);
 
-	ret = simple_offset_add(new_ctx, old_dentry);
+	ret = simple_offset_replace(new_ctx, old_dentry, new_index);
 	if (ret)
 		goto out_restore;
 
-	ret = simple_offset_add(old_ctx, new_dentry);
+	ret = simple_offset_replace(old_ctx, new_dentry, old_index);
 	if (ret) {
 		simple_offset_remove(new_ctx, old_dentry);
 		goto out_restore;
@@ -387,10 +402,8 @@ int simple_offset_rename_exchange(struct inode *old_dir,
 	return 0;
 
 out_restore:
-	offset_set(old_dentry, old_index);
-	xa_store(&old_ctx->xa, old_index, old_dentry, GFP_KERNEL);
-	offset_set(new_dentry, new_index);
-	xa_store(&new_ctx->xa, new_index, new_dentry, GFP_KERNEL);
+	(void)simple_offset_replace(old_ctx, old_dentry, old_index);
+	(void)simple_offset_replace(new_ctx, new_dentry, new_index);
 	return ret;
 }
 

From patchwork Fri Jan 24 19:19:40 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949876
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever
Subject: [RFC PATCH v6.6 05/10] libfs: Add simple_offset_rename() API
Date: Fri, 24 Jan 2025 14:19:40 -0500
Message-ID: <20250124191946.22308-6-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Chuck Lever

[ Upstream commit 5a1a25be995e1014abd01600479915683e356f5c ]

I'm about to fix a tmpfs rename bug that requires the use of
internal simple_offset helpers that are not available in mm/shmem.c

Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/20240415152057.4605-3-cel@kernel.org
Signed-off-by: Christian Brauner
Signed-off-by: Chuck Lever
---
 fs/libfs.c         | 21 +++++++++++++++++++++
 include/linux/fs.h |  2 ++
 mm/shmem.c         |  3 +--
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 2029cb6a0e15..15d8c3300dae 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -356,6 +356,27 @@ int simple_offset_empty(struct dentry *dentry)
 	return ret;
 }
 
+/**
+ * simple_offset_rename - handle directory offsets for rename
+ * @old_dir: parent directory of source entry
+ * @old_dentry: dentry of source entry
+ * @new_dir: parent_directory of destination entry
+ * @new_dentry: dentry of destination
+ *
+ * Caller provides appropriate serialization.
+ *
+ * Returns zero on success, a negative errno value on failure.
+ */
+int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
+			 struct inode *new_dir, struct dentry *new_dentry)
+{
+	struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir);
+	struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir);
+
+	simple_offset_remove(old_ctx, old_dentry);
+	return simple_offset_add(new_ctx, old_dentry);
+}
+
 /**
  * simple_offset_rename_exchange - exchange rename with directory offsets
  * @old_dir: parent of dentry being moved
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 5104405ce3e6..e4d139fcaad0 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3198,6 +3198,8 @@ void simple_offset_init(struct offset_ctx *octx);
 int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry);
 void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry);
 int simple_offset_empty(struct dentry *dentry);
+int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
+			 struct inode *new_dir, struct dentry *new_dentry);
 int simple_offset_rename_exchange(struct inode *old_dir,
 				  struct dentry *old_dentry,
 				  struct inode *new_dir,
diff --git a/mm/shmem.c b/mm/shmem.c
index aaf679976f3b..ab2b0e87b051 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3434,8 +3434,7 @@ static int shmem_rename2(struct mnt_idmap *idmap,
 			return error;
 	}
 
-	simple_offset_remove(shmem_get_offset_ctx(old_dir), old_dentry);
-	error = simple_offset_add(shmem_get_offset_ctx(new_dir), old_dentry);
+	error = simple_offset_rename(old_dir, old_dentry, new_dir, new_dentry);
 	if (error)
 		return error;
 

From patchwork Fri Jan 24 19:19:41 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949877
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever
Subject: [RFC PATCH v6.6 06/10] shmem: Fix shmem_rename2()
Date: Fri, 24 Jan 2025 14:19:41 -0500
Message-ID: <20250124191946.22308-7-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

From: Chuck Lever

[ Upstream commit ad191eb6d6942bb835a0b20b647f7c53c1d99ca4 ]

When renaming onto an existing directory entry, user space expects
the replacement entry to have the same directory offset as the
original one.

Link: https://gitlab.alpinelinux.org/alpine/aports/-/issues/15966
Fixes: a2e459555c5f ("shmem: stable directory offsets")
Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/20240415152057.4605-4-cel@kernel.org
Signed-off-by: Christian Brauner
Signed-off-by: Chuck Lever
---
 fs/libfs.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/fs/libfs.c b/fs/libfs.c
index 15d8c3300dae..ab9fc182fd22 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -365,6 +365,9 @@ int simple_offset_empty(struct dentry *dentry)
  *
  * Caller provides appropriate serialization.
  *
+ * User space expects the directory offset value of the replaced
+ * (new) directory entry to be unchanged after a rename.
+ *
  * Returns zero on success, a negative errno value on failure.
  */
 int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
@@ -372,8 +375,14 @@ int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
 {
 	struct offset_ctx *old_ctx = old_dir->i_op->get_offset_ctx(old_dir);
 	struct offset_ctx *new_ctx = new_dir->i_op->get_offset_ctx(new_dir);
+	long new_offset = dentry2offset(new_dentry);
 
 	simple_offset_remove(old_ctx, old_dentry);
+
+	if (new_offset) {
+		offset_set(new_dentry, 0);
+		return simple_offset_replace(new_ctx, old_dentry, new_offset);
+	}
 	return simple_offset_add(new_ctx, old_dentry);
 }
 

From patchwork Fri Jan 24 19:19:42 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949878
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever, Jeff Layton
Subject: [RFC PATCH v6.6 07/10] libfs: Return ENOSPC when the directory offset range is exhausted
Date: Fri, 24 Jan 2025 14:19:42 -0500
Message-ID: <20250124191946.22308-8-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
MIME-Version: 1.0

From: Chuck Lever

[ Upstream commit 903dc9c43a155e0893280c7472d4a9a3a83d75a6 ]

Testing shows that the EBUSY error return from mtree_alloc_cyclic()
leaks into user space. The ERRORS section of "man creat(2)" says:

>	EBUSY	O_EXCL was specified in flags and pathname refers
>		to a block device that is in use by the system
>		(e.g., it is mounted).

ENOSPC is closer to what applications expect in this situation.

Note that the normal range of simple directory offset values is
2..2^63, so hitting this error is going to be rare to impossible.

Fixes: 6faddda69f62 ("libfs: Add directory operations for stable offsets")
Cc: stable@vger.kernel.org # v6.9+
Reviewed-by: Jeff Layton
Reviewed-by: Yang Erkun
Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/20241228175522.1854234-2-cel@kernel.org
Signed-off-by: Christian Brauner
[ cel: adjusted to apply to origin/linux-6.6.y ]
Signed-off-by: Chuck Lever
---
 fs/libfs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index ab9fc182fd22..200bcfc2ac34 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -287,8 +287,8 @@ int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
 
 	ret = xa_alloc_cyclic(&octx->xa, &offset, dentry, limit,
 			      &octx->next_offset, GFP_KERNEL);
-	if (ret < 0)
-		return ret;
+	if (unlikely(ret < 0))
+		return ret == -EBUSY ? -ENOSPC : ret;
 
 	offset_set(dentry, offset);
 	return 0;

From patchwork Fri Jan 24 19:19:43 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949879
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever
Subject: [RFC PATCH v6.6 08/10] Revert "libfs: Add simple_offset_empty()"
Date: Fri, 24 Jan 2025 14:19:43 -0500
Message-ID: <20250124191946.22308-9-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
MIME-Version: 1.0

From: Chuck Lever

[ Upstream commit d7bde4f27ceef3dc6d72010a20d4da23db835a32 ]

simple_empty() and simple_offset_empty() perform the same task.
The latter's use as a canary to find bugs has not found any new
issues. A subsequent patch will remove the use of the mtree for
iterating directory contents, so revert back to using a similar
mechanism for determining whether a directory is indeed empty.

Only one such mechanism is ever needed.

Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/20241228175522.1854234-3-cel@kernel.org
Reviewed-by: Yang Erkun
Signed-off-by: Christian Brauner
[ cel: adjusted to apply to origin/linux-6.6.y ]
Signed-off-by: Chuck Lever
---
 fs/libfs.c         | 32 --------------------------------
 include/linux/fs.h |  1 -
 mm/shmem.c         |  4 ++--
 3 files changed, 2 insertions(+), 35 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 200bcfc2ac34..082bacf0b9e6 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -324,38 +324,6 @@ void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry)
 	offset_set(dentry, 0);
 }
 
-/**
- * simple_offset_empty - Check if a dentry can be unlinked
- * @dentry: dentry to be tested
- *
- * Returns 0 if @dentry is a non-empty directory; otherwise returns 1.
- */
-int simple_offset_empty(struct dentry *dentry)
-{
-	struct inode *inode = d_inode(dentry);
-	struct offset_ctx *octx;
-	struct dentry *child;
-	unsigned long index;
-	int ret = 1;
-
-	if (!inode || !S_ISDIR(inode->i_mode))
-		return ret;
-
-	index = DIR_OFFSET_MIN;
-	octx = inode->i_op->get_offset_ctx(inode);
-	xa_for_each(&octx->xa, index, child) {
-		spin_lock(&child->d_lock);
-		if (simple_positive(child)) {
-			spin_unlock(&child->d_lock);
-			ret = 0;
-			break;
-		}
-		spin_unlock(&child->d_lock);
-	}
-
-	return ret;
-}
-
 /**
  * simple_offset_rename - handle directory offsets for rename
  * @old_dir: parent directory of source entry
diff --git a/include/linux/fs.h b/include/linux/fs.h
index e4d139fcaad0..e47596d354ff 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3197,7 +3197,6 @@ struct offset_ctx {
 void simple_offset_init(struct offset_ctx *octx);
 int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry);
 void simple_offset_remove(struct offset_ctx *octx, struct dentry *dentry);
-int simple_offset_empty(struct dentry *dentry);
 int simple_offset_rename(struct inode *old_dir, struct dentry *old_dentry,
 			 struct inode *new_dir, struct dentry *new_dentry);
 int simple_offset_rename_exchange(struct inode *old_dir,
diff --git a/mm/shmem.c b/mm/shmem.c
index ab2b0e87b051..283fb62084d4 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3368,7 +3368,7 @@ static int shmem_unlink(struct inode *dir, struct dentry *dentry)
 
 static int shmem_rmdir(struct inode *dir, struct dentry *dentry)
 {
-	if (!simple_offset_empty(dentry))
+	if (!simple_empty(dentry))
 		return -ENOTEMPTY;
 
 	drop_nlink(d_inode(dentry));
@@ -3425,7 +3425,7 @@ static int shmem_rename2(struct mnt_idmap *idmap,
 		return simple_offset_rename_exchange(old_dir, old_dentry,
 						     new_dir, new_dentry);
 
-	if (!simple_offset_empty(new_dentry))
+	if (!simple_empty(new_dentry))
 		return -ENOTEMPTY;
 
 	if (flags & RENAME_WHITEOUT) {

From patchwork Fri Jan 24 19:19:44 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949880
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever
Subject: [RFC PATCH v6.6 09/10] libfs: Replace simple_offset end-of-directory detection
Date: Fri, 24 Jan 2025 14:19:44 -0500
Message-ID: <20250124191946.22308-10-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
MIME-Version: 1.0

From: Chuck Lever

[ Upstream commit 68a3a65003145644efcbb651e91db249ccd96281 ]

According to getdents(3), the d_off field in each returned directory
entry points to the next entry in the directory. The d_off field in
the last returned entry in the readdir buffer must contain a valid
offset value, but if it points to an actual directory entry, then
readdir/getdents can loop.

This patch introduces a specific fixed offset value that is placed
in the d_off field of the last entry in a directory. Some user space
applications assume that the EOD offset value is larger than the
offsets of real directory entries, so the largest valid offset value
is reserved for this purpose. This new value is never allocated by
simple_offset_add().

When ->iterate_dir() returns, getdents{64} inserts the ctx->pos
value into the d_off field of the last valid entry in the readdir
buffer. When it hits EOD, offset_readdir() sets ctx->pos to the EOD
offset value so the last entry is updated to point to the EOD marker.

When trying to read the entry at the EOD offset, offset_readdir()
terminates immediately.

It is worth noting that using a Maple tree for directory offset
value allocation does not guarantee a 63-bit range of values --
on platforms where "long" is a 32-bit type, the directory offset
value range is still 0..(2^31 - 1). For broad compatibility with
32-bit user space, the largest tmpfs directory cookie value is now
S32_MAX.
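
The end-of-directory behavior described above can be sketched in user
space roughly as follows. This is an illustrative model only: the TOY_*
constants and toy_readdir() are hypothetical names, a sorted array stands
in for the offset tree, and INT32_MAX is assumed to play the role of
S32_MAX.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
	TOY_OFFSET_MIN = 2,			/* 0 and 1 are "." and ".." */
	TOY_OFFSET_EOD = INT32_MAX,		/* never given to an entry */
	TOY_OFFSET_MAX = TOY_OFFSET_EOD - 1,	/* largest allocatable offset */
};

/*
 * Emit up to @room entries from the sorted array @offsets (length @n),
 * starting at *@pos. Returns the number of entries emitted. Once no
 * entry at or after *@pos remains, *@pos becomes TOY_OFFSET_EOD.
 */
static size_t toy_readdir(const int32_t *offsets, size_t n,
			  int64_t *pos, size_t room)
{
	size_t emitted = 0;

	assert(offsets && pos);
	if (*pos == TOY_OFFSET_EOD)
		return 0;	/* reading at the EOD marker ends at once */

	for (size_t i = 0; i < n; i++) {
		if (offsets[i] < *pos)
			continue;
		if (emitted == room)
			return emitted;	/* buffer full; *pos resumes the scan */
		emitted++;
		*pos = (int64_t)offsets[i] + 1;
	}
	*pos = TOY_OFFSET_EOD;
	return emitted;
}
```

Because the marker is a fixed value outside the allocation range, a
caller that seeks back to the cookie stored in the last d_off gets zero
entries instead of wrapping around to a live entry.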

Fixes: 796432efab1e ("libfs: getdents() should return 0 after reaching EOD")
Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/20241228175522.1854234-5-cel@kernel.org
Signed-off-by: Christian Brauner
[ cel: adjusted to apply to origin/linux-6.6.y ]
Signed-off-by: Chuck Lever
---
 fs/libfs.c | 37 +++++++++++++++++++++----------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 082bacf0b9e6..d546f3f0c036 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -239,9 +239,15 @@ const struct inode_operations simple_dir_inode_operations = {
 };
 EXPORT_SYMBOL(simple_dir_inode_operations);
 
-/* 0 is '.', 1 is '..', so always start with offset 2 or more */
+/* simple_offset_add() never assigns these to a dentry */
 enum {
-	DIR_OFFSET_MIN	= 2,
+	DIR_OFFSET_EOD	= S32_MAX,
+};
+
+/* simple_offset_add() allocation range */
+enum {
+	DIR_OFFSET_MIN	= 2,
+	DIR_OFFSET_MAX	= DIR_OFFSET_EOD - 1,
 };
 
 static void offset_set(struct dentry *dentry, u32 offset)
@@ -278,7 +284,8 @@ void simple_offset_init(struct offset_ctx *octx)
  */
 int simple_offset_add(struct offset_ctx *octx, struct dentry *dentry)
 {
-	static const struct xa_limit limit = XA_LIMIT(DIR_OFFSET_MIN, U32_MAX);
+	static const struct xa_limit limit = XA_LIMIT(DIR_OFFSET_MIN,
+						      DIR_OFFSET_MAX);
 	u32 offset;
 	int ret;
 
@@ -442,8 +449,6 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 		return -EINVAL;
 	}
 
-	/* In this case, ->private_data is protected by f_pos_lock */
-	file->private_data = NULL;
 	return vfs_setpos(file, offset, U32_MAX);
 }
 
@@ -453,7 +458,7 @@ static struct dentry *offset_find_next(struct offset_ctx *octx, loff_t offset)
 	XA_STATE(xas, &octx->xa, offset);
 
 	rcu_read_lock();
-	child = xas_next_entry(&xas, U32_MAX);
+	child = xas_next_entry(&xas, DIR_OFFSET_MAX);
 	if (!child)
 		goto out;
 	spin_lock(&child->d_lock);
@@ -474,7 +479,7 @@ static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 			  inode->i_ino, fs_umode_to_dtype(inode->i_mode));
 }
 
-static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
+static void offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 {
 	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
 	struct dentry *dentry;
@@ -482,7 +487,7 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 	while (true) {
 		dentry = offset_find_next(octx, ctx->pos);
 		if (!dentry)
-			return ERR_PTR(-ENOENT);
+			goto out_eod;
 
 		if (!offset_dir_emit(ctx, dentry)) {
 			dput(dentry);
@@ -492,7 +497,10 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
 		ctx->pos = dentry2offset(dentry) + 1;
 		dput(dentry);
 	}
-	return NULL;
+	return;
+
+out_eod:
+	ctx->pos = DIR_OFFSET_EOD;
 }
 
 /**
@@ -512,6 +520,8 @@ static void *offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
  *
  * On return, @ctx->pos contains an offset that will read the next entry
  * in this directory when offset_readdir() is called again with @ctx.
+ * Caller places this value in the d_off field of the last entry in the
+ * user's buffer.
  *
  * Return values:
  *   %0 - Complete
@@ -524,13 +534,8 @@ static int offset_readdir(struct file *file, struct dir_context *ctx)
 
 	if (!dir_emit_dots(file, ctx))
 		return 0;
-
-	/* In this case, ->private_data is protected by f_pos_lock */
-	if (ctx->pos == DIR_OFFSET_MIN)
-		file->private_data = NULL;
-	else if (file->private_data == ERR_PTR(-ENOENT))
-		return 0;
-	file->private_data = offset_iterate_dir(d_inode(dir), ctx);
+	if (ctx->pos != DIR_OFFSET_EOD)
+		offset_iterate_dir(d_inode(dir), ctx);
 	return 0;
 }
 

From patchwork Fri Jan 24 19:19:45 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chuck Lever
X-Patchwork-Id: 13949881
From: cel@kernel.org
To: Hugh Dickins, Andrew Morton, Christian Brauner, Al Viro, Greg Kroah-Hartman, Sasha Levin
Cc: yukuai3@huawei.com, yangerkun@huawei.com, Chuck Lever
Subject: [RFC PATCH v6.6 10/10] libfs: Use d_children list to iterate simple_offset directories
Date: Fri, 24 Jan 2025 14:19:45 -0500
Message-ID: <20250124191946.22308-11-cel@kernel.org>
X-Mailer: git-send-email 2.47.0
In-Reply-To: <20250124191946.22308-1-cel@kernel.org>
References: <20250124191946.22308-1-cel@kernel.org>
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org
MIME-Version: 1.0

From: Chuck Lever

[ Upstream commit b9b588f22a0c049a14885399e27625635ae6ef91 ]

The mtree mechanism has been effective at creating directory offsets
that are stable over multiple opendir instances. However, it has not
been able to handle the subtleties of renames that are concurrent
with readdir.

Instead of using the mtree to emit entries in the order of their
offset values, use it only to map incoming ctx->pos to a starting
entry. Then use the directory's d_children list, which is already
maintained properly by the dcache, to find the next child to emit.

One of the sneaky things about this is that when the mtree-allocated
offset value wraps (which is very rare), looking up ctx->pos++ is
not going to find the next entry; it will return NULL. Instead, by
following the d_children list, the offset values can appear in any
order but all of the entries in the directory will be visited
eventually.

Note also that the readdir() is guaranteed to reach the tail of this
list. Entries are added only at the head of d_children, and readdir
walks from its current position in that list towards its tail.

Signed-off-by: Chuck Lever
Link: https://lore.kernel.org/r/20241228175522.1854234-6-cel@kernel.org
Signed-off-by: Christian Brauner
[ cel: adjusted to apply to origin/linux-6.6.y ]
Signed-off-by: Chuck Lever
---
 fs/libfs.c | 84 ++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 59 insertions(+), 25 deletions(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index d546f3f0c036..f5566964aa7d 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -241,12 +241,13 @@ EXPORT_SYMBOL(simple_dir_inode_operations);
 
 /* simple_offset_add() never assigns these to a dentry */
 enum {
+	DIR_OFFSET_FIRST	= 2,		/* Find first real entry */
 	DIR_OFFSET_EOD		= S32_MAX,
 };
 
 /* simple_offset_add() allocation range */
 enum {
-	DIR_OFFSET_MIN		= 2,
+	DIR_OFFSET_MIN		= DIR_OFFSET_FIRST + 1,
 	DIR_OFFSET_MAX		= DIR_OFFSET_EOD - 1,
 };
 
@@ -452,51 +453,84 @@ static loff_t offset_dir_llseek(struct file *file, loff_t offset, int whence)
 	return vfs_setpos(file, offset, U32_MAX);
 }
 
-static struct dentry *offset_find_next(struct offset_ctx *octx, loff_t offset)
+static struct dentry *find_positive_dentry(struct dentry *parent,
+					   struct dentry *dentry,
+					   bool next)
 {
+	struct dentry *found = NULL;
+
+	spin_lock(&parent->d_lock);
+	if (next)
+		dentry = list_next_entry(dentry, d_child);
+	else if (!dentry)
+		dentry = list_first_entry_or_null(&parent->d_subdirs,
+						  struct dentry, d_child);
+	for (; dentry && !list_entry_is_head(dentry, &parent->d_subdirs, d_child);
+	     dentry = list_next_entry(dentry, d_child)) {
+		if (!simple_positive(dentry))
+			continue;
+		spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
+		if (simple_positive(dentry))
+			found = dget_dlock(dentry);
+		spin_unlock(&dentry->d_lock);
+		if (likely(found))
+			break;
+	}
+	spin_unlock(&parent->d_lock);
+	return found;
+}
+
+static noinline_for_stack struct dentry *
+offset_dir_lookup(struct dentry *parent, loff_t offset)
+{
+	struct inode *inode = d_inode(parent);
+	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
 	struct dentry *child, *found = NULL;
+
 	XA_STATE(xas, &octx->xa, offset);
 
-	rcu_read_lock();
-	child = xas_next_entry(&xas, DIR_OFFSET_MAX);
-	if (!child)
-		goto out;
-	spin_lock(&child->d_lock);
-	if (simple_positive(child))
-		found = dget_dlock(child);
-	spin_unlock(&child->d_lock);
-out:
-	rcu_read_unlock();
+	if (offset == DIR_OFFSET_FIRST)
+		found = find_positive_dentry(parent, NULL, false);
+	else {
+		rcu_read_lock();
+		child = xas_next_entry(&xas, DIR_OFFSET_MAX);
+		found = find_positive_dentry(parent, child, false);
+		rcu_read_unlock();
+	}
 	return found;
 }
 
 static bool offset_dir_emit(struct dir_context *ctx, struct dentry *dentry)
 {
-	u32 offset = dentry2offset(dentry);
 	struct inode *inode = d_inode(dentry);
 
-	return ctx->actor(ctx, dentry->d_name.name, dentry->d_name.len, offset,
-			  inode->i_ino, fs_umode_to_dtype(inode->i_mode));
+	return dir_emit(ctx, dentry->d_name.name, dentry->d_name.len,
+			inode->i_ino, fs_umode_to_dtype(inode->i_mode));
 }
 
-static void offset_iterate_dir(struct inode *inode, struct dir_context *ctx)
+static void offset_iterate_dir(struct file *file, struct dir_context *ctx)
 {
-	struct offset_ctx *octx = inode->i_op->get_offset_ctx(inode);
+	struct dentry *dir = file->f_path.dentry;
 	struct dentry *dentry;
 
+	dentry = offset_dir_lookup(dir, ctx->pos);
+	if (!dentry)
+		goto out_eod;
 	while (true) {
-		dentry = offset_find_next(octx, ctx->pos);
-		if (!dentry)
-			goto out_eod;
+		struct dentry *next;
 
-		if (!offset_dir_emit(ctx, dentry)) {
-			dput(dentry);
+		ctx->pos = dentry2offset(dentry);
+		if (!offset_dir_emit(ctx, dentry))
 			break;
-		}
 
-		ctx->pos = dentry2offset(dentry) + 1;
+		next = find_positive_dentry(dir, dentry, true);
 		dput(dentry);
+
+		if (!next)
+			goto out_eod;
+		dentry = next;
 	}
+	dput(dentry);
 	return;
 
 out_eod:
@@ -535,7 +569,7 @@ static int offset_readdir(struct file *file, struct dir_context *ctx)
 	if (!dir_emit_dots(file, ctx))
 		return 0;
 	if (ctx->pos != DIR_OFFSET_EOD)
-		offset_iterate_dir(d_inode(dir), ctx);
+		offset_iterate_dir(file, ctx);
 	return 0;
 }
 
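
The list-walking idea in this last patch can be modeled in user space as
below. This is a rough sketch under stated assumptions, not the kernel
implementation: the toy_* names are hypothetical, a singly linked list
stands in for d_children, an exact-match linear scan stands in for the
tree lookup of ctx->pos, and no locking or reference counting is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_child {
	long offset;
	bool positive;		/* false models a negative dentry */
	struct toy_child *next;	/* towards the tail of the sibling list */
};

/* New entries go at the head, as the commit message describes. */
static void toy_add_head(struct toy_child **head, struct toy_child *c)
{
	assert(head && c);
	c->next = *head;
	*head = c;
}

/* Return the first positive entry at or after @c, or NULL at the tail. */
static struct toy_child *toy_find_positive(struct toy_child *c)
{
	while (c && !c->positive)
		c = c->next;
	return c;
}

/* Map an offset cookie to a starting entry (stand-in for the tree lookup). */
static struct toy_child *toy_lookup(struct toy_child *head, long offset)
{
	for (struct toy_child *c = head; c; c = c->next)
		if (c->offset == offset)
			return c;
	return NULL;
}

/*
 * Walk from @start to the tail, storing the offsets of positive entries
 * in @out (capacity @cap). Returns how many offsets were stored.
 */
static size_t toy_walk(struct toy_child *start, long *out, size_t cap)
{
	size_t n = 0;

	for (struct toy_child *c = toy_find_positive(start);
	     c && n < cap; c = toy_find_positive(c->next))
		out[n++] = c->offset;
	return n;
}
```

If entries with offsets 7, 9 and then 4 (after a wrap) are added in that
order, the walk visits them as 4, 9, 7: the offsets are out of numeric
order, yet every positive entry between the start and the tail is
reached, which is the property the commit message relies on.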