diff mbox series

libfs: Fix duplicate directory entry in offset_dir_lookup

Message ID 20250320034417.555810-1-sunyongjian@huaweicloud.com (mailing list archive)
State New
Headers show
Series libfs: Fix duplicate directory entry in offset_dir_lookup | expand

Commit Message

Yongjian Sun March 20, 2025, 3:44 a.m. UTC
From: Yongjian Sun <sunyongjian1@huawei.com>

There is an issue in the kernel:

In tmpfs, when using the "ls" command to list the contents
of a directory with a large number of files, glibc performs
the getdents call in multiple rounds. If a concurrent unlink
occurs between these getdents calls, it may lead to duplicate
directory entries in the ls output. One possible reproduction
scenario is as follows:

Create 1026 files and execute ls and rm concurrently:

for i in {1..1026}; do
    echo "This is file $i" > /tmp/dir/file$i
done

ls /tmp/dir				rm /tmp/dir/file4
	->getdents(file1026-file5)
						->unlink(file4)

	->getdents(file5,file3,file2,file1)

It is expected that the second getdents call to return file3
through file1, but instead it returns an extra file5.

The root cause of this problem is in the offset_dir_lookup
function. It uses mas_find to determine the starting position
for the current getdents call. Since mas_find locates the first
position that is greater than or equal to mas->index, when file4
is deleted, it ends up returning file5.

It can be fixed by replacing mas_find with mas_find_rev, which
finds the first position that is less than or equal to mas->index.

Fixes: b9b588f22a0c ("libfs: Use d_children list to iterate simple_offset directories")
Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com>
---
 fs/libfs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Chuck Lever March 20, 2025, 1:12 p.m. UTC | #1
On 3/19/25 11:44 PM, Yongjian Sun wrote:
> From: Yongjian Sun <sunyongjian1@huawei.com>
> 
> There is an issue in the kernel:
> 
> In tmpfs, when using the "ls" command to list the contents
> of a directory with a large number of files, glibc performs
> the getdents call in multiple rounds. If a concurrent unlink
> occurs between these getdents calls, it may lead to duplicate
> directory entries in the ls output. One possible reproduction
> scenario is as follows:
> 
> Create 1026 files and execute ls and rm concurrently:
> 
> for i in {1..1026}; do
>     echo "This is file $i" > /tmp/dir/file$i
> done
> 
> ls /tmp/dir				rm /tmp/dir/file4
> 	->getdents(file1026-file5)
> 						->unlink(file4)
> 
> 	->getdents(file5,file3,file2,file1)
> 
> It is expected that the second getdents call to return file3
> through file1, but instead it returns an extra file5.
> 
> The root cause of this problem is in the offset_dir_lookup
> function. It uses mas_find to determine the starting position
> for the current getdents call. Since mas_find locates the first
> position that is greater than or equal to mas->index, when file4
> is deleted, it ends up returning file5.
> 
> It can be fixed by replacing mas_find with mas_find_rev, which
> finds the first position that is less than or equal to mas->index.
> 
> Fixes: b9b588f22a0c ("libfs: Use d_children list to iterate simple_offset directories")
> Signed-off-by: Yongjian Sun <sunyongjian1@huawei.com>
> ---
>  fs/libfs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/libfs.c b/fs/libfs.c
> index 8444f5cc4064..dc042a975a56 100644
> --- a/fs/libfs.c
> +++ b/fs/libfs.c
> @@ -496,7 +496,7 @@ offset_dir_lookup(struct dentry *parent, loff_t offset)
>  		found = find_positive_dentry(parent, NULL, false);
>  	else {
>  		rcu_read_lock();
> -		child = mas_find(&mas, DIR_OFFSET_MAX);
> +		child = mas_find_rev(&mas, DIR_OFFSET_MIN);
>  		found = find_positive_dentry(parent, child, false);
>  		rcu_read_unlock();
>  	}

Reviewed-by: Chuck Lever <chuck.lever@oracle.com>

Not an objection, but only an observation: This does not help the
(hopefully exceptionally rare) case when the next entry has an offset
value /larger/ than the current entry because the offset values have
wrapped. I don't have any good ideas about how to deal with that.
Christian Brauner March 20, 2025, 1:28 p.m. UTC | #2
On Thu, 20 Mar 2025 11:44:17 +0800, Yongjian Sun wrote:
> There is an issue in the kernel:
> 
> In tmpfs, when using the "ls" command to list the contents
> of a directory with a large number of files, glibc performs
> the getdents call in multiple rounds. If a concurrent unlink
> occurs between these getdents calls, it may lead to duplicate
> directory entries in the ls output. One possible reproduction
> scenario is as follows:
> 
> [...]

Applied to the vfs.fixes branch of the vfs/vfs.git tree.
Patches in the vfs.fixes branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.fixes

[1/1] libfs: Fix duplicate directory entry in offset_dir_lookup
      https://git.kernel.org/vfs/vfs/c/f70681e9e606
diff mbox series

Patch

diff --git a/fs/libfs.c b/fs/libfs.c
index 8444f5cc4064..dc042a975a56 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -496,7 +496,7 @@  offset_dir_lookup(struct dentry *parent, loff_t offset)
 		found = find_positive_dentry(parent, NULL, false);
 	else {
 		rcu_read_lock();
-		child = mas_find(&mas, DIR_OFFSET_MAX);
+		child = mas_find_rev(&mas, DIR_OFFSET_MIN);
 		found = find_positive_dentry(parent, child, false);
 		rcu_read_unlock();
 	}