From patchwork Wed May 10 10:52:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dominique Martinet
X-Patchwork-Id: 13236763
From: Dominique Martinet
Date: Wed, 10 May 2023 19:52:49 +0900
Subject: [PATCH v2 1/6] fs: split off vfs_getdents function of getdents64 syscall
Message-Id: <20230422-uring-getdents-v2-1-2db1e37dc55e@codewreck.org>
References: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
In-Reply-To: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
To: Alexander Viro, Christian Brauner, Jens Axboe, Pavel Begunkov, Stefan Roesch
Cc: Clay Harris, Dave Chinner, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, io-uring@vger.kernel.org, Dominique Martinet
X-Mailer: b4 0.13-dev-f371f
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

This splits off the vfs_getdents function from the getdents64 system
call. This will allow io_uring to call the vfs_getdents function.

Co-authored-by: Stefan Roesch
Signed-off-by: Dominique Martinet
---
 fs/internal.h |  8 ++++++++
 fs/readdir.c  | 34 ++++++++++++++++++++++++++--------
 2 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index bd3b2810a36b..e8ca000e6613 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -260,3 +260,11 @@ ssize_t __kernel_write_iter(struct file *file, struct iov_iter *from, loff_t *po
 struct mnt_idmap *alloc_mnt_idmap(struct user_namespace *mnt_userns);
 struct mnt_idmap *mnt_idmap_get(struct mnt_idmap *idmap);
 void mnt_idmap_put(struct mnt_idmap *idmap);
+
+/*
+ * fs/readdir.c
+ */
+struct linux_dirent64;
+
+int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent,
+		 unsigned int count);
diff --git a/fs/readdir.c b/fs/readdir.c
index 9c53edb60c03..ed0803d0011e 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include "internal.h"
 
 #include
 
@@ -351,10 +352,16 @@ static bool filldir64(struct dir_context *ctx, const char *name, int namlen,
 	return false;
 }
 
-SYSCALL_DEFINE3(getdents64, unsigned int, fd,
-		struct linux_dirent64 __user *, dirent, unsigned int, count)
+
+/**
+ * vfs_getdents - getdents without fdget
+ * @file : pointer to file struct of directory
+ * @dirent : pointer to user directory structure
+ * @count : size of buffer
+ */
+int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent,
+		 unsigned int count)
 {
-	struct fd f;
 	struct getdents_callback64 buf = {
 		.ctx.actor = filldir64,
 		.count = count,
@@ -362,11 +369,7 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd,
 	};
 	int error;
 
-	f = fdget_pos(fd);
-	if (!f.file)
-		return -EBADF;
-
-	error = iterate_dir(f.file, &buf.ctx);
+	error = iterate_dir(file, &buf.ctx);
 	if (error >= 0)
 		error = buf.error;
 	if (buf.prev_reclen) {
@@ -379,6 +382,21 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd,
 		else
 			error = count - buf.count;
 	}
+	return error;
+}
+
+SYSCALL_DEFINE3(getdents64, unsigned int, fd,
+		struct linux_dirent64 __user *, dirent, unsigned int, count)
+{
+	struct fd f;
+	int error;
+
+	f = fdget_pos(fd);
+	if (!f.file)
+		return -EBADF;
+
+	error = vfs_getdents(f.file, dirent, count);
+
 	fdput_pos(f);
 	return error;
 }

From patchwork Wed May 10 10:52:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dominique Martinet
X-Patchwork-Id: 13236765
From: Dominique Martinet
Date: Wed, 10 May 2023 19:52:50 +0900
Subject: [PATCH v2 2/6] vfs_getdents/struct dir_context: add flags field
Message-Id: <20230422-uring-getdents-v2-2-2db1e37dc55e@codewreck.org>
References: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
In-Reply-To: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
To: Alexander Viro, Christian Brauner, Jens Axboe, Pavel Begunkov, Stefan Roesch
Cc: Clay Harris, Dave Chinner, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, io-uring@vger.kernel.org, Dominique Martinet
X-Mailer: b4 0.13-dev-f371f
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

The flags will allow passing DIR_CONTEXT_F_NOWAIT to iterate()
implementations that support it (as signaled through FMODE_NOWAIT
in file->f_mode).

Notes:
- considered using IOCB_NOWAIT, but if we add more flags later it
would be confusing to keep track of which values are valid, so use
dedicated flags
- might want to check in iterate_dir() that
ctx.flags & DIR_CONTEXT_F_NOWAIT is only set when
file->f_mode & FMODE_NOWAIT, e.g. with a WARN_ONCE?
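
As a rough illustration of the consistency check floated in the second
note, here is a minimal user-space sketch. It is not kernel code and not
part of this patch: the mock_ structs, the MOCK_ constants and the helper
name nowait_flags_consistent() are hypothetical stand-ins for
illustration only, and the constant values are placeholders rather than
the kernel's definitions.

```c
#include <stdbool.h>

/* Placeholder values for illustration; the real definitions live in
 * include/linux/fs.h and are not reproduced here. */
#define MOCK_FMODE_NOWAIT		0x8000000u
#define MOCK_DIR_CONTEXT_F_NOWAIT	0x1ul

/* Hypothetical stand-ins for struct file and struct dir_context. */
struct mock_file {
	unsigned int f_mode;
};

struct mock_dir_context {
	unsigned long flags;
};

/*
 * Returns true when the combination is consistent: a nowait request is
 * only acceptable if the file advertises nowait support. A kernel-side
 * check could warn once when this returns false.
 */
static bool nowait_flags_consistent(const struct mock_file *file,
				    const struct mock_dir_context *ctx)
{
	if (!(ctx->flags & MOCK_DIR_CONTEXT_F_NOWAIT))
		return true;
	return (file->f_mode & MOCK_FMODE_NOWAIT) != 0;
}
```

Whether such a check belongs in iterate_dir() or in the callers is left
open here, as in the note itself.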

Signed-off-by: Dominique Martinet
---
 fs/internal.h      | 2 +-
 fs/readdir.c       | 6 ++++--
 include/linux/fs.h | 8 ++++++++
 3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index e8ca000e6613..0264b001d99a 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -267,4 +267,4 @@ void mnt_idmap_put(struct mnt_idmap *idmap);
 struct linux_dirent64;
 
 int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent,
-		 unsigned int count);
+		 unsigned int count, unsigned long flags);
diff --git a/fs/readdir.c b/fs/readdir.c
index ed0803d0011e..1311b89d75e1 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -358,12 +358,14 @@ static bool filldir64(struct dir_context *ctx, const char *name, int namlen,
  * @file : pointer to file struct of directory
  * @dirent : pointer to user directory structure
  * @count : size of buffer
+ * @flags : additional dir_context flags
  */
 int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent,
-		 unsigned int count)
+		 unsigned int count, unsigned long flags)
 {
 	struct getdents_callback64 buf = {
 		.ctx.actor = filldir64,
+		.ctx.flags = flags,
 		.count = count,
 		.current_dir = dirent
 	};
@@ -395,7 +397,7 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd,
 	if (!f.file)
 		return -EBADF;
 
-	error = vfs_getdents(f.file, dirent, count);
+	error = vfs_getdents(f.file, dirent, count, 0);
 
 	fdput_pos(f);
 	return error;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21a981680856..f7de2b5ca38e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1716,8 +1716,16 @@ typedef bool (*filldir_t)(struct dir_context *, const char *, int, loff_t, u64,
 struct dir_context {
 	filldir_t actor;
 	loff_t pos;
+	unsigned long flags;
 };
 
+/*
+ * flags for dir_context flags
+ * DIR_CONTEXT_F_NOWAIT: Request non-blocking iterate
+ * (requires file->f_mode & FMODE_NOWAIT)
+ */
+#define DIR_CONTEXT_F_NOWAIT 0x1
+
 /*
  * These flags let !MMU mmap() govern direct device mapping vs immediate
  * copying more easily for MAP_PRIVATE, especially for ROM filesystems.

From patchwork Wed May 10 10:52:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dominique Martinet
X-Patchwork-Id: 13236764
From: Dominique Martinet
Date: Wed, 10 May 2023 19:52:51 +0900
Subject: [PATCH v2 3/6] io_uring: add support for getdents
Message-Id: <20230422-uring-getdents-v2-3-2db1e37dc55e@codewreck.org>
References: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
In-Reply-To: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
To: Alexander Viro, Christian Brauner, Jens Axboe, Pavel Begunkov, Stefan Roesch
Cc: Clay Harris, Dave Chinner, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, io-uring@vger.kernel.org, Dominique Martinet
X-Mailer: b4 0.13-dev-f371f
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

This adds support for getdents64 to io_uring, acting exactly like the
syscall: the directory is iterated from its current position as stored
in the file struct, and the file's position is updated exactly as if
getdents64 had been called.

Additionally, since io_uring has no way of issuing a seek, a flag
IORING_GETDENTS_REWIND has been added that will seek to the start of
the directory like rewinddir(3) for users that might require such a
thing. This will act exactly as if seek then getdents64 had been called
sequentially, with no atomicity guarantee: it is the responsibility of
the caller not to use getdents multiple times on a single file in
parallel if they want useful results, as is currently the case when
using the syscall from multiple threads.

For filesystems that support NOWAIT in iterate_shared(), try to use it
first; if a user already knows the filesystem they use does not support
nowait, they can force async through IOSQE_ASYNC in the sqe flags,
avoiding the need to bounce back through a useless EAGAIN return.
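
For reference, the position semantics described above can be observed
with the plain syscall. The sketch below is user-space code against the
existing getdents64 syscall and lseek, not against the new opcode; the
helper names count_remaining_entries() and rewind_and_count() are made
up for this example, and the struct is declared locally on the
assumption that libc does not export it.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Record layout written by getdents64, declared locally for the sketch. */
struct linux_dirent64 {
	uint64_t d_ino;
	int64_t d_off;
	unsigned short d_reclen;
	unsigned char d_type;
	char d_name[];
};

/*
 * Read from the directory's current position until end of directory.
 * Returns the number of entries seen, or -1 on error. The position
 * stored in the open file advances as a side effect.
 */
static long count_remaining_entries(int fd)
{
	_Alignas(8) char buf[4096];
	long total = 0;

	for (;;) {
		long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));

		if (n < 0)
			return -1;
		if (n == 0)
			return total;
		for (long off = 0; off < n; ) {
			struct linux_dirent64 *d =
				(struct linux_dirent64 *)(buf + off);

			off += d->d_reclen;
			total++;
		}
	}
}

/* Seek back to the start, then read everything again: two separate
 * steps with no atomicity between them. */
static long rewind_and_count(int fd)
{
	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	return count_remaining_entries(fd);
}
```

A second call to count_remaining_entries() on the same fd without a
rewind finds nothing left to read, which is the behaviour the rewind
flag is meant to work around.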

(Note that prep already forces async if rewind is requested.)

Signed-off-by: Dominique Martinet
---
 include/uapi/linux/io_uring.h |  7 ++++++
 io_uring/fs.c                 | 57 +++++++++++++++++++++++++++++++++++++++++++
 io_uring/fs.h                 |  3 +++
 io_uring/opdef.c              |  8 ++++++
 4 files changed, 75 insertions(+)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 0716cb17e436..35d0de18d893 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -65,6 +65,7 @@ struct io_uring_sqe {
 		__u32 xattr_flags;
 		__u32 msg_ring_flags;
 		__u32 uring_cmd_flags;
+		__u32 getdents_flags;
 	};
 	__u64 user_data; /* data to be passed back at completion time */
 	/* pack this to avoid bogus arm OABI complaints */
@@ -223,6 +224,7 @@ enum io_uring_op {
 	IORING_OP_URING_CMD,
 	IORING_OP_SEND_ZC,
 	IORING_OP_SENDMSG_ZC,
+	IORING_OP_GETDENTS,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
@@ -259,6 +261,11 @@ enum io_uring_op {
  */
 #define SPLICE_F_FD_IN_FIXED (1U << 31) /* the last bit of __u32 */
 
+/*
+ * sqe->getdents_flags
+ */
+#define IORING_GETDENTS_REWIND (1U << 0)
+
 /*
  * POLL_ADD flags. Note that since sqe->poll_events is the flag space, the
  * command flags for POLL_ADD are stored in sqe->len.
diff --git a/io_uring/fs.c b/io_uring/fs.c
index f6a69a549fd4..b15ec81c1ed2 100644
--- a/io_uring/fs.c
+++ b/io_uring/fs.c
@@ -47,6 +47,13 @@ struct io_link {
 	int flags;
 };
 
+struct io_getdents {
+	struct file *file;
+	struct linux_dirent64 __user *dirent;
+	unsigned int count;
+	int flags;
+};
+
 int io_renameat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_rename *ren = io_kiocb_to_cmd(req, struct io_rename);
@@ -291,3 +298,53 @@ void io_link_cleanup(struct io_kiocb *req)
 	putname(sl->oldpath);
 	putname(sl->newpath);
 }
+
+int io_getdents_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_getdents *gd = io_kiocb_to_cmd(req, struct io_getdents);
+
+	if (READ_ONCE(sqe->off) != 0)
+		return -EINVAL;
+
+	gd->dirent = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	gd->count = READ_ONCE(sqe->len);
+	gd->flags = READ_ONCE(sqe->getdents_flags);
+	if (gd->flags & ~IORING_GETDENTS_REWIND)
+		return -EINVAL;
+	/* rewind cannot be nowait */
+	if (gd->flags & IORING_GETDENTS_REWIND)
+		req->flags |= REQ_F_FORCE_ASYNC;
+
+	return 0;
+}
+
+int io_getdents(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_getdents *gd = io_kiocb_to_cmd(req, struct io_getdents);
+	unsigned long getdents_flags = 0;
+	int ret;
+
+	if (issue_flags & IO_URING_F_NONBLOCK) {
+		if (!(req->file->f_mode & FMODE_NOWAIT))
+			return -EAGAIN;
+
+		getdents_flags = DIR_CONTEXT_F_NOWAIT;
+	}
+	if ((gd->flags & IORING_GETDENTS_REWIND)) {
+		WARN_ON_ONCE(issue_flags & IO_URING_F_NONBLOCK);
+
+		ret = vfs_llseek(req->file, 0, SEEK_SET);
+		if (ret < 0)
+			goto out;
+	}
+
+	ret = vfs_getdents(req->file, gd->dirent, gd->count, getdents_flags);
+out:
+	if (ret == -EAGAIN &&
+	    (issue_flags & IO_URING_F_NONBLOCK))
+		return -EAGAIN;
+
+	io_req_set_res(req, ret, 0);
+	return 0;
+}
+
diff --git a/io_uring/fs.h b/io_uring/fs.h
index 0bb5efe3d6bb..f83a6f3a678d 100644
--- a/io_uring/fs.h
+++ b/io_uring/fs.h
@@ -18,3 +18,6 @@ int io_symlinkat(struct io_kiocb *req, unsigned int issue_flags);
 int io_linkat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_linkat(struct io_kiocb *req, unsigned int issue_flags);
 void io_link_cleanup(struct io_kiocb *req);
+
+int io_getdents_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_getdents(struct io_kiocb *req, unsigned int issue_flags);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index cca7c5b55208..8f770c582ab3 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -428,6 +428,11 @@ const struct io_issue_def io_issue_defs[] = {
 		.prep = io_eopnotsupp_prep,
 #endif
 	},
+	[IORING_OP_GETDENTS] = {
+		.needs_file = 1,
+		.prep = io_getdents_prep,
+		.issue = io_getdents,
+	},
 };
 
 
@@ -648,6 +653,9 @@ const struct io_cold_def io_cold_defs[] = {
 		.fail = io_sendrecv_fail,
 #endif
 	},
+	[IORING_OP_GETDENTS] = {
+		.name = "GETDENTS",
+	},
 };
 
 const char *io_uring_get_opcode(u8 opcode)

From patchwork Wed May 10 10:52:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dominique Martinet
X-Patchwork-Id: 13236767
From: Dominique Martinet
Date: Wed, 10 May 2023 19:52:52 +0900
Subject: [PATCH v2 4/6] kernfs: implement readdir FMODE_NOWAIT
Message-Id: <20230422-uring-getdents-v2-4-2db1e37dc55e@codewreck.org>
References: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
In-Reply-To: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
To: Alexander Viro, Christian Brauner, Jens Axboe, Pavel Begunkov, Stefan Roesch
Cc: Clay Harris, Dave Chinner, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, io-uring@vger.kernel.org, Dominique Martinet
X-Mailer: b4 0.13-dev-f371f
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

Since down_read can block, use the _trylock variant if NOWAIT has been
requested.

(can probably do a little bit better style-wise)

Signed-off-by: Dominique Martinet
---
 fs/kernfs/dir.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 45b6919903e6..5a5b3e7881bf 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1824,7 +1824,12 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 		return 0;
 
 	root = kernfs_root(parent);
-	down_read(&root->kernfs_rwsem);
+	if (ctx->flags & DIR_CONTEXT_F_NOWAIT) {
+		if (!down_read_trylock(&root->kernfs_rwsem))
+			return -EAGAIN;
+	} else {
+		down_read(&root->kernfs_rwsem);
+	}
 
 	if (kernfs_ns_enabled(parent))
 		ns = kernfs_info(dentry->d_sb)->ns;
@@ -1845,6 +1850,12 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 		if (!dir_emit(ctx, name, len, ino, type))
 			return 0;
 		down_read(&root->kernfs_rwsem);
+		if (ctx->flags & DIR_CONTEXT_F_NOWAIT) {
+			if (!down_read_trylock(&root->kernfs_rwsem))
+				return 0;
+		} else {
+			down_read(&root->kernfs_rwsem);
+		}
 	}
 	up_read(&root->kernfs_rwsem);
 	file->private_data = NULL;
@@ -1852,7 +1863,14 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx)
 	return 0;
 }
 
+static int kernfs_fop_dir_open(struct inode *inode, struct file *file)
+{
+	file->f_mode |= FMODE_NOWAIT;
+	return 0;
+}
+
 const struct file_operations kernfs_dir_fops = {
+	.open = kernfs_fop_dir_open,
 	.read = generic_read_dir,
 	.iterate_shared = kernfs_fop_readdir,
 	.release = kernfs_dir_fop_release,

From patchwork Wed May 10 10:52:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dominique Martinet
X-Patchwork-Id: 13236766
From: Dominique Martinet
Date: Wed, 10 May 2023 19:52:53 +0900
Subject: [PATCH v2 5/6] libfs: set FMODE_NOWAIT on dir open
Message-Id: <20230422-uring-getdents-v2-5-2db1e37dc55e@codewreck.org>
References: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
In-Reply-To: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org>
To: Alexander Viro, Christian Brauner, Jens Axboe, Pavel Begunkov, Stefan Roesch
Cc: Clay Harris, Dave Chinner, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, io-uring@vger.kernel.org, Dominique Martinet
X-Mailer: b4 0.13-dev-f371f
Precedence: bulk
X-Mailing-List: linux-fsdevel@vger.kernel.org

The readdir can technically wait a bit on a spinlock, but that should
never wait for long enough to return EAGAIN -- just set the capability
flag on the directory's f_mode.

Signed-off-by: Dominique Martinet
--- fs/libfs.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/libfs.c b/fs/libfs.c index 89cf614a3271..a3c7e42d90a7 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -81,6 +81,7 @@ EXPORT_SYMBOL(simple_lookup); int dcache_dir_open(struct inode *inode, struct file *file) { file->private_data = d_alloc_cursor(file->f_path.dentry); + file->f_mode |= FMODE_NOWAIT; return file->private_data ? 0 : -ENOMEM; } From patchwork Wed May 10 10:52:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dominique Martinet X-Patchwork-Id: 13236768 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4DBD7C77B7C for ; Wed, 10 May 2023 10:54:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236727AbjEJKyY (ORCPT ); Wed, 10 May 2023 06:54:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56244 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236748AbjEJKxm (ORCPT ); Wed, 10 May 2023 06:53:42 -0400 Received: from nautica.notk.org (ipv6.notk.org [IPv6:2001:41d0:1:7a93::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3B96D7AA5; Wed, 10 May 2023 03:53:33 -0700 (PDT) Received: by nautica.notk.org (Postfix, from userid 108) id 98B63C01E; Wed, 10 May 2023 12:53:31 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=codewreck.org; s=2; t=1683716011; bh=5MT6qB5ApvVN1dNUb6Euqb9qa0bf88BcEZ2ppeeLcL8=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=0cpQFerv+bBMhI7Mb3hxcD2jLWNdUZV+9n39ITvB/6lA7Y70SBv2uKqD1+eXnYFzN f4szV6HMVik/dxrZ8eRC19+xN+8nN72L2FgDbwuFOzThU2VfDBffmOiP3nSG+5JEi1 ld7YT/Z+pBgAT05iNTI5LH/5NEazE5xSTTCmP0NrFkLoDAdpBxYTqXTa2qBuEsVYf1 tjflXygr/8A4XdJ4XBsGYS/lAPeSGa2l52/S3SP3T+xmCyyhXLLlCYOw1V+qTYV0BO 
OOnURhTOU7aSd8qqdxvYDgLgsv47aCf5htq0GPCFPF3iZBqj0bbIemggqEC0euAIM+ mjhyTyl5yQaow== Received: from odin.codewreck.org (localhost [127.0.0.1]) by nautica.notk.org (Postfix) with ESMTPS id 38023C021; Wed, 10 May 2023 12:53:23 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=codewreck.org; s=2; t=1683716010; bh=5MT6qB5ApvVN1dNUb6Euqb9qa0bf88BcEZ2ppeeLcL8=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=LEwGAG+S61KRMI5strlApS4avLmZSfP9A4cU+u7+5zkn/0uCOxTnfnZ0AM92+L3QR 73XnFzUmr0n/+u3A586LsJpfvVKIsy456BHiIbQU8k08YGIonSTgfMG0A18I1pJ/pW /vi8alO4kG6E5aIoQvGPCvk38ZVI5LNg74zk2D8mms/gTpEC1YR+yqB/rhas/GplqQ S7fmSVFHyztq5o6cf0DmxUKnL+tV+mjtXAQqYR+aEsFzjto9jsILR+96q5+PwEoeDZ 77Sj+PAqAAGZ6hXNPIS7RqWiP9sZcjDRE0JoZQ40xmFAXzi2vOKhhfo2DjnQN0gXL3 VRDYqcDYS3M+g== Received: from [127.0.0.2] (localhost [::1]) by odin.codewreck.org (OpenSMTPD) with ESMTP id 6d09a1f5; Wed, 10 May 2023 10:53:02 +0000 (UTC) From: Dominique Martinet Date: Wed, 10 May 2023 19:52:54 +0900 Subject: [PATCH v2 6/6] RFC: io_uring getdents: test returning an EOF flag in CQE MIME-Version: 1.0 Message-Id: <20230422-uring-getdents-v2-6-2db1e37dc55e@codewreck.org> References: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org> In-Reply-To: <20230422-uring-getdents-v2-0-2db1e37dc55e@codewreck.org> To: Alexander Viro , Christian Brauner , Jens Axboe , Pavel Begunkov , Stefan Roesch Cc: Clay Harris , Dave Chinner , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, io-uring@vger.kernel.org, Dominique Martinet X-Mailer: b4 0.13-dev-f371f X-Developer-Signature: v=1; a=openpgp-sha256; l=6420; i=asmadeus@codewreck.org; h=from:subject:message-id; bh=EOQY5TIMFcFLfz+pvWNR1o8nxPHoG0dUpAU+Ei2h49c=; b=owEBbQKS/ZANAwAIAatOm+xqmOZwAcsmYgBkW3eOP/DmSN+k75DDH8LGzVVwQYQ04IO47OygJ fqAOQAsdMaJAjMEAAEIAB0WIQT8g9txgG5a3TOhiE6rTpvsapjmcAUCZFt3jgAKCRCrTpvsapjm cNpcD/0WCbelxKAz8+4oVltN6dyFWbD5b4RLLKuWU6kuMeN9iOvEMiKwNyxhrZ9XnLrzJ/j7B8Y 
YHSO63fVZeCImAKYUsaDSWwBRjISBkCoa7xd4SPCTOBd6h0TK62epEDA/gLyDC+ySv/1UUkXfQb l88gwbj0P0WO6b6IG5SniuV5bQc6HxzgbuBsPlrkSQhE639aeN0tJYJ/X/3Do6WmTW0r0F8tki5 a0mxZ3xBU7lKpI1+miIpn+B/1764Vr6f39yeGQVTt3FDv917NbrCjDxUX3/iD8zaCT2N60rVJB2 kl6jSuetvLqjO42gLII0sVVsLLA4kKe6BhkHWhrFhWJ8Xd4CmW/bhUadzMdX9tvLy6rKLc7+U3I Lm88y6PFHt9LGqpn+yEA+nfXel5jYDWfv8zs5frMW47X/K9ERDYplwF8e0UyiTx30ojtAEAzoYe 7+LXwhFYku7qDFE+arKPjHQXrlQxCOF26/tx9FfRkNFRuo3l9Rvq91UHiFqEW23m5BGSeuGVw/3 nD5F6qp0uP7ysB3qSQhZycbUR7SS8dSLD5801FVmScYg0sd/q8C03PYRSiyZIGVbWzOxJrHqfAE Zk7dWZXT95EvMJzeohRle0HGtl2yVU7ANPDf1stJR2uo/7LkCqhh4Aqvt2+XnRqMeQxeFHM+Bhq zWw/Kngpf219KFg== X-Developer-Key: i=asmadeus@codewreck.org; a=openpgp; fpr=B894379F662089525B3FB1B9333F1F391BBBB00A Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This turns out to be very slightly faster than an extra call to getdents, but in practice it doesn't seem to be much of an improvement, as the trailing getdents will return almost immediately and be absorbed by the scheduling noise in a find-like context (my "server" is too noisy to get proper benchmarks out, but results look slightly better with this in async mode, and almost identical in the NOWAIT path). If the user is waiting for the end of a single directory though it might be worth it, so including the patch for comments.
(in particular I'm not really happy that the flag has become in-out for vfs_getdents, especially when the getdents64 syscall does not use it, but I don't see much other way around it) If this approach is acceptable/wanted then this patch will be split down further (at least dir_context/vfs_getdents, kernfs, libfs, uring in four separate commits) Signed-off-by: Dominique Martinet --- fs/internal.h | 2 +- fs/kernfs/dir.c | 1 + fs/libfs.c | 9 ++++++--- fs/readdir.c | 10 ++++++---- include/linux/fs.h | 2 ++ include/uapi/linux/io_uring.h | 2 ++ io_uring/fs.c | 8 ++++++-- 7 files changed, 24 insertions(+), 10 deletions(-) diff --git a/fs/internal.h b/fs/internal.h index 0264b001d99a..0b1552c7a870 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -267,4 +267,4 @@ void mnt_idmap_put(struct mnt_idmap *idmap); struct linux_dirent64; int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent, - unsigned int count, unsigned long flags); + unsigned int count, unsigned long *flags); diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 5a5b3e7881bf..53a6b4804c34 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -1860,6 +1860,7 @@ static int kernfs_fop_readdir(struct file *file, struct dir_context *ctx) up_read(&root->kernfs_rwsem); file->private_data = NULL; ctx->pos = INT_MAX; + ctx->flags |= DIR_CONTEXT_F_EOD; return 0; } diff --git a/fs/libfs.c b/fs/libfs.c index a3c7e42d90a7..b2a95dadffbd 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -208,10 +208,12 @@ int dcache_readdir(struct file *file, struct dir_context *ctx) p = &next->d_child; } spin_lock(&dentry->d_lock); - if (next) + if (next) { list_move_tail(&cursor->d_child, &next->d_child); - else + } else { list_del_init(&cursor->d_child); + ctx->flags |= DIR_CONTEXT_F_EOD; + } spin_unlock(&dentry->d_lock); dput(next); @@ -1347,7 +1349,8 @@ static loff_t empty_dir_llseek(struct file *file, loff_t offset, int whence) static int empty_dir_readdir(struct file *file, struct dir_context *ctx) { - dir_emit_dots(file, 
ctx); + if (dir_emit_dots(file, ctx)) + ctx->flags |= DIR_CONTEXT_F_EOD; return 0; } diff --git a/fs/readdir.c b/fs/readdir.c index 1311b89d75e1..be75a2154b4f 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -358,14 +358,14 @@ static bool filldir64(struct dir_context *ctx, const char *name, int namlen, * @file : pointer to file struct of directory * @dirent : pointer to user directory structure * @count : size of buffer - * @flags : additional dir_context flags + * @flags : pointer to additional dir_context flags */ int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent, - unsigned int count, unsigned long flags) + unsigned int count, unsigned long *flags) { struct getdents_callback64 buf = { .ctx.actor = filldir64, - .ctx.flags = flags, + .ctx.flags = flags ? *flags : 0, .count = count, .current_dir = dirent }; @@ -384,6 +384,8 @@ int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent, else error = count - buf.count; } + if (flags) + *flags = buf.ctx.flags; return error; } @@ -397,7 +399,7 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd, if (!f.file) return -EBADF; - error = vfs_getdents(f.file, dirent, count, 0); + error = vfs_getdents(f.file, dirent, count, NULL); fdput_pos(f); return error; diff --git a/include/linux/fs.h b/include/linux/fs.h index f7de2b5ca38e..d1e31bccfb4f 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1723,8 +1723,10 @@ struct dir_context { * flags for dir_context flags * DIR_CONTEXT_F_NOWAIT: Request non-blocking iterate * (requires file->f_mode & FMODE_NOWAIT) + * DIR_CONTEXT_F_EOD: Signal directory has been fully iterated, set by the fs */ #define DIR_CONTEXT_F_NOWAIT 0x1 +#define DIR_CONTEXT_F_EOD 0x2 /* * These flags let !MMU mmap() govern direct device mapping vs immediate diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 35d0de18d893..35877132027e 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -381,11 +381,13 @@ struct 
io_uring_cqe { * IORING_CQE_F_SOCK_NONEMPTY If set, more data to read after socket recv * IORING_CQE_F_NOTIF Set for notification CQEs. Can be used to distinct * them from sends. + * IORING_CQE_F_EOF If set, file or directory has reached end of file. */ #define IORING_CQE_F_BUFFER (1U << 0) #define IORING_CQE_F_MORE (1U << 1) #define IORING_CQE_F_SOCK_NONEMPTY (1U << 2) #define IORING_CQE_F_NOTIF (1U << 3) +#define IORING_CQE_F_EOF (1U << 4) enum { IORING_CQE_BUFFER_SHIFT = 16, diff --git a/io_uring/fs.c b/io_uring/fs.c index b15ec81c1ed2..f6222b0148ef 100644 --- a/io_uring/fs.c +++ b/io_uring/fs.c @@ -322,6 +322,7 @@ int io_getdents(struct io_kiocb *req, unsigned int issue_flags) { struct io_getdents *gd = io_kiocb_to_cmd(req, struct io_getdents); unsigned long getdents_flags = 0; + u32 cqe_flags = 0; int ret; if (issue_flags & IO_URING_F_NONBLOCK) { @@ -338,13 +339,16 @@ int io_getdents(struct io_kiocb *req, unsigned int issue_flags) goto out; } - ret = vfs_getdents(req->file, gd->dirent, gd->count, getdents_flags); + ret = vfs_getdents(req->file, gd->dirent, gd->count, &getdents_flags); out: if (ret == -EAGAIN && (issue_flags & IO_URING_F_NONBLOCK)) return -EAGAIN; - io_req_set_res(req, ret, 0); + if (getdents_flags & DIR_CONTEXT_F_EOD) + cqe_flags |= IORING_CQE_F_EOF; + + io_req_set_res(req, ret, cqe_flags); return 0; }
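For anyone wanting to reason about the RFC from the userspace side, the flag handling above could be consumed roughly as sketched below. This is only an illustration under stated assumptions: the three constants are copied from this patch and are not a merged ABI, cqe_flags_from_dir_flags() merely mirrors the translation done at the end of io_getdents(), and need_another_getdents() and example_checks() are hypothetical helper names that exist in neither the kernel nor liburing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the RFC patch; assumed here, not a merged kernel ABI. */
#define DIR_CONTEXT_F_NOWAIT 0x1UL
#define DIR_CONTEXT_F_EOD    0x2UL
#define IORING_CQE_F_EOF     (1U << 4)

/* Mirrors the dir_context -> CQE flag translation at the end of io_getdents(). */
static uint32_t cqe_flags_from_dir_flags(unsigned long dir_flags)
{
	uint32_t cqe_flags = 0;

	if (dir_flags & DIR_CONTEXT_F_EOD)
		cqe_flags |= IORING_CQE_F_EOF;
	return cqe_flags;
}

/*
 * Hypothetical consumer-side helper: given the result and flags of one
 * getdents completion, decide whether another request should be queued.
 */
static bool need_another_getdents(int32_t res, uint32_t cqe_flags)
{
	if (res < 0)
		return false;	/* error: leave the decision to the caller */
	if (res == 0)
		return false;	/* empty read: the classic end-of-directory signal */
	/* Entries were returned; stop early only if the fs reported EOF. */
	return !(cqe_flags & IORING_CQE_F_EOF);
}

/* Small usage example exercising both helpers. */
static void example_checks(void)
{
	assert(cqe_flags_from_dir_flags(DIR_CONTEXT_F_NOWAIT) == 0);
	assert(cqe_flags_from_dir_flags(DIR_CONTEXT_F_EOD) == IORING_CQE_F_EOF);
	assert(need_another_getdents(128, 0));
	assert(!need_another_getdents(128, IORING_CQE_F_EOF));
	assert(!need_another_getdents(0, 0));
}
```

With a filesystem that never sets the flag, the helper degrades to the existing behaviour: the caller issues one trailing request and stops on the empty read, so the flag only ever saves that last round trip.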