From patchwork Sun Aug 27 13:28:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xu X-Patchwork-Id: 13367150 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 14463C83F1B for ; Sun, 27 Aug 2023 13:31:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230298AbjH0NbE (ORCPT ); Sun, 27 Aug 2023 09:31:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230255AbjH0Nah (ORCPT ); Sun, 27 Aug 2023 09:30:37 -0400 Received: from out-248.mta1.migadu.com (out-248.mta1.migadu.com [95.215.58.248]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E4CCA132; Sun, 27 Aug 2023 06:30:33 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143032; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=M7eywqhb1fMAXTeyjh6BJsPONBLehGvo8wJ7nQSmBJs=; b=khOm1xqjzQOlsKc47XQ2wnLOjuLxGwc+XfDxixj6hJjWBxivRVL9fw6eYA8IJoeNHelD6C SUgrSKu6uB34cMFyQmJmkAayvfASH0HpKYFjusAGKkd85NiwXc4z0BEmZwpbN0iGNr9iqa ll1Q4czDD51HD91ihhmFa2NNXU3DhMY= From: Hao Xu To: io-uring@vger.kernel.org, Jens Axboe Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . 
Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 01/11] fs: split off vfs_getdents function of getdents64 syscall
Date: Sun, 27 Aug 2023 21:28:25 +0800
Message-Id: <20230827132835.1373581-2-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Dominique Martinet

This splits off the vfs_getdents function from the getdents64 system
call. This will allow io_uring to call the vfs_getdents function.

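For readers less familiar with the syscall being split here, the following
is a minimal userspace sketch of the contract that getdents64 exposes and
that vfs_getdents() keeps: the return value is the number of bytes written
into the buffer, 0 at end of directory, negative on error. It assumes Linux
and a libc that provides syscall(2) and SYS_getdents64. The names
dirent64_rec and count_dirents() are illustrative only and are not part of
this patch.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Userspace mirror of the kernel's struct linux_dirent64 layout. */
struct dirent64_rec {
	uint64_t       d_ino;
	int64_t        d_off;
	unsigned short d_reclen;
	unsigned char  d_type;
	char           d_name[];
};

/* Count the entries of a directory with raw getdents64; -1 on error. */
long count_dirents(const char *path)
{
	char buf[4096];
	long total = 0;
	int fd = open(path, O_RDONLY | O_DIRECTORY);

	if (fd < 0)
		return -1;

	for (;;) {
		/* Returns bytes filled in, 0 at end of directory, -1 on error. */
		long n = syscall(SYS_getdents64, fd, buf, sizeof(buf));
		long off = 0;

		if (n < 0) {
			close(fd);
			return -1;
		}
		if (n == 0)
			break;

		while (off < n) {
			unsigned short reclen;

			/* memcpy avoids assuming that buf is suitably aligned. */
			memcpy(&reclen,
			       buf + off + offsetof(struct dirent64_rec, d_reclen),
			       sizeof(reclen));
			assert(reclen > 0);	/* a zero record would loop forever */
			total++;
			off += reclen;
		}
	}
	close(fd);
	return total;
}
```

A caller would typically invoke count_dirents("/some/dir") and treat -1 as
"could not open or read the directory".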
Co-developed-by: Stefan Roesch Signed-off-by: Stefan Roesch Signed-off-by: Dominique Martinet Signed-off-by: Hao Xu --- fs/internal.h | 8 ++++++++ fs/readdir.c | 34 ++++++++++++++++++++++++++-------- 2 files changed, 34 insertions(+), 8 deletions(-) diff --git a/fs/internal.h b/fs/internal.h index f7a3dc111026..b1f66e52d61b 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -304,3 +304,11 @@ ssize_t __kernel_write_iter(struct file *file, struct iov_iter *from, loff_t *po struct mnt_idmap *alloc_mnt_idmap(struct user_namespace *mnt_userns); struct mnt_idmap *mnt_idmap_get(struct mnt_idmap *idmap); void mnt_idmap_put(struct mnt_idmap *idmap); + +/* + * fs/readdir.c + */ +struct linux_dirent64; + +int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent, + unsigned int count); diff --git a/fs/readdir.c b/fs/readdir.c index b264ce60114d..9592259b7e7f 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -21,6 +21,7 @@ #include #include #include +#include "internal.h" #include @@ -351,10 +352,16 @@ static bool filldir64(struct dir_context *ctx, const char *name, int namlen, return false; } -SYSCALL_DEFINE3(getdents64, unsigned int, fd, - struct linux_dirent64 __user *, dirent, unsigned int, count) + +/** + * vfs_getdents - getdents without fdget + * @file : pointer to file struct of directory + * @dirent : pointer to user directory structure + * @count : size of buffer + */ +int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent, + unsigned int count) { - struct fd f; struct getdents_callback64 buf = { .ctx.actor = filldir64, .count = count, @@ -362,11 +369,7 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd, }; int error; - f = fdget_pos(fd); - if (!f.file) - return -EBADF; - - error = iterate_dir(f.file, &buf.ctx); + error = iterate_dir(file, &buf.ctx); if (error >= 0) error = buf.error; if (buf.prev_reclen) { @@ -379,6 +382,21 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd, else error = count - buf.count; } + return error; +} + 
+SYSCALL_DEFINE3(getdents64, unsigned int, fd, + struct linux_dirent64 __user *, dirent, unsigned int, count) +{ + struct fd f; + int error; + + f = fdget_pos(fd); + if (!f.file) + return -EBADF; + + error = vfs_getdents(f.file, dirent, count); + fdput_pos(f); return error; } From patchwork Sun Aug 27 13:28:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xu X-Patchwork-Id: 13367151 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43EC9C83F1E for ; Sun, 27 Aug 2023 13:32:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229986AbjH0Nbf (ORCPT ); Sun, 27 Aug 2023 09:31:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36150 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230268AbjH0NbD (ORCPT ); Sun, 27 Aug 2023 09:31:03 -0400 Received: from out-248.mta1.migadu.com (out-248.mta1.migadu.com [95.215.58.248]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0AA68DE for ; Sun, 27 Aug 2023 06:30:57 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143055; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hTZJGzowdwZotMVwWddSwVfC1FzrP/Lkz2Y0ayQQQXA=; b=L3dZ2wnhLl3MpFT8Fx3GtXFd2RCwphC3uDYko8Q0nKz7ONNiGRoAamaCC5cUw2SONEU+W+ FNti/TRpT0IYzrVdjkthri+O6/3aV24bEu1xrJpC4BygFoLKHcj9ftj4kgaNqPcbLz1Ibc FihN2zv2jXx86v+rCxyuW2cfhV75bQE=
From: Hao Xu
To: io-uring@vger.kernel.org, Jens Axboe
Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 02/11] xfs: add NOWAIT semantics for readdir
Date: Sun, 27 Aug 2023 21:28:26 +0800
Message-Id: <20230827132835.1373581-3-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Hao Xu

Implement NOWAIT semantics for readdir. Return -EAGAIN to the caller if
the operation would block, for example when a lock cannot be taken
immediately or when IO would have to be issued.

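The NOWAIT behaviour described above, including the rule in this patch that
-EAGAIN is suppressed once some entries have already been emitted, can be
sketched in plain userspace C. This is a simplified model under stated
assumptions, not the XFS code: dir_block, readdir_sketch() and the "cached"
flag are hypothetical stand-ins for directory buffers that are or are not
in memory, and the sketch returns an entry count where the kernel path
returns 0 and lets the VFS compute the byte count.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: one directory block, either in memory or not. */
struct dir_block {
	bool cached;		/* false: reading it would require IO */
	int  nr_entries;	/* entries stored in this block */
};

/*
 * Emit entries starting at block *pos and advance *pos past every block
 * that was consumed. Returns the number of entries emitted, or -EAGAIN
 * when nowait is set and nothing could be emitted without blocking.
 */
int readdir_sketch(const struct dir_block *blks, size_t nblks,
		   size_t *pos, bool nowait)
{
	int written = 0;

	assert(blks != NULL && pos != NULL);

	while (*pos < nblks) {
		const struct dir_block *b = &blks[*pos];

		if (!b->cached && nowait) {
			/* Partial progress takes precedence over -EAGAIN. */
			return written > 0 ? written : -EAGAIN;
		}
		/* Blocking callers would wait for the IO at this point. */
		written += b->nr_entries;
		(*pos)++;
	}
	return written;
}
```

A caller that receives -EAGAIN is expected to retry from a context that is
allowed to block, passing nowait = false with the same position.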
Co-developed-by: Dave Chinner Signed-off-by: Dave Chinner Signed-off-by: Hao Xu [fixes deadlock issue, tweak code style] --- fs/xfs/libxfs/xfs_da_btree.c | 16 +++++++++++ fs/xfs/libxfs/xfs_da_btree.h | 1 + fs/xfs/libxfs/xfs_dir2_block.c | 7 ++--- fs/xfs/libxfs/xfs_dir2_priv.h | 2 +- fs/xfs/scrub/dir.c | 2 +- fs/xfs/scrub/readdir.c | 2 +- fs/xfs/xfs_dir2_readdir.c | 49 ++++++++++++++++++++++++++-------- fs/xfs/xfs_inode.c | 27 +++++++++++++++++++ fs/xfs/xfs_inode.h | 17 +++++++----- 9 files changed, 99 insertions(+), 24 deletions(-) diff --git a/fs/xfs/libxfs/xfs_da_btree.c b/fs/xfs/libxfs/xfs_da_btree.c index e576560b46e9..7a1a0af24197 100644 --- a/fs/xfs/libxfs/xfs_da_btree.c +++ b/fs/xfs/libxfs/xfs_da_btree.c @@ -2643,16 +2643,32 @@ xfs_da_read_buf( struct xfs_buf_map map, *mapp = &map; int nmap = 1; int error; + int buf_flags = 0; *bpp = NULL; error = xfs_dabuf_map(dp, bno, flags, whichfork, &mapp, &nmap); if (error || !nmap) goto out_free; + /* + * NOWAIT semantics mean we don't wait on the buffer lock nor do we + * issue IO for this buffer if it is not already in memory. Caller will + * retry. This will return -EAGAIN if the buffer is in memory and cannot + * be locked, and no buffer and no error if it isn't in memory. We + * translate both of those into a return state of -EAGAIN and *bpp = + * NULL. 
+ */ + if (flags & XFS_DABUF_NOWAIT) + buf_flags |= XBF_TRYLOCK | XBF_INCORE; error = xfs_trans_read_buf_map(mp, tp, mp->m_ddev_targp, mapp, nmap, 0, &bp, ops); if (error) goto out_free; + if (!bp) { + ASSERT(flags & XFS_DABUF_NOWAIT); + error = -EAGAIN; + goto out_free; + } if (whichfork == XFS_ATTR_FORK) xfs_buf_set_ref(bp, XFS_ATTR_BTREE_REF); diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h index ffa3df5b2893..32e7b1cca402 100644 --- a/fs/xfs/libxfs/xfs_da_btree.h +++ b/fs/xfs/libxfs/xfs_da_btree.h @@ -205,6 +205,7 @@ int xfs_da3_node_read_mapped(struct xfs_trans *tp, struct xfs_inode *dp, */ #define XFS_DABUF_MAP_HOLE_OK (1u << 0) +#define XFS_DABUF_NOWAIT (1u << 1) int xfs_da_grow_inode(xfs_da_args_t *args, xfs_dablk_t *new_blkno); int xfs_da_grow_inode_int(struct xfs_da_args *args, xfs_fileoff_t *bno, diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c index 00f960a703b2..59b24a594add 100644 --- a/fs/xfs/libxfs/xfs_dir2_block.c +++ b/fs/xfs/libxfs/xfs_dir2_block.c @@ -135,13 +135,14 @@ int xfs_dir3_block_read( struct xfs_trans *tp, struct xfs_inode *dp, + unsigned int flags, struct xfs_buf **bpp) { struct xfs_mount *mp = dp->i_mount; xfs_failaddr_t fa; int err; - err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, 0, bpp, + err = xfs_da_read_buf(tp, dp, mp->m_dir_geo->datablk, flags, bpp, XFS_DATA_FORK, &xfs_dir3_block_buf_ops); if (err || !*bpp) return err; @@ -380,7 +381,7 @@ xfs_dir2_block_addname( tp = args->trans; /* Read the (one and only) directory block into bp. 
*/ - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; @@ -695,7 +696,7 @@ xfs_dir2_block_lookup_int( dp = args->dp; tp = args->trans; - error = xfs_dir3_block_read(tp, dp, &bp); + error = xfs_dir3_block_read(tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/libxfs/xfs_dir2_priv.h b/fs/xfs/libxfs/xfs_dir2_priv.h index 7404a9ff1a92..7d4cf8a0f15b 100644 --- a/fs/xfs/libxfs/xfs_dir2_priv.h +++ b/fs/xfs/libxfs/xfs_dir2_priv.h @@ -51,7 +51,7 @@ extern int xfs_dir_cilookup_result(struct xfs_da_args *args, /* xfs_dir2_block.c */ extern int xfs_dir3_block_read(struct xfs_trans *tp, struct xfs_inode *dp, - struct xfs_buf **bpp); + unsigned int flags, struct xfs_buf **bpp); extern int xfs_dir2_block_addname(struct xfs_da_args *args); extern int xfs_dir2_block_lookup(struct xfs_da_args *args); extern int xfs_dir2_block_removename(struct xfs_da_args *args); diff --git a/fs/xfs/scrub/dir.c b/fs/xfs/scrub/dir.c index 0b491784b759..5cc51f201bd7 100644 --- a/fs/xfs/scrub/dir.c +++ b/fs/xfs/scrub/dir.c @@ -313,7 +313,7 @@ xchk_directory_data_bestfree( /* dir block format */ if (lblk != XFS_B_TO_FSBT(mp, XFS_DIR2_DATA_OFFSET)) xchk_fblock_set_corrupt(sc, XFS_DATA_FORK, lblk); - error = xfs_dir3_block_read(sc->tp, sc->ip, &bp); + error = xfs_dir3_block_read(sc->tp, sc->ip, 0, &bp); } else { /* dir data format */ error = xfs_dir3_data_read(sc->tp, sc->ip, lblk, 0, &bp); diff --git a/fs/xfs/scrub/readdir.c b/fs/xfs/scrub/readdir.c index e51c1544be63..f0a727311632 100644 --- a/fs/xfs/scrub/readdir.c +++ b/fs/xfs/scrub/readdir.c @@ -101,7 +101,7 @@ xchk_dir_walk_block( unsigned int off, next_off, end; int error; - error = xfs_dir3_block_read(sc->tp, dp, &bp); + error = xfs_dir3_block_read(sc->tp, dp, 0, &bp); if (error) return error; diff --git a/fs/xfs/xfs_dir2_readdir.c b/fs/xfs/xfs_dir2_readdir.c index 9f3ceb461515..dcdbd26e0402 100644 --- a/fs/xfs/xfs_dir2_readdir.c +++ b/fs/xfs/xfs_dir2_readdir.c @@ -149,6 
+149,7 @@ xfs_dir2_block_getdents( struct xfs_da_geometry *geo = args->geo; unsigned int offset, next_offset; unsigned int end; + unsigned int flags = 0; /* * If the block number in the offset is out of range, we're done. @@ -156,7 +157,9 @@ xfs_dir2_block_getdents( if (xfs_dir2_dataptr_to_db(geo, ctx->pos) > geo->datablk) return 0; - error = xfs_dir3_block_read(args->trans, dp, &bp); + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) + flags |= XFS_DABUF_NOWAIT; + error = xfs_dir3_block_read(args->trans, dp, flags, &bp); if (error) return error; @@ -240,6 +243,7 @@ xfs_dir2_block_getdents( STATIC int xfs_dir2_leaf_readbuf( struct xfs_da_args *args, + struct dir_context *ctx, size_t bufsize, xfs_dir2_off_t *cur_off, xfs_dablk_t *ra_blk, @@ -258,10 +262,15 @@ xfs_dir2_leaf_readbuf( struct xfs_iext_cursor icur; int ra_want; int error = 0; - - error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); - if (error) - goto out; + unsigned int flags = 0; + + if (ctx->flags & DIR_CONTEXT_F_NOWAIT) { + flags |= XFS_DABUF_NOWAIT; + } else { + error = xfs_iread_extents(args->trans, dp, XFS_DATA_FORK); + if (error) + goto out; + } /* * Look for mapped directory blocks at or above the current offset. 
@@ -280,7 +289,7 @@ xfs_dir2_leaf_readbuf( new_off = xfs_dir2_da_to_byte(geo, map.br_startoff); if (new_off > *cur_off) *cur_off = new_off; - error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, 0, &bp); + error = xfs_dir3_data_read(args->trans, dp, map.br_startoff, flags, &bp); if (error) goto out; @@ -360,6 +369,7 @@ xfs_dir2_leaf_getdents( int byteoff; /* offset in current block */ unsigned int offset = 0; int error = 0; /* error return value */ + int written = 0; /* * If the offset is at or past the largest allowed value, @@ -391,10 +401,17 @@ xfs_dir2_leaf_getdents( bp = NULL; } - if (*lock_mode == 0) - *lock_mode = xfs_ilock_data_map_shared(dp); - error = xfs_dir2_leaf_readbuf(args, bufsize, &curoff, - &rablk, &bp); + if (*lock_mode == 0) { + *lock_mode = + xfs_ilock_data_map_shared_generic(dp, + ctx->flags & DIR_CONTEXT_F_NOWAIT); + if (!*lock_mode) { + error = -EAGAIN; + break; + } + } + error = xfs_dir2_leaf_readbuf(args, ctx, bufsize, + &curoff, &rablk, &bp); if (error || !bp) break; @@ -479,6 +496,7 @@ xfs_dir2_leaf_getdents( */ offset += length; curoff += length; + written += length; /* bufsize may have just been a guess; don't go negative */ bufsize = bufsize > length ? 
bufsize - length : 0; } @@ -492,6 +510,8 @@ xfs_dir2_leaf_getdents( ctx->pos = xfs_dir2_byte_to_dataptr(curoff) & 0x7fffffff; if (bp) xfs_trans_brelse(args->trans, bp); + if (error == -EAGAIN && written > 0) + error = 0; return error; } @@ -514,6 +534,7 @@ xfs_readdir( unsigned int lock_mode; bool isblock; int error; + bool nowait; trace_xfs_readdir(dp); @@ -531,7 +552,11 @@ xfs_readdir( if (dp->i_df.if_format == XFS_DINODE_FMT_LOCAL) return xfs_dir2_sf_getdents(&args, ctx); - lock_mode = xfs_ilock_data_map_shared(dp); + nowait = ctx->flags & DIR_CONTEXT_F_NOWAIT; + lock_mode = xfs_ilock_data_map_shared_generic(dp, nowait); + if (!lock_mode) + return -EAGAIN; + error = xfs_dir2_isblock(&args, &isblock); if (error) goto out_unlock; @@ -546,5 +571,7 @@ xfs_readdir( out_unlock: if (lock_mode) xfs_iunlock(dp, lock_mode); + if (error == -EAGAIN) + ASSERT(nowait); return error; } diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index 9e62cc500140..d088f7d0c23a 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -120,6 +120,33 @@ xfs_ilock_data_map_shared( return lock_mode; } +/* + * Similar to xfs_ilock_data_map_shared(), except that it will only try to lock + * the inode in shared mode if the extents are already in memory. If it fails to + * get the lock or has to do IO to read the extent list, fail the operation by + * returning 0 as the lock mode. 
+ */ +uint +xfs_ilock_data_map_shared_nowait( + struct xfs_inode *ip) +{ + if (xfs_need_iread_extents(&ip->i_df)) + return 0; + if (!xfs_ilock_nowait(ip, XFS_ILOCK_SHARED)) + return 0; + return XFS_ILOCK_SHARED; +} + +int +xfs_ilock_data_map_shared_generic( + struct xfs_inode *dp, + bool nowait) +{ + if (nowait) + return xfs_ilock_data_map_shared_nowait(dp); + return xfs_ilock_data_map_shared(dp); +} + uint xfs_ilock_attr_map_shared( struct xfs_inode *ip) diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h index 7547caf2f2ab..ea206a5a27df 100644 --- a/fs/xfs/xfs_inode.h +++ b/fs/xfs/xfs_inode.h @@ -490,13 +490,16 @@ int xfs_rename(struct mnt_idmap *idmap, struct xfs_name *target_name, struct xfs_inode *target_ip, unsigned int flags); -void xfs_ilock(xfs_inode_t *, uint); -int xfs_ilock_nowait(xfs_inode_t *, uint); -void xfs_iunlock(xfs_inode_t *, uint); -void xfs_ilock_demote(xfs_inode_t *, uint); -bool xfs_isilocked(struct xfs_inode *, uint); -uint xfs_ilock_data_map_shared(struct xfs_inode *); -uint xfs_ilock_attr_map_shared(struct xfs_inode *); +void xfs_ilock(struct xfs_inode *ip, uint lockmode); +int xfs_ilock_nowait(struct xfs_inode *ip, uint lockmode); +void xfs_iunlock(struct xfs_inode *ip, uint lockmode); +void xfs_ilock_demote(struct xfs_inode *ip, uint lockmode); +bool xfs_isilocked(struct xfs_inode *ip, uint lockmode); +uint xfs_ilock_data_map_shared(struct xfs_inode *ip); +uint xfs_ilock_data_map_shared_nowait(struct xfs_inode *ip); +int xfs_ilock_data_map_shared_generic(struct xfs_inode *ip, + bool nowait); +uint xfs_ilock_attr_map_shared(struct xfs_inode *ip); uint xfs_ip2xflags(struct xfs_inode *); int xfs_ifree(struct xfs_trans *, struct xfs_inode *); From patchwork Sun Aug 27 13:28:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xu X-Patchwork-Id: 13367152 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 730B0C83F18 for ; Sun, 27 Aug 2023 13:33:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230297AbjH0Nch (ORCPT ); Sun, 27 Aug 2023 09:32:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45452 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230378AbjH0NcV (ORCPT ); Sun, 27 Aug 2023 09:32:21 -0400 Received: from out-242.mta1.migadu.com (out-242.mta1.migadu.com [95.215.58.242]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90A7C195; Sun, 27 Aug 2023 06:32:16 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143134; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=e5tdpTadoLYdZJB8cQ8K1ymkQcAcLx9XJrtCjJGJaVs=; b=w2Ynrlkl0lrPBgOEZvZgZsAroI/SroVLk9P1ruoBqvhkruN+yY2MpcktHEbfNgoWJ1BBu4 4ubUBExoqQaaP8jDC6A2V4vWqXAbneLZ+BhAJSfP0lzTzlCZEFFPuxh0PF54TbtDSMupEM g0UMdnFp1J3UGmAIMamiGdfpc/ghYvs= From: Hao Xu To: io-uring@vger.kernel.org, Jens Axboe Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . 
Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 03/11] vfs: add nowait flag for struct dir_context
Date: Sun, 27 Aug 2023 21:28:27 +0800
Message-Id: <20230827132835.1373581-4-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Hao Xu

The flags will allow passing DIR_CONTEXT_F_NOWAIT to iterate()
implementations that support it (as signaled through FMODE_NOWAIT in
file->f_mode).

Notes:
- considered using IOCB_NOWAIT, but if we add more flags later it would
  be confusing to keep track of which values are valid, so use dedicated
  flags
- might want to check in iterate_dir() that ctx.flags &
  DIR_CONTEXT_F_NOWAIT is only set when file->f_mode & FMODE_NOWAIT,
  e.g. as a WARN_ONCE?

Co-developed-by: Dominique Martinet Signed-off-by: Dominique Martinet Signed-off-by: Hao Xu --- fs/internal.h | 2 +- fs/readdir.c | 6 ++++-- include/linux/fs.h | 8 ++++++++ 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/fs/internal.h b/fs/internal.h index b1f66e52d61b..7508d485c655 100644 --- a/fs/internal.h +++ b/fs/internal.h @@ -311,4 +311,4 @@ void mnt_idmap_put(struct mnt_idmap *idmap); struct linux_dirent64; int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent, - unsigned int count); + unsigned int count, unsigned long flags); diff --git a/fs/readdir.c b/fs/readdir.c index 9592259b7e7f..b80caf4c9321 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -358,12 +358,14 @@ static bool filldir64(struct dir_context *ctx, const char *name, int namlen, * @file : pointer to file struct of directory * @dirent : pointer to user directory structure * @count : size of buffer + * @flags : additional dir_context flags */ int vfs_getdents(struct file *file, struct linux_dirent64 __user *dirent, - unsigned int count) + unsigned int count, unsigned long flags) { struct getdents_callback64 buf = { .ctx.actor = filldir64, + .ctx.flags = flags, .count = count, .current_dir = dirent }; @@ -395,7 +397,7 @@ SYSCALL_DEFINE3(getdents64, unsigned int, fd, if (!f.file) return -EBADF; - error = vfs_getdents(f.file, dirent, count); + error = vfs_getdents(f.file, dirent, count, 0); fdput_pos(f); return error; diff --git a/include/linux/fs.h b/include/linux/fs.h index 6867512907d6..f3e315e8efdd 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1719,8 +1719,16 @@ typedef bool (*filldir_t)(struct dir_context *, const char *, int, loff_t, u64, struct dir_context { filldir_t actor; loff_t pos; + unsigned long flags; }; +/* + * flags for dir_context flags + * DIR_CONTEXT_F_NOWAIT: Request non-blocking iterate + * (requires file->f_mode & FMODE_NOWAIT) + */ +#define DIR_CONTEXT_F_NOWAIT (1 << 0) + /* * These flags let !MMU mmap() govern direct device 
mapping vs immediate * copying more easily for MAP_PRIVATE, especially for ROM filesystems. From patchwork Sun Aug 27 13:28:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xu X-Patchwork-Id: 13367226 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39A7AC83F14 for ; Sun, 27 Aug 2023 13:33:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230335AbjH0NdH (ORCPT ); Sun, 27 Aug 2023 09:33:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230443AbjH0Nc4 (ORCPT ); Sun, 27 Aug 2023 09:32:56 -0400 Received: from out-248.mta1.migadu.com (out-248.mta1.migadu.com [95.215.58.248]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 985981A6; Sun, 27 Aug 2023 06:32:46 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143164; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=W04QOWrrBuMFm18D/+g+MI2uZNnKBIsbwxq2wO1dKKw=; b=Yw+5XvRr3vDIU02wPbTBy+P7Vb3r5vANDo9ahLXEToHmV4GE1+GjTI2/FiGvzS/72CcLGE wNMghjMXDTPCiQyUzrxQFpeftIBQyMH0jGDsayXuE4xMGusaZd0pBKhpckAeDFF1jY1yin IKCCEyk0xEU76DQXzGM7PUWjnMG6kt0= From: Hao Xu To: io-uring@vger.kernel.org, Jens Axboe Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . 
Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 04/11] vfs: add a vfs helper for io_uring file pos lock
Date: Sun, 27 Aug 2023 21:28:28 +0800
Message-Id: <20230827132835.1373581-5-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Hao Xu

Add a vfs helper file_pos_lock_nowait() for io_uring usage. The function
has conditional nowait logic, i.e. if nowait is needed, it returns
-EAGAIN when the trylock fails.

Signed-off-by: Hao Xu
---
 fs/file.c            | 13 +++++++++++++
 include/linux/file.h |  2 ++
 2 files changed, 15 insertions(+)

diff --git a/fs/file.c b/fs/file.c
index 35c62b54c9d6..8e5c38f5db52 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -1053,6 +1053,19 @@ void __f_unlock_pos(struct file *f)
 	mutex_unlock(&f->f_pos_lock);
 }
 
+int file_pos_lock_nowait(struct file *file, bool nowait)
+{
+	if (!(file->f_mode & FMODE_ATOMIC_POS))
+		return 0;
+
+	if (!nowait)
+		mutex_lock(&file->f_pos_lock);
+	else if (!mutex_trylock(&file->f_pos_lock))
+		return -EAGAIN;
+
+	return 1;
+}
+
 /*
  * We only lock f_pos if we have threads or if the file might be
  * shared with another process. 
In both cases we'll have an elevated diff --git a/include/linux/file.h b/include/linux/file.h index 6e9099d29343..bcc6ba0aec50 100644 --- a/include/linux/file.h +++ b/include/linux/file.h @@ -81,6 +81,8 @@ static inline void fdput_pos(struct fd f) fdput(f); } +extern int file_pos_lock_nowait(struct file *file, bool nowait); + DEFINE_CLASS(fd, struct fd, fdput(_T), fdget(fd), int fd) extern int f_dupfd(unsigned int from, struct file *file, unsigned flags); From patchwork Sun Aug 27 13:28:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xu X-Patchwork-Id: 13367227 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4AD00C83F16 for ; Sun, 27 Aug 2023 13:34:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230350AbjH0Ndl (ORCPT ); Sun, 27 Aug 2023 09:33:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230398AbjH0NdR (ORCPT ); Sun, 27 Aug 2023 09:33:17 -0400 Received: from out-250.mta1.migadu.com (out-250.mta1.migadu.com [95.215.58.250]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1B72CA; Sun, 27 Aug 2023 06:33:12 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143191; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=/w6lIGuCSX5M2SzS7SE+A2P7DaEhoFOZKnfxuRd6M0g=; b=nFii2RaAVtERxLk40QaUcpRmI8KsgMix1y0E+FJbochLiKUgvfCEHVlGKzl74Le3vQdKIX Ic8HKxNSkSYAWoFe69ooiMlQ7zn66y3ibRq5sWqNObX5lM32Ml0OMzA1739KowhZg9B8i9 pdpKOlIaZIJj3teGeY/OKxpoMDEzjMc=
From: Hao Xu
To: io-uring@vger.kernel.org, Jens Axboe
Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 05/11] vfs: add file_pos_unlock() for io_uring usage
Date: Sun, 27 Aug 2023 21:28:29 +0800
Message-Id: <20230827132835.1373581-6-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Hao Xu

Add a helper that unlocks f_pos_lock unconditionally. It is needed
because io_uring does not handle f_pos_lock through a struct fd, so
FDPUT_POS_UNLOCK is not used.

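Because the unlock helper is unconditional, the caller has to remember
whether the matching lock helper actually took the lock. A minimal
userspace sketch of that pairing follows, modelled on the return
convention of file_pos_lock_nowait() from the previous patch (0: no lock
needed, 1: lock taken, -EAGAIN: contended in nowait mode). It is an
assumption-laden model, not kernel code: posfile_sketch and both functions
are hypothetical names, and a C11 atomic_flag with a spin loop stands in
for the sleeping mutex f_pos_lock.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two struct file fields involved. */
struct posfile_sketch {
	bool        atomic_pos;	/* stands in for FMODE_ATOMIC_POS */
	atomic_flag pos_lock;	/* stands in for f_pos_lock */
};

/* Returns 0 if no lock is needed, 1 if taken, -EAGAIN if it would block. */
int pos_lock_maybe_nowait(struct posfile_sketch *f, bool nowait)
{
	assert(f != NULL);

	if (!f->atomic_pos)
		return 0;

	if (nowait) {
		/* test_and_set returns the old value: true means contended. */
		if (atomic_flag_test_and_set(&f->pos_lock))
			return -EAGAIN;
		return 1;
	}

	while (atomic_flag_test_and_set(&f->pos_lock))
		;	/* spin here; the kernel would sleep on a mutex */
	return 1;
}

/* Unconditional unlock: only call it after the lock call returned 1. */
void pos_unlock(struct posfile_sketch *f)
{
	assert(f != NULL);
	atomic_flag_clear(&f->pos_lock);
}
```

The design point is that the "did I lock?" state lives in the caller (here
the return value of the lock call) rather than in a flag carried by a
struct fd.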
Signed-off-by: Hao Xu --- include/linux/file.h | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/include/linux/file.h b/include/linux/file.h index bcc6ba0aec50..a179f4794341 100644 --- a/include/linux/file.h +++ b/include/linux/file.h @@ -81,6 +81,11 @@ static inline void fdput_pos(struct fd f) fdput(f); } +static inline void file_pos_unlock(struct file *file) +{ + __f_unlock_pos(file); +} + extern int file_pos_lock_nowait(struct file *file, bool nowait); DEFINE_CLASS(fd, struct fd, fdput(_T), fdget(fd), int fd) From patchwork Sun Aug 27 13:28:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xu X-Patchwork-Id: 13367228 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F0B9CC83F18 for ; Sun, 27 Aug 2023 13:34:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230395AbjH0NeM (ORCPT ); Sun, 27 Aug 2023 09:34:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34588 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230426AbjH0Ndu (ORCPT ); Sun, 27 Aug 2023 09:33:50 -0400 Received: from out-243.mta1.migadu.com (out-243.mta1.migadu.com [95.215.58.243]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 286B719B; Sun, 27 Aug 2023 06:33:44 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
From: Hao Xu
To: io-uring@vger.kernel.org, Jens Axboe
Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner, "Darrick J . Wong", linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 06/11] vfs: add a nowait parameter for touch_atime()
Date: Sun, 27 Aug 2023 21:28:30 +0800
Message-Id: <20230827132835.1373581-7-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>

From: Hao Xu

Add a nowait boolean parameter for touch_atime() to support nowait semantics. It is true only when io_uring is the initial caller.
Signed-off-by: Hao Xu --- fs/cachefiles/namei.c | 2 +- fs/ecryptfs/file.c | 4 ++-- fs/inode.c | 7 ++++--- fs/namei.c | 4 ++-- fs/nfsd/vfs.c | 2 +- fs/overlayfs/file.c | 2 +- fs/overlayfs/inode.c | 2 +- fs/stat.c | 2 +- include/linux/fs.h | 4 ++-- kernel/bpf/inode.c | 4 ++-- net/unix/af_unix.c | 4 ++-- 11 files changed, 19 insertions(+), 18 deletions(-) diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index d9d22d0ec38a..7a21bf0e36b8 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -591,7 +591,7 @@ static bool cachefiles_open_file(struct cachefiles_object *object, * used to keep track of culling, and atimes are only updated by read, * write and readdir but not lookup or open). */ - touch_atime(&file->f_path); + touch_atime(&file->f_path, false); dput(dentry); return true; diff --git a/fs/ecryptfs/file.c b/fs/ecryptfs/file.c index ce0a3c5ed0ca..3db7006cc440 100644 --- a/fs/ecryptfs/file.c +++ b/fs/ecryptfs/file.c @@ -39,7 +39,7 @@ static ssize_t ecryptfs_read_update_atime(struct kiocb *iocb, rc = generic_file_read_iter(iocb, to); if (rc >= 0) { path = ecryptfs_dentry_to_lower_path(file->f_path.dentry); - touch_atime(path); + touch_atime(path, false); } return rc; } @@ -64,7 +64,7 @@ static ssize_t ecryptfs_splice_read_update_atime(struct file *in, loff_t *ppos, rc = filemap_splice_read(in, ppos, pipe, len, flags); if (rc >= 0) { path = ecryptfs_dentry_to_lower_path(in->f_path.dentry); - touch_atime(path); + touch_atime(path, false); } return rc; } diff --git a/fs/inode.c b/fs/inode.c index 8fefb69e1f84..e83b836f2d09 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -1961,17 +1961,17 @@ bool atime_needs_update(const struct path *path, struct inode *inode) return true; } -void touch_atime(const struct path *path) +int touch_atime(const struct path *path, bool nowait) { struct vfsmount *mnt = path->mnt; struct inode *inode = d_inode(path->dentry); struct timespec64 now; if (!atime_needs_update(path, inode)) - return; + return 0; if 
(!sb_start_write_trylock(inode->i_sb)) - return; + return 0; if (__mnt_want_write(mnt) != 0) goto skip_update; @@ -1989,6 +1989,7 @@ void touch_atime(const struct path *path) __mnt_drop_write(mnt); skip_update: sb_end_write(inode->i_sb); + return 0; } EXPORT_SYMBOL(touch_atime); diff --git a/fs/namei.c b/fs/namei.c index e56ff39a79bc..35731d405730 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -1776,12 +1776,12 @@ static const char *pick_link(struct nameidata *nd, struct path *link, return ERR_PTR(-ELOOP); if (!(nd->flags & LOOKUP_RCU)) { - touch_atime(&last->link); + touch_atime(&last->link, false); cond_resched(); } else if (atime_needs_update(&last->link, inode)) { if (!try_to_unlazy(nd)) return ERR_PTR(-ECHILD); - touch_atime(&last->link); + touch_atime(&last->link, false); } error = security_inode_follow_link(link->dentry, inode, diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 8a2321d19194..3179e7b5d209 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1569,7 +1569,7 @@ nfsd_readlink(struct svc_rqst *rqstp, struct svc_fh *fhp, char *buf, int *lenp) if (unlikely(!d_is_symlink(path.dentry))) return nfserr_inval; - touch_atime(&path); + touch_atime(&path, false); link = vfs_get_link(path.dentry, &done); if (IS_ERR(link)) diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c index 21245b00722a..6ff466ef98ea 100644 --- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -255,7 +255,7 @@ static void ovl_file_accessed(struct file *file) inode->i_ctime = upperinode->i_ctime; } - touch_atime(&file->f_path); + touch_atime(&file->f_path, false); } static rwf_t ovl_iocb_to_rwf(int ifl) diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index a63e57447be9..66e03025e748 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -703,7 +703,7 @@ int ovl_update_time(struct inode *inode, struct timespec64 *ts, int flags) }; if (upperpath.dentry) { - touch_atime(&upperpath); + touch_atime(&upperpath, false); inode->i_atime = d_inode(upperpath.dentry)->i_atime; } } 
diff --git a/fs/stat.c b/fs/stat.c index 7c238da22ef0..713773e61110 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -485,7 +485,7 @@ static int do_readlinkat(int dfd, const char __user *pathname, if (d_is_symlink(path.dentry) || inode->i_op->readlink) { error = security_inode_readlink(path.dentry); if (!error) { - touch_atime(&path); + touch_atime(&path, false); error = vfs_readlink(path.dentry, buf, bufsiz); } } diff --git a/include/linux/fs.h b/include/linux/fs.h index f3e315e8efdd..ba54879089ac 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2201,13 +2201,13 @@ enum file_time_flags { }; extern bool atime_needs_update(const struct path *, struct inode *); -extern void touch_atime(const struct path *); +extern int touch_atime(const struct path *path, bool nowait); int inode_update_time(struct inode *inode, struct timespec64 *time, int flags); static inline void file_accessed(struct file *file) { if (!(file->f_flags & O_NOATIME)) - touch_atime(&file->f_path); + touch_atime(&file->f_path, false); } extern int file_modified(struct file *file); diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 4174f76133df..bc020b45d5c8 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -517,7 +517,7 @@ static void *bpf_obj_do_get(int path_fd, const char __user *pathname, raw = bpf_any_get(inode->i_private, *type); if (!IS_ERR(raw)) - touch_atime(&path); + touch_atime(&path, false); path_put(&path); return raw; @@ -591,7 +591,7 @@ struct bpf_prog *bpf_prog_get_type_path(const char *name, enum bpf_prog_type typ return ERR_PTR(ret); prog = __get_prog_inode(d_backing_inode(path.dentry), type); if (!IS_ERR(prog)) - touch_atime(&path); + touch_atime(&path, false); path_put(&path); return prog; } diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 123b35ddfd71..5868e4e47320 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1084,7 +1084,7 @@ static struct sock *unix_find_bsd(struct sockaddr_un *sunaddr, int addr_len, err = -EPROTOTYPE; if 
(sk->sk_type == type) - touch_atime(&path); + touch_atime(&path, false); else goto sock_put; @@ -1114,7 +1114,7 @@ static struct sock *unix_find_abstract(struct net *net, dentry = unix_sk(sk)->path.dentry; if (dentry) - touch_atime(&unix_sk(sk)->path); + touch_atime(&unix_sk(sk)->path, false); return sk; }

From patchwork Sun Aug 27 13:28:31 2023
X-Patchwork-Submitter: Hao Xu
X-Patchwork-Id: 13367229
From: Hao Xu
To: io-uring@vger.kernel.org, Jens Axboe
Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner, "Darrick J . Wong", linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 07/11] vfs: add nowait parameter for file_accessed()
Date: Sun, 27 Aug 2023 21:28:31 +0800
Message-Id: <20230827132835.1373581-8-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>

From: Hao Xu

Add a boolean parameter for file_accessed() to support nowait semantics. Currently it is true only with io_uring as its initial caller.
Signed-off-by: Hao Xu --- arch/s390/hypfs/inode.c | 2 +- block/fops.c | 2 +- fs/btrfs/file.c | 2 +- fs/btrfs/inode.c | 2 +- fs/coda/dir.c | 4 ++-- fs/ext2/file.c | 4 ++-- fs/ext4/file.c | 6 +++--- fs/f2fs/file.c | 4 ++-- fs/fuse/dax.c | 2 +- fs/fuse/file.c | 4 ++-- fs/gfs2/file.c | 2 +- fs/hugetlbfs/inode.c | 2 +- fs/nilfs2/file.c | 2 +- fs/orangefs/file.c | 2 +- fs/orangefs/inode.c | 2 +- fs/pipe.c | 2 +- fs/ramfs/file-nommu.c | 2 +- fs/readdir.c | 2 +- fs/smb/client/cifsfs.c | 2 +- fs/splice.c | 2 +- fs/ubifs/file.c | 2 +- fs/udf/file.c | 2 +- fs/xfs/xfs_file.c | 6 +++--- fs/zonefs/file.c | 4 ++-- include/linux/fs.h | 5 +++-- mm/filemap.c | 8 ++++---- mm/shmem.c | 6 +++--- 27 files changed, 43 insertions(+), 42 deletions(-) diff --git a/arch/s390/hypfs/inode.c b/arch/s390/hypfs/inode.c index ee919bfc8186..55f562027c4f 100644 --- a/arch/s390/hypfs/inode.c +++ b/arch/s390/hypfs/inode.c @@ -157,7 +157,7 @@ static ssize_t hypfs_read_iter(struct kiocb *iocb, struct iov_iter *to) if (!count) return -EFAULT; iocb->ki_pos = pos + count; - file_accessed(file); + file_accessed(file, false); return count; } diff --git a/block/fops.c b/block/fops.c index a286bf3325c5..546ecd3c8084 100644 --- a/block/fops.c +++ b/block/fops.c @@ -601,7 +601,7 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to) ret = kiocb_write_and_wait(iocb, count); if (ret < 0) goto reexpand; - file_accessed(iocb->ki_filp); + file_accessed(iocb->ki_filp, false); ret = blkdev_direct_IO(iocb, to); if (ret >= 0) { diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index fd03e689a6be..24c0bf3818a6 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2013,7 +2013,7 @@ static int btrfs_file_mmap(struct file *filp, struct vm_area_struct *vma) if (!mapping->a_ops->read_folio) return -ENOEXEC; - file_accessed(filp); + file_accessed(filp, false); vma->vm_ops = &btrfs_file_vm_ops; return 0; diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index dbbb67293e34..50e9ae8c388c 100644 --- 
a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -10153,7 +10153,7 @@ ssize_t btrfs_encoded_read(struct kiocb *iocb, struct iov_iter *iter, struct extent_map *em; bool unlocked = false; - file_accessed(iocb->ki_filp); + file_accessed(iocb->ki_filp, false); btrfs_inode_lock(inode, BTRFS_ILOCK_SHARED); diff --git a/fs/coda/dir.c b/fs/coda/dir.c index 8450b1bd354b..1d94c013ac88 100644 --- a/fs/coda/dir.c +++ b/fs/coda/dir.c @@ -436,12 +436,12 @@ static int coda_readdir(struct file *coda_file, struct dir_context *ctx) if (host_file->f_op->iterate_shared) { inode_lock_shared(host_inode); ret = host_file->f_op->iterate_shared(host_file, ctx); - file_accessed(host_file); + file_accessed(host_file, false); inode_unlock_shared(host_inode); } else { inode_lock(host_inode); ret = host_file->f_op->iterate(host_file, ctx); - file_accessed(host_file); + file_accessed(host_file, false); inode_unlock(host_inode); } } diff --git a/fs/ext2/file.c b/fs/ext2/file.c index 0b4c91c62e1f..dc059cae50a4 100644 --- a/fs/ext2/file.c +++ b/fs/ext2/file.c @@ -44,7 +44,7 @@ static ssize_t ext2_dax_read_iter(struct kiocb *iocb, struct iov_iter *to) ret = dax_iomap_rw(iocb, to, &ext2_iomap_ops); inode_unlock_shared(inode); - file_accessed(iocb->ki_filp); + file_accessed(iocb->ki_filp, false); return ret; } @@ -127,7 +127,7 @@ static int ext2_file_mmap(struct file *file, struct vm_area_struct *vma) if (!IS_DAX(file_inode(file))) return generic_file_mmap(file, vma); - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &ext2_dax_vm_ops; return 0; } diff --git a/fs/ext4/file.c b/fs/ext4/file.c index c457c8517f0f..2ab790a668a8 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -94,7 +94,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) ret = iomap_dio_rw(iocb, to, &ext4_iomap_ops, NULL, 0, NULL, 0); inode_unlock_shared(inode); - file_accessed(iocb->ki_filp); + file_accessed(iocb->ki_filp, false); return ret; } @@ -122,7 +122,7 @@ static ssize_t 
ext4_dax_read_iter(struct kiocb *iocb, struct iov_iter *to) ret = dax_iomap_rw(iocb, to, &ext4_iomap_ops); inode_unlock_shared(inode); - file_accessed(iocb->ki_filp); + file_accessed(iocb->ki_filp, false); return ret; } #endif @@ -820,7 +820,7 @@ static int ext4_file_mmap(struct file *file, struct vm_area_struct *vma) if (!daxdev_mapping_supported(vma, dax_dev)) return -EOPNOTSUPP; - file_accessed(file); + file_accessed(file, false); if (IS_DAX(file_inode(file))) { vma->vm_ops = &ext4_dax_vm_ops; vm_flags_set(vma, VM_HUGEPAGE); diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index 093039dee992..246e61d78f92 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -524,7 +524,7 @@ static int f2fs_file_mmap(struct file *file, struct vm_area_struct *vma) if (!f2fs_is_compress_backend_ready(inode)) return -EOPNOTSUPP; - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &f2fs_file_vm_ops; set_inode_flag(inode, FI_MMAP_FILE); return 0; @@ -4380,7 +4380,7 @@ static ssize_t f2fs_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) f2fs_up_read(&fi->i_gc_rwsem[READ]); - file_accessed(file); + file_accessed(file, false); out: trace_f2fs_direct_IO_exit(inode, pos, count, READ, ret); return ret; diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 8e74f278a3f6..8a43c37195dd 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -858,7 +858,7 @@ static const struct vm_operations_struct fuse_dax_vm_ops = { int fuse_dax_mmap(struct file *file, struct vm_area_struct *vma) { - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &fuse_dax_vm_ops; vm_flags_set(vma, VM_MIXEDMAP | VM_HUGEPAGE); return 0; diff --git a/fs/fuse/file.c b/fs/fuse/file.c index bc4115288eec..3c4cbc5e2de6 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -2496,7 +2496,7 @@ static int fuse_file_mmap(struct file *file, struct vm_area_struct *vma) if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE)) fuse_link_write_file(file); - file_accessed(file); + file_accessed(file, false); 
vma->vm_ops = &fuse_file_vm_ops; return 0; } @@ -3193,7 +3193,7 @@ static ssize_t __fuse_copy_file_range(struct file *file_in, loff_t pos_in, clear_bit(FUSE_I_SIZE_UNSTABLE, &fi_out->state); inode_unlock(inode_out); - file_accessed(file_in); + file_accessed(file_in, false); fuse_flush_time_update(inode_out); diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index 1bf3c4453516..3003be5b8266 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -601,7 +601,7 @@ static int gfs2_mmap(struct file *file, struct vm_area_struct *vma) return error; /* grab lock to update inode */ gfs2_glock_dq_uninit(&i_gh); - file_accessed(file); + file_accessed(file, false); } vma->vm_ops = &gfs2_vm_ops; diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 7b17ccfa039d..729f66346c3c 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -161,7 +161,7 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma) return -EINVAL; inode_lock(inode); - file_accessed(file); + file_accessed(file, false); ret = -ENOMEM; if (!hugetlb_reserve_pages(inode, diff --git a/fs/nilfs2/file.c b/fs/nilfs2/file.c index a9eb3487efb2..a857ebcf099c 100644 --- a/fs/nilfs2/file.c +++ b/fs/nilfs2/file.c @@ -119,7 +119,7 @@ static const struct vm_operations_struct nilfs_file_vm_ops = { static int nilfs_file_mmap(struct file *file, struct vm_area_struct *vma) { - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &nilfs_file_vm_ops; return 0; } diff --git a/fs/orangefs/file.c b/fs/orangefs/file.c index d68372241b30..5c7a17995fe1 100644 --- a/fs/orangefs/file.c +++ b/fs/orangefs/file.c @@ -412,7 +412,7 @@ static int orangefs_file_mmap(struct file *file, struct vm_area_struct *vma) /* set the sequential readahead hint */ vm_flags_mod(vma, VM_SEQ_READ, VM_RAND_READ); - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &orangefs_file_vm_ops; return 0; } diff --git a/fs/orangefs/inode.c b/fs/orangefs/inode.c index 9014bbcc8031..77d56703bb09 100644 --- 
a/fs/orangefs/inode.c +++ b/fs/orangefs/inode.c @@ -597,7 +597,7 @@ static ssize_t orangefs_direct_IO(struct kiocb *iocb, ret = total_count; if (ret > 0) { if (type == ORANGEFS_IO_READ) { - file_accessed(file); + file_accessed(file, false); } else { file_update_time(file); if (*offset > i_size_read(inode)) diff --git a/fs/pipe.c b/fs/pipe.c index 2d88f73f585a..ce1038d3de4b 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -393,7 +393,7 @@ pipe_read(struct kiocb *iocb, struct iov_iter *to) wake_up_interruptible_sync_poll(&pipe->rd_wait, EPOLLIN | EPOLLRDNORM); kill_fasync(&pipe->fasync_writers, SIGIO, POLL_OUT); if (ret > 0) - file_accessed(filp); + file_accessed(filp, false); return ret; } diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c index efb1b4c1a0a4..ad69f828f6ad 100644 --- a/fs/ramfs/file-nommu.c +++ b/fs/ramfs/file-nommu.c @@ -267,7 +267,7 @@ static int ramfs_nommu_mmap(struct file *file, struct vm_area_struct *vma) if (!is_nommu_shared_mapping(vma->vm_flags)) return -ENOSYS; - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &generic_file_vm_ops; return 0; } diff --git a/fs/readdir.c b/fs/readdir.c index b80caf4c9321..2f4c9c663a39 100644 --- a/fs/readdir.c +++ b/fs/readdir.c @@ -68,7 +68,7 @@ int iterate_dir(struct file *file, struct dir_context *ctx) res = file->f_op->iterate(file, ctx); file->f_pos = ctx->pos; fsnotify_access(file); - file_accessed(file); + file_accessed(file, ctx->flags & DIR_CONTEXT_F_NOWAIT); } if (shared) inode_unlock_shared(inode); diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c index a4d8b0ea1c8c..20156c5e83e6 100644 --- a/fs/smb/client/cifsfs.c +++ b/fs/smb/client/cifsfs.c @@ -1307,7 +1307,7 @@ ssize_t cifs_file_copychunk_range(unsigned int xid, rc = target_tcon->ses->server->ops->copychunk_range(xid, smb_file_src, smb_file_target, off, len, destoff); - file_accessed(src_file); + file_accessed(src_file, false); /* force revalidate of size and timestamps of target file now * that target is updated on 
the server diff --git a/fs/splice.c b/fs/splice.c index 004eb1c4ce31..e4dcfa1c0fef 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -1104,7 +1104,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd, done: pipe->tail = pipe->head = 0; - file_accessed(in); + file_accessed(in, false); return bytes; read_failure: diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c index 6738fe43040b..a27c73848571 100644 --- a/fs/ubifs/file.c +++ b/fs/ubifs/file.c @@ -1603,7 +1603,7 @@ static int ubifs_file_mmap(struct file *file, struct vm_area_struct *vma) vma->vm_ops = &ubifs_file_vm_ops; if (IS_ENABLED(CONFIG_UBIFS_ATIME_SUPPORT)) - file_accessed(file); + file_accessed(file, false); return 0; } diff --git a/fs/udf/file.c b/fs/udf/file.c index 243840dc83ad..46edf6e64632 100644 --- a/fs/udf/file.c +++ b/fs/udf/file.c @@ -191,7 +191,7 @@ static int udf_release_file(struct inode *inode, struct file *filp) static int udf_file_mmap(struct file *file, struct vm_area_struct *vma) { - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &udf_file_vm_ops; return 0; diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 4f502219ae4f..c72efdb9e43e 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -227,7 +227,7 @@ xfs_file_dio_read( if (!iov_iter_count(to)) return 0; /* skip atime */ - file_accessed(iocb->ki_filp); + file_accessed(iocb->ki_filp, false); ret = xfs_ilock_iocb(iocb, XFS_IOLOCK_SHARED); if (ret) @@ -257,7 +257,7 @@ xfs_file_dax_read( ret = dax_iomap_rw(iocb, to, &xfs_read_iomap_ops); xfs_iunlock(ip, XFS_IOLOCK_SHARED); - file_accessed(iocb->ki_filp); + file_accessed(iocb->ki_filp, false); return ret; } @@ -1434,7 +1434,7 @@ xfs_file_mmap( if (!daxdev_mapping_supported(vma, target->bt_daxdev)) return -EOPNOTSUPP; - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &xfs_file_vm_ops; if (IS_DAX(inode)) vm_flags_set(vma, VM_HUGEPAGE); diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c index 92c9aaae3663..664ebae181bd 100644 --- 
a/fs/zonefs/file.c +++ b/fs/zonefs/file.c @@ -323,7 +323,7 @@ static int zonefs_file_mmap(struct file *file, struct vm_area_struct *vma) (vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE)) return -EINVAL; - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &zonefs_file_vm_ops; return 0; @@ -736,7 +736,7 @@ static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) ret = -EINVAL; goto inode_unlock; } - file_accessed(iocb->ki_filp); + file_accessed(iocb->ki_filp, false); ret = iomap_dio_rw(iocb, to, &zonefs_read_iomap_ops, &zonefs_read_dio_ops, 0, NULL, 0); } else { diff --git a/include/linux/fs.h b/include/linux/fs.h index ba54879089ac..ed60b3d70d1e 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -2204,10 +2204,11 @@ extern bool atime_needs_update(const struct path *, struct inode *); extern int touch_atime(const struct path *path, bool nowait); int inode_update_time(struct inode *inode, struct timespec64 *time, int flags); -static inline void file_accessed(struct file *file) +static inline int file_accessed(struct file *file, bool nowait) { if (!(file->f_flags & O_NOATIME)) - touch_atime(&file->f_path, false); + return touch_atime(&file->f_path, nowait); + return 0; } extern int file_modified(struct file *file); diff --git a/mm/filemap.c b/mm/filemap.c index 9e44a49bbd74..1f2032f4fd10 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -2723,7 +2723,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter, folio_batch_init(&fbatch); } while (iov_iter_count(iter) && iocb->ki_pos < isize && !error); - file_accessed(filp); + file_accessed(filp, false); return already_read ? 
already_read : error; } @@ -2809,7 +2809,7 @@ generic_file_read_iter(struct kiocb *iocb, struct iov_iter *iter) retval = kiocb_write_and_wait(iocb, count); if (retval < 0) return retval; - file_accessed(file); + file_accessed(file, false); retval = mapping->a_ops->direct_IO(iocb, iter); if (retval >= 0) { @@ -2978,7 +2978,7 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos, out: folio_batch_release(&fbatch); - file_accessed(in); + file_accessed(in, false); return total_spliced ? total_spliced : error; } @@ -3613,7 +3613,7 @@ int generic_file_mmap(struct file *file, struct vm_area_struct *vma) if (!mapping->a_ops->read_folio) return -ENOEXEC; - file_accessed(file); + file_accessed(file, false); vma->vm_ops = &generic_file_vm_ops; return 0; } diff --git a/mm/shmem.c b/mm/shmem.c index 2f2e0e618072..440b23e2d9e1 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2317,7 +2317,7 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma) /* arm64 - allow memory tagging on RAM-based files */ vm_flags_set(vma, VM_MTE_ALLOWED); - file_accessed(file); + file_accessed(file, false); /* This is anonymous shared memory if it is unlinked at the time of mmap */ if (inode->i_nlink) vma->vm_ops = &shmem_vm_ops; @@ -2727,7 +2727,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to) } *ppos = ((loff_t) index << PAGE_SHIFT) + offset; - file_accessed(file); + file_accessed(file, false); return retval ? retval : error; } @@ -2859,7 +2859,7 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos, if (folio) folio_put(folio); - file_accessed(in); + file_accessed(in, false); return total_spliced ? 
total_spliced : error; }

From patchwork Sun Aug 27 13:28:32 2023
X-Patchwork-Submitter: Hao Xu
X-Patchwork-Id: 13367230
From: Hao Xu
To: io-uring@vger.kernel.org, Jens Axboe
Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner, "Darrick J .
Wong", linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 08/11] vfs: move file_accessed() to the beginning of iterate_dir()
Date: Sun, 27 Aug 2023 21:28:32 +0800
Message-Id: <20230827132835.1373581-9-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>

From: Hao Xu

Move file_accessed() to the beginning of iterate_dir() so that we don't need to roll back all the work done when file_accessed() returns -EAGAIN at the end of getdents.
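The ordering argument can be illustrated with a small user-space model. It is only a sketch: struct model_dir and the three functions are hypothetical, directory iteration is reduced to advancing a position counter, and model_file_accessed() assumes the -EAGAIN behaviour that a later patch in this series adds to touch_atime().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an open directory. */
struct model_dir {
	long pos;               /* models file->f_pos */
	bool atime_would_block; /* true if an atime update is pending */
};

/* Models file_accessed(file, nowait): refuses to block in nowait mode. */
int model_file_accessed(const struct model_dir *d, bool nowait)
{
	assert(d != NULL);
	return (nowait && d->atime_would_block) ? -EAGAIN : 0;
}

/* Check at the end: the position has already moved when -EAGAIN shows up. */
int model_iterate_check_last(struct model_dir *d, bool nowait)
{
	d->pos += 1; /* entries consumed */
	return model_file_accessed(d, nowait);
}

/* Check at the start: bail out before any entry is consumed. */
int model_iterate_check_first(struct model_dir *d, bool nowait)
{
	int res = model_file_accessed(d, nowait);

	if (res == -EAGAIN)
		return res; /* nothing to undo */
	d->pos += 1;
	return 0;
}
```

In the model, the check-last variant returns -EAGAIN with pos already advanced, so a retry would skip entries unless the caller undid the work; the check-first variant returns -EAGAIN with pos untouched.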
Signed-off-by: Hao Xu
---
 fs/readdir.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/readdir.c b/fs/readdir.c
index 2f4c9c663a39..6469f076ba6e 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -61,6 +61,10 @@ int iterate_dir(struct file *file, struct dir_context *ctx)
 
 	res = -ENOENT;
 	if (!IS_DEADDIR(inode)) {
+		res = file_accessed(file, ctx->flags & DIR_CONTEXT_F_NOWAIT);
+		if (res == -EAGAIN)
+			goto out_unlock;
+
 		ctx->pos = file->f_pos;
 		if (shared)
 			res = file->f_op->iterate_shared(file, ctx);
@@ -68,8 +72,9 @@ int iterate_dir(struct file *file, struct dir_context *ctx)
 			res = file->f_op->iterate(file, ctx);
 		file->f_pos = ctx->pos;
 		fsnotify_access(file);
-		file_accessed(file, ctx->flags & DIR_CONTEXT_F_NOWAIT);
 	}
+
+out_unlock:
 	if (shared)
 		inode_unlock_shared(inode);
 	else

From patchwork Sun Aug 27 13:28:33 2023
X-Patchwork-Submitter: Hao Xu
X-Patchwork-Id: 13367231
From: Hao Xu
To: io-uring@vger.kernel.org, Jens Axboe
Cc: Dominique Martinet, Pavel Begunkov, Christian Brauner, Alexander Viro, Stefan Roesch, Clay Harris, Dave Chinner, "Darrick J . Wong", linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 09/11] vfs: error out -EAGAIN if atime needs to be updated
Date: Sun, 27 Aug 2023 21:28:33 +0800
Message-Id: <20230827132835.1373581-10-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>

From: Hao Xu

To enforce nowait semantics, error out -EAGAIN if atime needs to be updated.
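The resulting decision order in touch_atime() can be sketched as a user-space model. Everything here is hypothetical and simplified: struct model_inode and model_touch_atime() are illustrative names, atime_needs_update() is reduced to a flag, and a failed sb_start_write_trylock() is modelled by the sb_frozen field.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state touch_atime() looks at. */
struct model_inode {
	bool atime_needs_update; /* models atime_needs_update() */
	bool sb_frozen;          /* models sb_start_write_trylock() failing */
	int atime_writes;        /* counts atime updates actually performed */
};

int model_touch_atime(struct model_inode *inode, bool nowait)
{
	assert(inode != NULL);

	if (!inode->atime_needs_update)
		return 0;        /* nothing to write, so nothing can block */

	if (nowait)
		return -EAGAIN;  /* an update is due, but blocking is not allowed */

	if (inode->sb_frozen)
		return 0;        /* the update is skipped silently */

	inode->atime_writes++;
	inode->atime_needs_update = false;
	return 0;
}
```

In this model a nowait caller only ever sees -EAGAIN when an update is actually pending; once a blocking caller has performed the update, a later nowait call returns 0.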
Signed-off-by: Hao Xu
---
 fs/inode.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/inode.c b/fs/inode.c
index e83b836f2d09..32d81be65cf9 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1970,6 +1970,9 @@ int touch_atime(const struct path *path, bool nowait)
 	if (!atime_needs_update(path, inode))
 		return 0;
 
+	if (nowait)
+		return -EAGAIN;
+
 	if (!sb_start_write_trylock(inode->i_sb))
 		return 0;
 

From patchwork Sun Aug 27 13:28:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hao Xu
X-Patchwork-Id: 13367232
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EAFBBC83F12 for ; Sun, 27 Aug 2023 13:36:47 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230478AbjH0NgR (ORCPT ); Sun, 27 Aug 2023 09:36:17 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38226 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230488AbjH0Nfr (ORCPT ); Sun, 27 Aug 2023 09:35:47 -0400
Received: from out-242.mta1.migadu.com (out-242.mta1.migadu.com [95.215.58.242]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E5F11BC; Sun, 27 Aug 2023 06:35:35 -0700 (PDT)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143333; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CL8wzJoAbQ75iZcN0EHSqoGXIWt+IFMvEgcKIhs6zqE=; b=BRJIOv0sJCNWH9arCNNuOEeHNgvBAsJDXsD/9ROslY/E3EvWdOi7NTa+04JucH099rNnul 4aa7v7DjXCaaQfzVNr38jK1tfhfTVUIcRN9rZkXC1mcdwty8x7yMXlzmlXd+zl6Ur2HPps a6OaK6jfFgLdJwLH2JKSlhPncqJxHqY=
From: Hao Xu
To: io-uring@vger.kernel.org, Jens Axboe
Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 10/11] vfs: trylock inode->i_rwsem in iterate_dir() to support nowait
Date: Sun, 27 Aug 2023 21:28:34 +0800
Message-Id: <20230827132835.1373581-11-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Hao Xu

Trylock inode->i_rwsem in iterate_dir() to support nowait semantics,
and return -EAGAIN when the lock is contended.

Signed-off-by: Hao Xu
---
 fs/readdir.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/readdir.c b/fs/readdir.c
index 6469f076ba6e..664ecd9665a1 100644
--- a/fs/readdir.c
+++ b/fs/readdir.c
@@ -43,6 +43,8 @@ int iterate_dir(struct file *file, struct dir_context *ctx)
 	struct inode *inode = file_inode(file);
 	bool shared = false;
 	int res = -ENOTDIR;
+	bool nowait;
+
 	if (file->f_op->iterate_shared)
 		shared = true;
 	else if (!file->f_op->iterate)
@@ -52,16 +54,22 @@ int iterate_dir(struct file *file, struct dir_context *ctx)
 	if (res)
 		goto out;
 
-	if (shared)
-		res = down_read_killable(&inode->i_rwsem);
-	else
-		res = down_write_killable(&inode->i_rwsem);
-	if (res)
+	nowait = ctx->flags & DIR_CONTEXT_F_NOWAIT;
+	if (nowait) {
+		res = shared ? down_read_trylock(&inode->i_rwsem) :
+			       down_write_trylock(&inode->i_rwsem);
+		if (!res)
+			res = -EAGAIN;
+	} else {
+		res = shared ? down_read_killable(&inode->i_rwsem) :
+			       down_write_killable(&inode->i_rwsem);
+	}
+	if (res < 0)
 		goto out;
 
 	res = -ENOENT;
 	if (!IS_DEADDIR(inode)) {
-		res = file_accessed(file, ctx->flags & DIR_CONTEXT_F_NOWAIT);
+		res = file_accessed(file, nowait);
 		if (res == -EAGAIN)
 			goto out_unlock;
 

From patchwork Sun Aug 27 13:28:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hao Xu
X-Patchwork-Id: 13367233
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01CADC83F12 for ; Sun, 27 Aug 2023 13:37:21 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231147AbjH0Ng4 (ORCPT ); Sun, 27 Aug 2023 09:36:56 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36730 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231217AbjH0Ngi (ORCPT ); Sun, 27 Aug 2023 09:36:38 -0400
Received: from out-252.mta1.migadu.com (out-252.mta1.migadu.com [IPv6:2001:41d0:203:375::fc]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 749C51B3; Sun, 27 Aug 2023 06:36:28 -0700 (PDT)
X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers.
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1693143386; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NAMtHKwgj081g8lIPVs72MliioxtTYHj7D9CTN85ELE=; b=u7n0PP5Y7t8pZAzx+5IbBJkZfY9azRKTfFVD9LVMOb4Thl+C2D6j0w7wWcMlmzcA25L4C8 zEudurGp2ODbmGylA6BsBPQ9105gr52QFoDr/T1ABSTboWT6yVGzUbuarS4ljsFfYjiAKI tkX7K8GiO8NfI2EeGspOsslEfgzb2IA=
From: Hao Xu
To: io-uring@vger.kernel.org, Jens Axboe
Cc: Dominique Martinet , Pavel Begunkov , Christian Brauner , Alexander Viro , Stefan Roesch , Clay Harris , Dave Chinner , "Darrick J . Wong" , linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org, linux-cachefs@redhat.com, ecryptfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-unionfs@vger.kernel.org, bpf@vger.kernel.org, netdev@vger.kernel.org, linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-btrfs@vger.kernel.org, codalist@coda.cs.cmu.edu, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-mm@kvack.org, linux-nilfs@vger.kernel.org, devel@lists.orangefs.org, linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-mtd@lists.infradead.org, Wanpeng Li
Subject: [PATCH 11/11] io_uring: add support for getdents
Date: Sun, 27 Aug 2023 21:28:35 +0800
Message-Id: <20230827132835.1373581-12-hao.xu@linux.dev>
In-Reply-To: <20230827132835.1373581-1-hao.xu@linux.dev>
References: <20230827132835.1373581-1-hao.xu@linux.dev>
MIME-Version: 1.0
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID:
X-Mailing-List: linux-block@vger.kernel.org

From: Hao Xu

This adds support for getdents64 to io_uring, acting exactly like the
syscall: the directory is iterated from its current position as stored
in the file struct, and the file's position is updated exactly as if
getdents64 had been called.

For filesystems that support NOWAIT in iterate_shared(), try to use it
first; if a user already knows that the filesystem they use does not
support nowait, they can force async through IOSQE_ASYNC in the sqe
flags, avoiding the need to bounce back through a useless EAGAIN
return.

Co-developed-by: Dominique Martinet
Signed-off-by: Dominique Martinet
Signed-off-by: Hao Xu
---
 include/uapi/linux/io_uring.h | 1 +
 io_uring/fs.c | 53 +++++++++++++++++++++++++++++++++++
 io_uring/fs.h | 3 ++
 io_uring/opdef.c | 8 ++++++
 4 files changed, 65 insertions(+)

diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 8e61f8b7c2ce..3896397a1998 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -240,6 +240,7 @@ enum io_uring_op {
 	IORING_OP_URING_CMD,
 	IORING_OP_SEND_ZC,
 	IORING_OP_SENDMSG_ZC,
+	IORING_OP_GETDENTS,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
diff --git a/io_uring/fs.c b/io_uring/fs.c
index f6a69a549fd4..04711feac4e6 100644
--- a/io_uring/fs.c
+++ b/io_uring/fs.c
@@ -47,6 +47,12 @@ struct io_link {
 	int flags;
 };
 
+struct io_getdents {
+	struct file *file;
+	struct linux_dirent64 __user *dirent;
+	unsigned int count;
+};
+
 int io_renameat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_rename *ren = io_kiocb_to_cmd(req, struct io_rename);
@@ -291,3 +297,50 @@ void io_link_cleanup(struct io_kiocb *req)
 	putname(sl->oldpath);
 	putname(sl->newpath);
 }
+
+int io_getdents_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct io_getdents *gd = io_kiocb_to_cmd(req, struct io_getdents);
+
+	if (READ_ONCE(sqe->off))
+		return -EINVAL;
+
+	gd->dirent = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	gd->count = READ_ONCE(sqe->len);
+
+	return 0;
+}
+
+int io_getdents(struct io_kiocb *req, unsigned int issue_flags)
+{
+	struct io_getdents *gd = io_kiocb_to_cmd(req, struct io_getdents);
+	struct file *file = req->file;
+	unsigned long getdents_flags = 0;
+	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
+	bool locked;
+	int ret;
+
+	if (force_nonblock) {
+		if (!(file->f_flags & O_NONBLOCK) &&
+		    !(file->f_mode & FMODE_NOWAIT))
+			return -EAGAIN;
+
+		getdents_flags = DIR_CONTEXT_F_NOWAIT;
+	}
+
+	ret = file_pos_lock_nowait(file, force_nonblock);
+	if (ret == -EAGAIN)
+		return ret;
+	locked = ret;
+
+	ret = vfs_getdents(file, gd->dirent, gd->count, getdents_flags);
+	if (locked)
+		file_pos_unlock(file);
+
+	if (ret == -EAGAIN && force_nonblock)
+		return -EAGAIN;
+
+	io_req_set_res(req, ret, 0);
+	return 0;
+}
+
diff --git a/io_uring/fs.h b/io_uring/fs.h
index 0bb5efe3d6bb..f83a6f3a678d 100644
--- a/io_uring/fs.h
+++ b/io_uring/fs.h
@@ -18,3 +18,6 @@ int io_symlinkat(struct io_kiocb *req, unsigned int issue_flags);
 int io_linkat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
 int io_linkat(struct io_kiocb *req, unsigned int issue_flags);
 void io_link_cleanup(struct io_kiocb *req);
+
+int io_getdents_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe);
+int io_getdents(struct io_kiocb *req, unsigned int issue_flags);
diff --git a/io_uring/opdef.c b/io_uring/opdef.c
index 3b9c6489b8b6..1bae6b2a8d0b 100644
--- a/io_uring/opdef.c
+++ b/io_uring/opdef.c
@@ -428,6 +428,11 @@ const struct io_issue_def io_issue_defs[] = {
 		.prep = io_eopnotsupp_prep,
 #endif
 	},
+	[IORING_OP_GETDENTS] = {
+		.needs_file = 1,
+		.prep = io_getdents_prep,
+		.issue = io_getdents,
+	},
 };
 
 
@@ -648,6 +653,9 @@ const struct io_cold_def io_cold_defs[] = {
 		.fail = io_sendrecv_fail,
 #endif
 	},
+	[IORING_OP_GETDENTS] = {
+		.name = "GETDENTS",
+	},
 };
 
 const char *io_uring_get_opcode(u8 opcode)