Message ID | 20170524164150.9492-10-rgoldwyn@suse.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, May 24, 2017 at 11:41:49AM -0500, Goldwyn Rodrigues wrote: > From: Goldwyn Rodrigues <rgoldwyn@suse.com> > > If IOCB_NOWAIT is set, bail if the i_rwsem is not lockable > immediately. > > IF IOMAP_NOWAIT is set, return EAGAIN in xfs_file_iomap_begin > if it needs allocation either due to file extension, writing to a hole, > or COW or waiting for other DIOs to finish. > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > Reviewed-by: Christoph Hellwig <hch@lst.de> Looks good, Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > --- > fs/xfs/xfs_file.c | 19 ++++++++++++++----- > fs/xfs/xfs_iomap.c | 17 +++++++++++++++++ > 2 files changed, 31 insertions(+), 5 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 35703a801372..b307940e7d56 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -541,8 +541,11 @@ xfs_file_dio_aio_write( > iolock = XFS_IOLOCK_SHARED; > } > > - xfs_ilock(ip, iolock); > - > + if (!xfs_ilock_nowait(ip, iolock)) { > + if (iocb->ki_flags & IOCB_NOWAIT) > + return -EAGAIN; > + xfs_ilock(ip, iolock); > + } > ret = xfs_file_aio_write_checks(iocb, from, &iolock); > if (ret) > goto out; > @@ -553,9 +556,15 @@ xfs_file_dio_aio_write( > * otherwise demote the lock if we had to take the exclusive lock > * for other reasons in xfs_file_aio_write_checks. > */ > - if (unaligned_io) > - inode_dio_wait(inode); > - else if (iolock == XFS_IOLOCK_EXCL) { > + if (unaligned_io) { > + /* If we are going to wait for other DIO to finish, bail */ > + if (iocb->ki_flags & IOCB_NOWAIT) { > + if (atomic_read(&inode->i_dio_count)) > + return -EAGAIN; > + } else { > + inode_dio_wait(inode); > + } > + } else if (iolock == XFS_IOLOCK_EXCL) { > xfs_ilock_demote(ip, XFS_IOLOCK_EXCL); > iolock = XFS_IOLOCK_SHARED; > } > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c > index 94e5bdf7304c..8b0e3c1e086d 100644 > --- a/fs/xfs/xfs_iomap.c > +++ b/fs/xfs/xfs_iomap.c > @@ -1016,6 +1016,15 @@ xfs_file_iomap_begin( > > if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { > if (flags & IOMAP_DIRECT) { > + /* > + * A reflinked inode will result in CoW alloc. > + * FIXME: It could still overwrite on unshared extents > + * and not need allocation. > + */ > + if (flags & IOMAP_NOWAIT) { > + error = -EAGAIN; > + goto out_unlock; > + } > /* may drop and re-acquire the ilock */ > error = xfs_reflink_allocate_cow(ip, &imap, &shared, > &lockmode); > @@ -1033,6 +1042,14 @@ xfs_file_iomap_begin( > > if ((flags & IOMAP_WRITE) && imap_needs_alloc(inode, &imap, nimaps)) { > /* > + * If nowait is set bail since we are going to make > + * allocations. > + */ > + if (flags & IOMAP_NOWAIT) { > + error = -EAGAIN; > + goto out_unlock; > + } > + /* > * We cap the maximum length we map here to MAX_WRITEBACK_PAGES > * pages to keep the chunks of work done where somewhat symmetric > * with the work writeback does. This is a completely arbitrary > -- > 2.12.0 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Despite my previous reviewed-by tag this will need another fix: xfs_file_iomap_begin needs to return EAGAIN if we don't have the extent list in memoery already. E.g. something like this: if ((flags & IOMAP_NOWAIT) && !(ip->i_d.if_flags & XFS_IFEXTENTS)) { error = -EAGAIN; goto out_unlock; } right after locking the ilock.
On 05/28/2017 04:31 AM, Christoph Hellwig wrote: > Despite my previous reviewed-by tag this will need another fix: > > xfs_file_iomap_begin needs to return EAGAIN if we don't have the extent > list in memoery already. E.g. something like this: > > if ((flags & IOMAP_NOWAIT) && !(ip->i_d.if_flags & XFS_IFEXTENTS)) { > error = -EAGAIN; > goto out_unlock; > } > > right after locking the ilock. > I am not sure if it is right to penalize the application to write to file which has been freshly opened (and is the first one to open). It basically means extent maps needs to be read from disk. Do you see a reason it would have a non-deterministic wait if it is the only user? I understand the block layer can block if it has too many requests though.
On Sun 28-05-17 21:38:26, Goldwyn Rodrigues wrote: > On 05/28/2017 04:31 AM, Christoph Hellwig wrote: > > Despite my previous reviewed-by tag this will need another fix: > > > > xfs_file_iomap_begin needs to return EAGAIN if we don't have the extent > > list in memoery already. E.g. something like this: > > > > if ((flags & IOMAP_NOWAIT) && !(ip->i_d.if_flags & XFS_IFEXTENTS)) { > > error = -EAGAIN; > > goto out_unlock; > > } > > > > right after locking the ilock. > > I am not sure if it is right to penalize the application to write to > file which has been freshly opened (and is the first one to open). It > basically means extent maps needs to be read from disk. Do you see a > reason it would have a non-deterministic wait if it is the only user? I > understand the block layer can block if it has too many requests though. Well, submitting such write will have to wait for read of metadata from disk. That is certainly considered blocking so Christoph is right that we must return EAGAIN in such case. Honza
On Sun, May 28, 2017 at 09:38:26PM -0500, Goldwyn Rodrigues wrote: > > > On 05/28/2017 04:31 AM, Christoph Hellwig wrote: > > Despite my previous reviewed-by tag this will need another fix: > > > > xfs_file_iomap_begin needs to return EAGAIN if we don't have the extent > > list in memoery already. E.g. something like this: > > > > if ((flags & IOMAP_NOWAIT) && !(ip->i_d.if_flags & XFS_IFEXTENTS)) { > > error = -EAGAIN; > > goto out_unlock; > > } > > > > right after locking the ilock. > > > > I am not sure if it is right to penalize the application to write to > file which has been freshly opened (and is the first one to open). It > basically means extent maps needs to be read from disk. Do you see a > reason it would have a non-deterministic wait if it is the only user? I > understand the block layer can block if it has too many requests though. For either a read or a write we might have to read in the extent list (note that for few enough extents they are stored in the inode and we won't have to), in which case the call will block and by the semantics you define we'll need to return -EAGAIN. Btw, can you write a small blurb up for the man page to document these ѕemantics in man-page like language?
On 05/29/2017 03:33 AM, Christoph Hellwig wrote: > On Sun, May 28, 2017 at 09:38:26PM -0500, Goldwyn Rodrigues wrote: >> >> >> On 05/28/2017 04:31 AM, Christoph Hellwig wrote: >>> Despite my previous reviewed-by tag this will need another fix: >>> >>> xfs_file_iomap_begin needs to return EAGAIN if we don't have the extent >>> list in memoery already. E.g. something like this: >>> >>> if ((flags & IOMAP_NOWAIT) && !(ip->i_d.if_flags & XFS_IFEXTENTS)) { >>> error = -EAGAIN; >>> goto out_unlock; >>> } >>> >>> right after locking the ilock. >>> >> >> I am not sure if it is right to penalize the application to write to >> file which has been freshly opened (and is the first one to open). It >> basically means extent maps needs to be read from disk. Do you see a >> reason it would have a non-deterministic wait if it is the only user? I >> understand the block layer can block if it has too many requests though. > > For either a read or a write we might have to read in the extent list > (note that for few enough extents they are stored in the inode and > we won't have to), in which case the call will block and by the > semantics you define we'll need to return -EAGAIN. Yes, that is right. I will include it in. > > Btw, can you write a small blurb up for the man page to document these > ѕemantics in man-page like language? > Yes, but which man page would it belong to? Should it be a subsection of errors in io_getevents/io_submit. We don't want to add ERRORS to io_getevents() because it would be the return value of the io_getevents call, and not the ones in the iocb structure. Should it be a new man page, say for iocb(7/8)?
On Tue 30-05-17 11:13:29, Goldwyn Rodrigues wrote: > > Btw, can you write a small blurb up for the man page to document these > > ѕemantics in man-page like language? > > > > Yes, but which man page would it belong to? > Should it be a subsection of errors in io_getevents/io_submit. We don't > want to add ERRORS to io_getevents() because it would be the return > value of the io_getevents call, and not the ones in the iocb structure. > Should it be a new man page, say for iocb(7/8)? I think you should extend the manpage for io_submit(8). There you can add definition of struct iocb in 'DESCRIPTION' section explaining at least the most common fields. You can also explain there which flags can be passed and what are they intended to do. You can also expand EAGAIN error description to specifically mention that in case of NOWAIT aio EAGAIN can be returned if io submission would block. Honza
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 35703a801372..b307940e7d56 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -541,8 +541,11 @@ xfs_file_dio_aio_write( iolock = XFS_IOLOCK_SHARED; } - xfs_ilock(ip, iolock); - + if (!xfs_ilock_nowait(ip, iolock)) { + if (iocb->ki_flags & IOCB_NOWAIT) + return -EAGAIN; + xfs_ilock(ip, iolock); + } ret = xfs_file_aio_write_checks(iocb, from, &iolock); if (ret) goto out; @@ -553,9 +556,15 @@ xfs_file_dio_aio_write( * otherwise demote the lock if we had to take the exclusive lock * for other reasons in xfs_file_aio_write_checks. */ - if (unaligned_io) - inode_dio_wait(inode); - else if (iolock == XFS_IOLOCK_EXCL) { + if (unaligned_io) { + /* If we are going to wait for other DIO to finish, bail */ + if (iocb->ki_flags & IOCB_NOWAIT) { + if (atomic_read(&inode->i_dio_count)) + return -EAGAIN; + } else { + inode_dio_wait(inode); + } + } else if (iolock == XFS_IOLOCK_EXCL) { xfs_ilock_demote(ip, XFS_IOLOCK_EXCL); iolock = XFS_IOLOCK_SHARED; } diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c index 94e5bdf7304c..8b0e3c1e086d 100644 --- a/fs/xfs/xfs_iomap.c +++ b/fs/xfs/xfs_iomap.c @@ -1016,6 +1016,15 @@ xfs_file_iomap_begin( if ((flags & (IOMAP_WRITE | IOMAP_ZERO)) && xfs_is_reflink_inode(ip)) { if (flags & IOMAP_DIRECT) { + /* + * A reflinked inode will result in CoW alloc. + * FIXME: It could still overwrite on unshared extents + * and not need allocation. + */ + if (flags & IOMAP_NOWAIT) { + error = -EAGAIN; + goto out_unlock; + } /* may drop and re-acquire the ilock */ error = xfs_reflink_allocate_cow(ip, &imap, &shared, &lockmode); @@ -1033,6 +1042,14 @@ xfs_file_iomap_begin( if ((flags & IOMAP_WRITE) && imap_needs_alloc(inode, &imap, nimaps)) { /* + * If nowait is set bail since we are going to make + * allocations. + */ + if (flags & IOMAP_NOWAIT) { + error = -EAGAIN; + goto out_unlock; + } + /* * We cap the maximum length we map here to MAX_WRITEBACK_PAGES * pages to keep the chunks of work done where somewhat symmetric * with the work writeback does. This is a completely arbitrary