diff mbox series

[v3,09/12] xfs: Add xfs_file_dio_write_atomic()

Message ID 20250227180813.1553404-10-john.g.garry@oracle.com (mailing list archive)
State New
Headers show
Series large atomic writes for xfs with CoW | expand

Commit Message

John Garry Feb. 27, 2025, 6:08 p.m. UTC
Add xfs_file_dio_write_atomic() for dedicated handling of atomic writes.

In case of -EAGAIN being returned from iomap_dio_rw(), reissue the write
in CoW-based atomic write mode.

For CoW-based mode, ensure that we have no outstanding IOs which we
may trample on.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 fs/xfs/xfs_file.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

Comments

Darrick J. Wong Feb. 28, 2025, 1:19 a.m. UTC | #1
On Thu, Feb 27, 2025 at 06:08:10PM +0000, John Garry wrote:
> Add xfs_file_dio_write_atomic() for dedicated handling of atomic writes.
> 
> In case of -EAGAIN being returned from iomap_dio_rw(), reissue the write
> in CoW-based atomic write mode.
> 
> For CoW-based mode, ensure that we have no outstanding IOs which we
> may trample on.
> 
> Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>  fs/xfs/xfs_file.c | 42 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 258c82cbce12..76ea59c638c3 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -619,6 +619,46 @@ xfs_file_dio_write_aligned(
>  	return ret;
>  }
>  
> +static noinline ssize_t
> +xfs_file_dio_write_atomic(
> +	struct xfs_inode	*ip,
> +	struct kiocb		*iocb,
> +	struct iov_iter		*from)
> +{
> +	unsigned int		iolock = XFS_IOLOCK_SHARED;
> +	unsigned int		dio_flags = 0;
> +	ssize_t			ret;
> +
> +retry:
> +	ret = xfs_ilock_iocb_for_write(iocb, &iolock);
> +	if (ret)
> +		return ret;
> +
> +	ret = xfs_file_write_checks(iocb, from, &iolock);
> +	if (ret)
> +		goto out_unlock;
> +
> +	if (dio_flags & IOMAP_DIO_FORCE_WAIT)
> +		inode_dio_wait(VFS_I(ip));
> +
> +	trace_xfs_file_direct_write(iocb, from);
> +	ret = iomap_dio_rw(iocb, from, &xfs_atomic_write_iomap_ops,
> +			&xfs_dio_write_ops, dio_flags, NULL, 0);
> +
> +	if (ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT) &&
> +	    !(dio_flags & IOMAP_DIO_ATOMIC_SW)) {
> +		xfs_iunlock(ip, iolock);
> +		dio_flags = IOMAP_DIO_ATOMIC_SW | IOMAP_DIO_FORCE_WAIT;

One last little nit here: if the filesystem doesn't have reflink, you
can't use copy on write as a fallback.

		/*
		 * The atomic write fallback uses out of place writes
		 * implemented with the COW code, so we must fail the
		 * atomic write if that is not supported.
		 */
		if (!xfs_has_reflink(ip->i_mount))
			return -EOPNOTSUPP;
		dio_flags = IOMAP_DIO_ATOMIC_SW | IOMAP_DIO_FORCE_WAIT;

You can retain my RVB if you add that.

--D

> +		iolock = XFS_IOLOCK_EXCL;
> +		goto retry;
> +	}
> +
> +out_unlock:
> +	if (iolock)
> +		xfs_iunlock(ip, iolock);
> +	return ret;
> +}
> +
>  /*
>   * Handle block unaligned direct I/O writes
>   *
> @@ -723,6 +763,8 @@ xfs_file_dio_write(
>  		return -EINVAL;
>  	if ((iocb->ki_pos | count) & ip->i_mount->m_blockmask)
>  		return xfs_file_dio_write_unaligned(ip, iocb, from);
> +	if (iocb->ki_flags & IOCB_ATOMIC)
> +		return xfs_file_dio_write_atomic(ip, iocb, from);
>  	return xfs_file_dio_write_aligned(ip, iocb, from);
>  }
>  
> -- 
> 2.31.1
>
John Garry Feb. 28, 2025, 7:45 a.m. UTC | #2
On 28/02/2025 01:19, Darrick J. Wong wrote:
>> +	if (ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT) &&
>> +	    !(dio_flags & IOMAP_DIO_ATOMIC_SW)) {
>> +		xfs_iunlock(ip, iolock);
>> +		dio_flags = IOMAP_DIO_ATOMIC_SW | IOMAP_DIO_FORCE_WAIT;
> One last little nit here: if the filesystem doesn't have reflink, you
> can't use copy on write as a fallback.
> 
> 		/*
> 		 * The atomic write fallback uses out of place writes
> 		 * implemented with the COW code, so we must fail the
> 		 * atomic write if that is not supported.
> 		 */
> 		if (!xfs_has_reflink(ip->i_mount))
> 			return -EOPNOTSUPP;
> 		dio_flags = IOMAP_DIO_ATOMIC_SW | IOMAP_DIO_FORCE_WAIT;
> 

Currently the awu max is limited to 1x FS block if no reflink, and then 
we check the write length against awu max in xfs_file_write_iter() for 
IOCB_ATOMIC. And the xfs iomap would not request a SW-based atomic write 
for 1x FS block. So in a around-about way we are checking it.

So let me know if you would still like that additional check - it seems 
sensible to add it.

Cheers,
John
Darrick J. Wong Feb. 28, 2025, 3:39 p.m. UTC | #3
On Fri, Feb 28, 2025 at 07:45:59AM +0000, John Garry wrote:
> On 28/02/2025 01:19, Darrick J. Wong wrote:
> > > +	if (ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT) &&
> > > +	    !(dio_flags & IOMAP_DIO_ATOMIC_SW)) {
> > > +		xfs_iunlock(ip, iolock);
> > > +		dio_flags = IOMAP_DIO_ATOMIC_SW | IOMAP_DIO_FORCE_WAIT;
> > One last little nit here: if the filesystem doesn't have reflink, you
> > can't use copy on write as a fallback.
> > 
> > 		/*
> > 		 * The atomic write fallback uses out of place writes
> > 		 * implemented with the COW code, so we must fail the
> > 		 * atomic write if that is not supported.
> > 		 */
> > 		if (!xfs_has_reflink(ip->i_mount))
> > 			return -EOPNOTSUPP;
> > 		dio_flags = IOMAP_DIO_ATOMIC_SW | IOMAP_DIO_FORCE_WAIT;
> > 
> 
> Currently the awu max is limited to 1x FS block if no reflink, and then we
> check the write length against awu max in xfs_file_write_iter() for
> IOCB_ATOMIC. And the xfs iomap would not request a SW-based atomic write for
> 1x FS block. So in a around-about way we are checking it.
> 
> So let me know if you would still like that additional check - it seems
> sensible to add it.

Yes, please.  The more guardrails the better, particularly when someone
gets around to enabling software-only RWF_ATOMIC.

--D

> Cheers,
> John
> 
>
John Garry March 3, 2025, 1:45 p.m. UTC | #4
On 28/02/2025 15:39, Darrick J. Wong wrote:
>>> One last little nit here: if the filesystem doesn't have reflink, you
>>> can't use copy on write as a fallback.
>>>
>>> 		/*
>>> 		 * The atomic write fallback uses out of place writes
>>> 		 * implemented with the COW code, so we must fail the
>>> 		 * atomic write if that is not supported.
>>> 		 */
>>> 		if (!xfs_has_reflink(ip->i_mount))
>>> 			return -EOPNOTSUPP;
>>> 		dio_flags = IOMAP_DIO_ATOMIC_SW | IOMAP_DIO_FORCE_WAIT;
>>>
>> Currently the awu max is limited to 1x FS block if no reflink, and then we
>> check the write length against awu max in xfs_file_write_iter() for
>> IOCB_ATOMIC. And the xfs iomap would not request a SW-based atomic write for
>> 1x FS block. So in a around-about way we are checking it.
>>
>> So let me know if you would still like that additional check - it seems
>> sensible to add it.
> Yes, please.  The more guardrails the better, particularly when someone
> gets around to enabling software-only RWF_ATOMIC.

ok, but I think that adding it at the start of 
xfs_atomic_write_sw_iomap_begin() is a better place (to add it).

It seems a bit neater (than adding it here) with the retry handling and 
locking/unlocking, and would save adding it in another possible future 
callsite for xfs_atomic_write_sw_iomap_begin().

Cheers,
John
diff mbox series

Patch

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 258c82cbce12..76ea59c638c3 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -619,6 +619,46 @@  xfs_file_dio_write_aligned(
 	return ret;
 }
 
+static noinline ssize_t
+xfs_file_dio_write_atomic(
+	struct xfs_inode	*ip,
+	struct kiocb		*iocb,
+	struct iov_iter		*from)
+{
+	unsigned int		iolock = XFS_IOLOCK_SHARED;
+	unsigned int		dio_flags = 0;
+	ssize_t			ret;
+
+retry:
+	ret = xfs_ilock_iocb_for_write(iocb, &iolock);
+	if (ret)
+		return ret;
+
+	ret = xfs_file_write_checks(iocb, from, &iolock);
+	if (ret)
+		goto out_unlock;
+
+	if (dio_flags & IOMAP_DIO_FORCE_WAIT)
+		inode_dio_wait(VFS_I(ip));
+
+	trace_xfs_file_direct_write(iocb, from);
+	ret = iomap_dio_rw(iocb, from, &xfs_atomic_write_iomap_ops,
+			&xfs_dio_write_ops, dio_flags, NULL, 0);
+
+	if (ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT) &&
+	    !(dio_flags & IOMAP_DIO_ATOMIC_SW)) {
+		xfs_iunlock(ip, iolock);
+		dio_flags = IOMAP_DIO_ATOMIC_SW | IOMAP_DIO_FORCE_WAIT;
+		iolock = XFS_IOLOCK_EXCL;
+		goto retry;
+	}
+
+out_unlock:
+	if (iolock)
+		xfs_iunlock(ip, iolock);
+	return ret;
+}
+
 /*
  * Handle block unaligned direct I/O writes
  *
@@ -723,6 +763,8 @@  xfs_file_dio_write(
 		return -EINVAL;
 	if ((iocb->ki_pos | count) & ip->i_mount->m_blockmask)
 		return xfs_file_dio_write_unaligned(ip, iocb, from);
+	if (iocb->ki_flags & IOCB_ATOMIC)
+		return xfs_file_dio_write_atomic(ip, iocb, from);
 	return xfs_file_dio_write_aligned(ip, iocb, from);
 }