diff mbox series

[v2,6/7] iomap: advance the iter directly on unshare range

Message ID 20250122133434.535192-7-bfoster@redhat.com (mailing list archive)
State New
Headers show
Series iomap: incremental per-operation iter advance | expand

Commit Message

Brian Foster Jan. 22, 2025, 1:34 p.m. UTC
Modify unshare range to advance the iter directly. Replace the local
pos and length calculations with direct advances and loop based on
iter state instead.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/iomap/buffered-io.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

Comments

Christoph Hellwig Jan. 28, 2025, 5:39 a.m. UTC | #1
On Wed, Jan 22, 2025 at 08:34:33AM -0500, Brian Foster wrote:
> +	size_t bytes = iomap_length(iter);

> +		bytes = min_t(u64, SIZE_MAX, bytes);

bytes needs to be a u64 for the min logic to work on 32-bit systems.
Brian Foster Jan. 28, 2025, 1:57 p.m. UTC | #2
On Mon, Jan 27, 2025 at 09:39:01PM -0800, Christoph Hellwig wrote:
> On Wed, Jan 22, 2025 at 08:34:33AM -0500, Brian Foster wrote:
> > +	size_t bytes = iomap_length(iter);
> 
> > +		bytes = min_t(u64, SIZE_MAX, bytes);
> 
> bytes needs to be a u64 for the min logic to work on 32-bit systems.
> 

Ah, thanks. FYI, I also have the following change from followon work to
fold into this to completely remove advances via iter.processed:

-       if (!iomap_want_unshare_iter(iter))
-               return bytes;
+       if (!iomap_want_unshare_iter(iter)) {
+               iomap_iter_advance(iter, bytes);
+               return 0;
+       }

And the analogous change in the next patch for zero range (unwritten &&
!range_dirty) as well.

Finally, I'm still working through converting the rest of the ops to use
iomap_iter_advance(), but I was thinking about renaming iter.processed
to iter.status as a final step. Thoughts on a rename in general or on
the actual name?

Brian
Brian Foster Jan. 28, 2025, 5:59 p.m. UTC | #3
On Mon, Jan 27, 2025 at 09:39:01PM -0800, Christoph Hellwig wrote:
> On Wed, Jan 22, 2025 at 08:34:33AM -0500, Brian Foster wrote:
> > +	size_t bytes = iomap_length(iter);
> 
> > +		bytes = min_t(u64, SIZE_MAX, bytes);
> 
> bytes needs to be a u64 for the min logic to work on 32-bit systems.
> 

Err.. I think there's another bug here. I changed iomap_iter_advance()
to return s64 so it could return length or an error, but never changed
bytes over from size_t.

But that raises another question. I'd want bytes to be s64 here to
support the current factoring, but iomap_length() returns a u64. In
poking around a bit I _think_ this is practically safe because the high
level operations are bound by loff_t (int64_t), so IIUC that means we
shouldn't actually see a length that doesn't fit in s64.

That said, that still seems a bit grotty. Perhaps one option could be to
tweak iomap_length() to return something like this:

	min_t(u64, SSIZE_MAX, end);

... to at least makes things explicit.

Another option could be to rework advance back to something like:

	int iomap_iter_advance(..., u64 *count);

... but where it returns 0 or -EIO and advances/updates *count directly.
That would mean I'd have to tweak some of the loop factoring and lift
out the error passthru assignment logic from iomap_iter(). The latter
doesn't seem like a big deal. It's mostly pointless after these changes.
I'd guess the (i.e. iomap_file_unshare()) loop logic would look more
like:

	do {
		...
		ret = iomap_iter_advance(iter, &bytes);
	} while (!ret && bytes > 0);

	return ret;

Hmm.. now that I write it out that doesn't seem so bad. It does clean up
the return path a bit. I think I'll play around with that, but let me
know if there are other thoughts or ideas..

Brian
Christoph Hellwig Jan. 29, 2025, 5:56 a.m. UTC | #4
On Tue, Jan 28, 2025 at 12:59:09PM -0500, Brian Foster wrote:
> But that raises another question. I'd want bytes to be s64 here to
> support the current factoring, but iomap_length() returns a u64. In
> poking around a bit I _think_ this is practically safe because the high
> level operations are bound by loff_t (int64_t), so IIUC that means we
> shouldn't actually see a length that doesn't fit in s64.
> 
> That said, that still seems a bit grotty. Perhaps one option could be to
> tweak iomap_length() to return something like this:
> 
> 	min_t(u64, SSIZE_MAX, end);
> 
> ... to at least makes things explicit.

Yeah.  I'm actually not sure why went want to support 64-bit ranges.
I don't even remember if this comes from Dave's really first version
or was my idea, but in hindsight just sticking to ssize_t bounds
would have been smarter.

> I'd guess the (i.e. iomap_file_unshare()) loop logic would look more
> like:
> 
> 	do {
> 		...
> 		ret = iomap_iter_advance(iter, &bytes);
> 	} while (!ret && bytes > 0);
> 
> 	return ret;
> 
> Hmm.. now that I write it out that doesn't seem so bad. It does clean up
> the return path a bit. I think I'll play around with that, but let me
> know if there are other thoughts or ideas..

Given that all the kernel read/write code mixes up bytes and negative
return values I think doing that in iomap is also fine.  But you are
deeper into the code right now, and if you think splitting the errno
and bytes is cleaner that sounds perfectly fine to me as well.  In
general not overloading a single return value with two things tends
to lead to cleaner code.

Although the above sniplet (I´m not sure how representative it is
anyway) would be a bit nicer as the slightly more verbose version
below:

	do {
		...
		ret = iomap_iter_advance(iter, &bytes);
		if (ret)
			return ret;
	} while (bytes > 0);
Christoph Hellwig Jan. 29, 2025, 5:58 a.m. UTC | #5
On Tue, Jan 28, 2025 at 08:57:27AM -0500, Brian Foster wrote:
> And the analogous change in the next patch for zero range (unwritten &&
> !range_dirty) as well.
> 
> Finally, I'm still working through converting the rest of the ops to use
> iomap_iter_advance(), but I was thinking about renaming iter.processed
> to iter.status as a final step. Thoughts on a rename in general or on
> the actual name?

Yeah, having a processed with either a count or status has proven to
not be the greatest design ever, and once that is gone picking a better
name is a good idea.  status sounds fine.  The variables tend to be
name error or err, maybe that's a little better than status?  I don't
really care strongly either way.
Brian Foster Jan. 29, 2025, 4:40 p.m. UTC | #6
On Tue, Jan 28, 2025 at 09:56:10PM -0800, Christoph Hellwig wrote:
> On Tue, Jan 28, 2025 at 12:59:09PM -0500, Brian Foster wrote:
> > But that raises another question. I'd want bytes to be s64 here to
> > support the current factoring, but iomap_length() returns a u64. In
> > poking around a bit I _think_ this is practically safe because the high
> > level operations are bound by loff_t (int64_t), so IIUC that means we
> > shouldn't actually see a length that doesn't fit in s64.
> > 
> > That said, that still seems a bit grotty. Perhaps one option could be to
> > tweak iomap_length() to return something like this:
> > 
> > 	min_t(u64, SSIZE_MAX, end);
> > 
> > ... to at least makes things explicit.
> 
> Yeah.  I'm actually not sure why went want to support 64-bit ranges.
> I don't even remember if this comes from Dave's really first version
> or was my idea, but in hindsight just sticking to ssize_t bounds
> would have been smarter.
> 

Ok, thanks.

> > I'd guess the (i.e. iomap_file_unshare()) loop logic would look more
> > like:
> > 
> > 	do {
> > 		...
> > 		ret = iomap_iter_advance(iter, &bytes);
> > 	} while (!ret && bytes > 0);
> > 
> > 	return ret;
> > 
> > Hmm.. now that I write it out that doesn't seem so bad. It does clean up
> > the return path a bit. I think I'll play around with that, but let me
> > know if there are other thoughts or ideas..
> 
> Given that all the kernel read/write code mixes up bytes and negative
> return values I think doing that in iomap is also fine.  But you are
> deeper into the code right now, and if you think splitting the errno
> and bytes is cleaner that sounds perfectly fine to me as well.  In
> general not overloading a single return value with two things tends
> to lead to cleaner code.
> 

Eh, I like the factoring that the combined return allows better, but I
don't want to get too clever and introduce type issues and whatnot in
the middle of these patches if I can help it. From what I see so far the
change to split out the error return uglifies things slightly in
iomap_iter(), but the flipside is that with the error check lifted out
the advance call from iomap_iter() can go away completely once
everything is switched over.

So if we do go with the int return for now (still testing), I might
revisit a change back to a combined s64 return (perhaps along with the
iomap_length() tweak above) in the future as a standalone cleanup when
this is all more settled and I have more mental bandwidth to think about
it. Thanks for the input.

> Although the above sniplet (I´m not sure how representative it is
> anyway) would be a bit nicer as the slightly more verbose version
> below:
> 
> 	do {
> 		...
> 		ret = iomap_iter_advance(iter, &bytes);
> 		if (ret)
> 			return ret;
> 	} while (bytes > 0);
> 

Ack.

Brian
diff mbox series

Patch

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 5ce5ac13765a..ea140d3098ff 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -1267,20 +1267,19 @@  EXPORT_SYMBOL_GPL(iomap_write_delalloc_release);
 static loff_t iomap_unshare_iter(struct iomap_iter *iter)
 {
 	struct iomap *iomap = &iter->iomap;
-	loff_t pos = iter->pos;
-	loff_t length = iomap_length(iter);
-	loff_t written = 0;
+	size_t bytes = iomap_length(iter);
 
 	if (!iomap_want_unshare_iter(iter))
-		return length;
+		return bytes;
 
 	do {
 		struct folio *folio;
 		int status;
 		size_t offset;
-		size_t bytes = min_t(u64, SIZE_MAX, length);
+		loff_t pos = iter->pos;
 		bool ret;
 
+		bytes = min_t(u64, SIZE_MAX, bytes);
 		status = iomap_write_begin(iter, pos, bytes, &folio);
 		if (unlikely(status))
 			return status;
@@ -1298,14 +1297,10 @@  static loff_t iomap_unshare_iter(struct iomap_iter *iter)
 
 		cond_resched();
 
-		pos += bytes;
-		written += bytes;
-		length -= bytes;
-
 		balance_dirty_pages_ratelimited(iter->inode->i_mapping);
-	} while (length > 0);
+	} while ((bytes = iomap_iter_advance(iter, bytes)) > 0);
 
-	return written;
+	return bytes < 0 ? bytes : 0;
 }
 
 int