mbox series

[v2,0/2] iter revert problems

Message ID cover.1628780390.git.asml.silence@gmail.com (mailing list archive)
Headers show
Series iter revert problems | expand

Message

Pavel Begunkov Aug. 12, 2021, 8:40 p.m. UTC
For the bug description see 2/2. As mentioned there the current problems
is because of generic_write_checks(), but there was also a similar case
fixed in 5.12, which should have been triggerable by normal
write(2)/read(2) and others.

It may be better to enforce reexpands as a long term solution, but for
now this patchset is quickier and easier to backport.

v2: don't fail it has been justly fully reverted

Pavel Begunkov (2):
  iov_iter: mark truncated iters
  io_uring: don't retry with truncated iter

 fs/io_uring.c       | 16 ++++++++++++++++
 include/linux/uio.h |  5 ++++-
 2 files changed, 20 insertions(+), 1 deletion(-)

Comments

Jens Axboe Aug. 16, 2021, 3:35 p.m. UTC | #1
On 8/12/21 2:40 PM, Pavel Begunkov wrote:
> For the bug description see 2/2. As mentioned there the current problems
> is because of generic_write_checks(), but there was also a similar case
> fixed in 5.12, which should have been triggerable by normal
> write(2)/read(2) and others.
> 
> It may be better to enforce reexpands as a long term solution, but for
> now this patchset is quickier and easier to backport.
> 
> v2: don't fail it has been justly fully reverted

Al, what do you think of this approach? It'll fix the issue, but might be
cleaner to have iov->truncated actually track the truncated size. That'd
make it a more complete solution, at the expense of bloat iov_iter which
this version will not.
Pavel Begunkov Aug. 21, 2021, 2:24 p.m. UTC | #2
On 8/12/21 9:40 PM, Pavel Begunkov wrote:
> For the bug description see 2/2. As mentioned there the current problems
> is because of generic_write_checks(), but there was also a similar case
> fixed in 5.12, which should have been triggerable by normal
> write(2)/read(2) and others.
> 
> It may be better to enforce reexpands as a long term solution, but for
> now this patchset is quickier and easier to backport.

We need to do something with this, hopefully soon.


> v2: don't fail it has been justly fully reverted
> 
> Pavel Begunkov (2):
>   iov_iter: mark truncated iters
>   io_uring: don't retry with truncated iter
> 
>  fs/io_uring.c       | 16 ++++++++++++++++
>  include/linux/uio.h |  5 ++++-
>  2 files changed, 20 insertions(+), 1 deletion(-)
>
Al Viro Aug. 21, 2021, 10:25 p.m. UTC | #3
On Sat, Aug 21, 2021 at 03:24:28PM +0100, Pavel Begunkov wrote:
> On 8/12/21 9:40 PM, Pavel Begunkov wrote:
> > For the bug description see 2/2. As mentioned there the current problems
> > is because of generic_write_checks(), but there was also a similar case
> > fixed in 5.12, which should have been triggerable by normal
> > write(2)/read(2) and others.
> > 
> > It may be better to enforce reexpands as a long term solution, but for
> > now this patchset is quickier and easier to backport.
> 
> We need to do something with this, hopefully soon.

I still don't like that approach ;-/  If anything, I would rather do
something like this, and to hell with one extra word on stack in
several functions; at least that way the semantics is easy to describe.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/io_uring.c b/fs/io_uring.c
index d94fb5835a20..5501f8b3af3b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3420,6 +3420,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	} else {
 copy_iov:
 		/* some cases will consume bytes even on error returns */
+		iov_iter_reexpand(iter, iter->count + iter->truncated);
 		iov_iter_revert(iter, io_size - iov_iter_count(iter));
 		ret = io_setup_async_rw(req, iovec, inline_vecs, iter, false);
 		return ret ?: -EAGAIN;
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 82c3c3e819e0..5265024e8b90 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -47,6 +47,7 @@ struct iov_iter {
 		};
 		loff_t xarray_start;
 	};
+	size_t truncated;
 };
 
 static inline enum iter_type iov_iter_type(const struct iov_iter *i)
@@ -254,8 +255,10 @@ static inline void iov_iter_truncate(struct iov_iter *i, u64 count)
 	 * conversion in assignement is by definition greater than all
 	 * values of size_t, including old i->count.
 	 */
-	if (i->count > count)
+	if (i->count > count) {
+		i->truncated += i->count - count;
 		i->count = count;
+	}
 }
 
 /*
@@ -264,6 +267,7 @@ static inline void iov_iter_truncate(struct iov_iter *i, u64 count)
  */
 static inline void iov_iter_reexpand(struct iov_iter *i, size_t count)
 {
+	i->truncated -= count - i->count;
 	i->count = count;
 }
Jens Axboe Aug. 21, 2021, 11:13 p.m. UTC | #4
On 8/21/21 4:25 PM, Al Viro wrote:
> On Sat, Aug 21, 2021 at 03:24:28PM +0100, Pavel Begunkov wrote:
>> On 8/12/21 9:40 PM, Pavel Begunkov wrote:
>>> For the bug description see 2/2. As mentioned there the current problems
>>> is because of generic_write_checks(), but there was also a similar case
>>> fixed in 5.12, which should have been triggerable by normal
>>> write(2)/read(2) and others.
>>>
>>> It may be better to enforce reexpands as a long term solution, but for
>>> now this patchset is quickier and easier to backport.
>>
>> We need to do something with this, hopefully soon.
> 
> I still don't like that approach ;-/  If anything, I would rather do
> something like this, and to hell with one extra word on stack in
> several functions; at least that way the semantics is easy to describe.

Pavel suggested this very approach initially as well when we discussed
it, and if you're fine with the extra size_t, it is by far the best way
to get this done and not have a wonky/fragile API.
David Laight Aug. 23, 2021, 12:28 p.m. UTC | #5
From: Jens Axboe
> Sent: 22 August 2021 00:14
> 
> On 8/21/21 4:25 PM, Al Viro wrote:
> > On Sat, Aug 21, 2021 at 03:24:28PM +0100, Pavel Begunkov wrote:
> >> On 8/12/21 9:40 PM, Pavel Begunkov wrote:
> >>> For the bug description see 2/2. As mentioned there the current problems
> >>> is because of generic_write_checks(), but there was also a similar case
> >>> fixed in 5.12, which should have been triggerable by normal
> >>> write(2)/read(2) and others.
> >>>
> >>> It may be better to enforce reexpands as a long term solution, but for
> >>> now this patchset is quickier and easier to backport.
> >>
> >> We need to do something with this, hopefully soon.
> >
> > I still don't like that approach ;-/  If anything, I would rather do
> > something like this, and to hell with one extra word on stack in
> > several functions; at least that way the semantics is easy to describe.
> 
> Pavel suggested this very approach initially as well when we discussed
> it, and if you're fine with the extra size_t, it is by far the best way
> to get this done and not have a wonky/fragile API.

All (well maybe almost all) the users of iov_iter have the
short iov[] cache and the pointer to the big iov[] to kfree()
allocated together with the iov_iter structure itself.
These are almost always on stack.

Putting the whole lot together in a single structure would
make the call sequences a lot less complex and wouldn't use
any more stack/data is almost all the cases.

It would also mean that the 'iter' code could always have a pointer
to the base of the original iov[] list.
The lack of which is probably makes the 'revert' code hard?

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)