
io_uring: Require zeroed sqe->len on provided-buffers send

Message ID 20240429181556.31828-1-krisman@suse.de (mailing list archive)
State New

Commit Message

Gabriel Krisman Bertazi April 29, 2024, 6:15 p.m. UTC
When sending from a provided buffer, we set sr->len to the smaller of
the actual buffer size and sqe->len.  But, now that we disconnect the
buffer from the submission request, we can get into a situation where
buffers and requests are mismatched, and only part of a buffer gets
sent.  Assume:

* buf[1]->len = 128; buf[2]->len = 256
* sqe[1]->len = 128; sqe[2]->len = 256

If sqe[1] runs first, it picks buf[1] and it's all good. But if sqe[2]
runs first, it consumes buf[1], sqe[1] then picks buf[2], and the last
half of buf[2] is never sent.

While the use case of different-length sends is arguably questionable,
it has already caused confusion among potential users of this
feature. Let's make the interface less tricky by forcing the length to
come only from the buffer ring entry itself.

Fixes: ac5f71a3d9d7 ("io_uring/net: add provided buffer support for IORING_OP_SEND")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
---
 io_uring/net.c | 2 ++
 1 file changed, 2 insertions(+)
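
As a usage sketch of the semantics this patch enforces (liburing-based,
not part of the patch; socket setup, submission, and error handling are
omitted, and the buffer group ID is illustrative): with
IOSQE_BUFFER_SELECT set, sqe->len stays zero and the kernel takes the
send length from the selected buffer ring entry.

#include <liburing.h>

static void queue_provided_buf_send(struct io_uring *ring, int sockfd,
				    unsigned short buf_group)
{
	struct io_uring_sqe *sqe = io_uring_get_sqe(ring);

	/* len == 0: the length comes from the buffer ring entry.
	 * With this patch, a non-zero len fails prep with -EINVAL.
	 */
	io_uring_prep_send(sqe, sockfd, NULL, 0, 0);
	sqe->flags |= IOSQE_BUFFER_SELECT;
	sqe->buf_group = buf_group;
}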

Comments

Jens Axboe April 30, 2024, 1:02 p.m. UTC | #1
On 4/29/24 12:15 PM, Gabriel Krisman Bertazi wrote:
> When sending from a provided buffer, we set sr->len to the smaller of
> the actual buffer size and sqe->len.  But, now that we disconnect the
> buffer from the submission request, we can get into a situation where
> buffers and requests are mismatched, and only part of a buffer gets
> sent.  Assume:
>
> * buf[1]->len = 128; buf[2]->len = 256
> * sqe[1]->len = 128; sqe[2]->len = 256
>
> If sqe[1] runs first, it picks buf[1] and it's all good. But if sqe[2]
> runs first, it consumes buf[1], sqe[1] then picks buf[2], and the last
> half of buf[2] is never sent.
> 
> While the use case of different-length sends is arguably questionable,
> it has already caused confusion among potential users of this
> feature. Let's make the interface less tricky by forcing the length to
> come only from the buffer ring entry itself.
> 
> Fixes: ac5f71a3d9d7 ("io_uring/net: add provided buffer support for IORING_OP_SEND")
> Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
> ---
>  io_uring/net.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/io_uring/net.c b/io_uring/net.c
> index 51c41d771c50..ffe37dd77a74 100644
> --- a/io_uring/net.c
> +++ b/io_uring/net.c
> @@ -423,6 +423,8 @@ int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  		sr->buf_group = req->buf_index;
>  		req->buf_list = NULL;
>  	}
> +	if (req->flags & REQ_F_BUFFER_SELECT && sr->len)
> +		return -EINVAL;
>  
>  #ifdef CONFIG_COMPAT
>  	if (req->ctx->compat)

Why not put it in io_send(), under io_do_buffer_select()? Then
you can get rid of the:

.max_len = min_not_zero(sr->len, INT_MAX),

and just do

.max_len = INT_MAX,

instead as well.
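
Concretely, that placement would amount to something like the sketch
below (context lines taken from io_send() as it appears in the v2 diff
further down; exactly where the check sits is an assumption about the
suggestion, not applied code):

	if (io_do_buffer_select(req)) {
		struct buf_sel_arg arg = {
			.iovs = &kmsg->fast_iov,
			.max_len = INT_MAX,
			.nr_iovs = 1,
			.mode = KBUF_MODE_EXPAND,
		};

		/* reject a non-zero sqe->len at issue time instead of prep */
		if (sr->len)
			return -EINVAL;
		...
	}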
Gabriel Krisman Bertazi May 1, 2024, 8:47 p.m. UTC | #2
Jens Axboe <axboe@kernel.dk> writes:

> On 4/29/24 12:15 PM, Gabriel Krisman Bertazi wrote:
>> When sending from a provided buffer, we set sr->len to the smaller of
>> the actual buffer size and sqe->len.  But, now that we disconnect the
>> buffer from the submission request, we can get into a situation where
>> buffers and requests are mismatched, and only part of a buffer gets
>> sent.  Assume:
>>
>> * buf[1]->len = 128; buf[2]->len = 256
>> * sqe[1]->len = 128; sqe[2]->len = 256
>>
>> If sqe[1] runs first, it picks buf[1] and it's all good. But if sqe[2]
>> runs first, it consumes buf[1], sqe[1] then picks buf[2], and the last
>> half of buf[2] is never sent.
>> 
>> While the use case of different-length sends is arguably questionable,
>> it has already caused confusion among potential users of this
>> feature. Let's make the interface less tricky by forcing the length to
>> come only from the buffer ring entry itself.
>> 
>> Fixes: ac5f71a3d9d7 ("io_uring/net: add provided buffer support for IORING_OP_SEND")
>> Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
>> ---
>>  io_uring/net.c | 2 ++
>>  1 file changed, 2 insertions(+)
>> 
>> diff --git a/io_uring/net.c b/io_uring/net.c
>> index 51c41d771c50..ffe37dd77a74 100644
>> --- a/io_uring/net.c
>> +++ b/io_uring/net.c
>> @@ -423,6 +423,8 @@ int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>  		sr->buf_group = req->buf_index;
>>  		req->buf_list = NULL;
>>  	}
>> +	if (req->flags & REQ_F_BUFFER_SELECT && sr->len)
>> +		return -EINVAL;
>>  
>>  #ifdef CONFIG_COMPAT
>>  	if (req->ctx->compat)
>
> Why not put it in io_send(), under io_do_buffer_select()? Then
> you can get rid of the:
>
> .max_len = min_not_zero(sr->len, INT_MAX),
>
> and just do
>
> .max_len = INT_MAX,
>

Mostly because I'd expect this kind of validation of userspace data to
be done early, in ->prep, while we are consuming the sqe.  But more
importantly, if I read the code correctly, doing it under
io_do_buffer_select() in io_send() is more convoluted: we have that
backward jump for the case where we don't send the full set of buffers
in a bundle, and we dirty sr->len with the actual returned buffer
length.

Since we already checked it in ->prep, we can safely ignore it in
io_do_buffer_select() anyway. What do you think of the below?

-- >8 --
Subject: [PATCH] io_uring: Require zeroed sqe->len on provided-buffers send

When sending from a provided buffer, we set sr->len to the smaller of
the actual buffer size and sqe->len.  But, now that we disconnect the
buffer from the submission request, we can get into a situation where
buffers and requests are mismatched, and only part of a buffer gets
sent.  Assume:

* buf[1]->len = 128; buf[2]->len = 256
* sqe[1]->len = 128; sqe[2]->len = 256

If sqe[1] runs first, it picks buf[1] and it's all good. But if sqe[2]
runs first, it consumes buf[1], sqe[1] then picks buf[2], and the last
half of buf[2] is never sent.

While the use case of different-length sends is arguably questionable,
it has already caused confusion among potential users of this
feature. Let's make the interface less tricky by forcing the length to
come only from the buffer ring entry itself.

Fixes: ac5f71a3d9d7 ("io_uring/net: add provided buffer support for IORING_OP_SEND")
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>

---
v2:
  - Disregard sr->len when selecting buffer in io_send()
---
 io_uring/net.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/io_uring/net.c b/io_uring/net.c
index 51c41d771c50..cf43053a25b7 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -423,6 +423,8 @@ int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		sr->buf_group = req->buf_index;
 		req->buf_list = NULL;
 	}
+	if (req->flags & REQ_F_BUFFER_SELECT && sr->len)
+		return -EINVAL;
 
 #ifdef CONFIG_COMPAT
 	if (req->ctx->compat)
@@ -586,7 +588,7 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
 	if (io_do_buffer_select(req)) {
 		struct buf_sel_arg arg = {
 			.iovs = &kmsg->fast_iov,
-			.max_len = min_not_zero(sr->len, INT_MAX),
+			.max_len = INT_MAX,
 			.nr_iovs = 1,
 			.mode = KBUF_MODE_EXPAND,
 		};
Jens Axboe May 1, 2024, 8:55 p.m. UTC | #3
On 5/1/24 2:47 PM, Gabriel Krisman Bertazi wrote:
> Jens Axboe <axboe@kernel.dk> writes:
> 
>> On 4/29/24 12:15 PM, Gabriel Krisman Bertazi wrote:
>>> When sending from a provided buffer, we set sr->len to the smaller of
>>> the actual buffer size and sqe->len.  But, now that we disconnect the
>>> buffer from the submission request, we can get into a situation where
>>> buffers and requests are mismatched, and only part of a buffer gets
>>> sent.  Assume:
>>>
>>> * buf[1]->len = 128; buf[2]->len = 256
>>> * sqe[1]->len = 128; sqe[2]->len = 256
>>>
>>> If sqe[1] runs first, it picks buf[1] and it's all good. But if sqe[2]
>>> runs first, it consumes buf[1], sqe[1] then picks buf[2], and the last
>>> half of buf[2] is never sent.
>>>
>>> While the use case of different-length sends is arguably questionable,
>>> it has already caused confusion among potential users of this
>>> feature. Let's make the interface less tricky by forcing the length to
>>> come only from the buffer ring entry itself.
>>>
>>> Fixes: ac5f71a3d9d7 ("io_uring/net: add provided buffer support for IORING_OP_SEND")
>>> Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
>>> ---
>>>  io_uring/net.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/io_uring/net.c b/io_uring/net.c
>>> index 51c41d771c50..ffe37dd77a74 100644
>>> --- a/io_uring/net.c
>>> +++ b/io_uring/net.c
>>> @@ -423,6 +423,8 @@ int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>>  		sr->buf_group = req->buf_index;
>>>  		req->buf_list = NULL;
>>>  	}
>>> +	if (req->flags & REQ_F_BUFFER_SELECT && sr->len)
>>> +		return -EINVAL;
>>>  
>>>  #ifdef CONFIG_COMPAT
>>>  	if (req->ctx->compat)
>>
>> Why not put it in io_send(), under io_do_buffer_select()? Then
>> you can get rid of the:
>>
>> .max_len = min_not_zero(sr->len, INT_MAX),
>>
>> and just do
>>
>> .max_len = INT_MAX,
>>
> 
> Mostly because I'd expect this kind of validation of userspace data to
> be done early, in ->prep, while we are consuming the sqe.  But more
> importantly, if I read the code correctly, doing it under
> io_do_buffer_select() in io_send() is more convoluted: we have that
> backward jump for the case where we don't send the full set of buffers
> in a bundle, and we dirty sr->len with the actual returned buffer
> length.
> 
> Since we already checked it in ->prep, we can safely ignore it in
> io_do_buffer_select() anyway. What do you think of the below?

Yep, I think that looks very reasonable. I'll queue it up, thanks!
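
For completeness, a sketch of the buffer ring setup such a send selects
from (liburing helpers; NR_BUFS and BUF_SIZE are illustrative and error
handling is trimmed). After this change, each entry's len is the
authoritative send length:

#include <liburing.h>

#define GROUP_ID 0
#define NR_BUFS  8
#define BUF_SIZE 4096

static struct io_uring_buf_ring *setup_send_bufs(struct io_uring *ring,
						 char bufs[NR_BUFS][BUF_SIZE])
{
	struct io_uring_buf_ring *br;
	int ret, i;

	/* allocate and register a ring of NR_BUFS provided buffers */
	br = io_uring_setup_buf_ring(ring, NR_BUFS, GROUP_ID, 0, &ret);
	if (!br)
		return NULL;

	for (i = 0; i < NR_BUFS; i++)
		io_uring_buf_ring_add(br, bufs[i], BUF_SIZE, i,
				      io_uring_buf_ring_mask(NR_BUFS), i);
	/* publish all entries; the kernel consumes them in ring order */
	io_uring_buf_ring_advance(br, NR_BUFS);
	return br;
}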

Patch

diff --git a/io_uring/net.c b/io_uring/net.c
index 51c41d771c50..ffe37dd77a74 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -423,6 +423,8 @@ int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		sr->buf_group = req->buf_index;
 		req->buf_list = NULL;
 	}
+	if (req->flags & REQ_F_BUFFER_SELECT && sr->len)
+		return -EINVAL;
 
 #ifdef CONFIG_COMPAT
 	if (req->ctx->compat)