[V6,5/8] io_uring: support sqe group with members depending on leader

Message ID 20240912104933.1875409-6-ming.lei@redhat.com (mailing list archive)
State New
Series io_uring: support sqe group and provide group kbuf

Commit Message

Ming Lei Sept. 12, 2024, 10:49 a.m. UTC
IOSQE_SQE_GROUP currently starts queueing members only after the leader has
completed. This is done purely to simplify the implementation; the behavior
is not part of the UAPI and may be relaxed in the future so that members can
be queued concurrently with the leader.

However, some resources, such as a kernel buffer, can't cross OPs; otherwise
the buffer may easily be leaked if any OP fails or the application
panics.

Add flag REQ_F_SQE_GROUP_DEP to allow members to depend on the group leader
explicitly, so that group members won't be queued until the leader request
has completed and the kernel resource lifetime can be aligned with the group
leader or the group. One typical use case is supporting zero copy for device
internal buffers.

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 include/linux/io_uring_types.h | 3 +++
 1 file changed, 3 insertions(+)

Comments

Pavel Begunkov Oct. 4, 2024, 1:18 p.m. UTC | #1
On 9/12/24 11:49, Ming Lei wrote:
> IOSQE_SQE_GROUP just starts to queue members after the leader is completed,
> which way is just for simplifying implementation, and this behavior is never
> part of UAPI, and it may be relaxed and members can be queued concurrently
> with leader in future.
> 
> However, some resource can't cross OPs, such as kernel buffer, otherwise
> the buffer may be leaked easily in case that any OP failure or application
> panic.
> 
> Add flag REQ_F_SQE_GROUP_DEP for allowing members to depend on group leader
> explicitly, so that group members won't be queued until the leader request is
> completed, the kernel resource lifetime can be aligned with group leader

That's the current and only behaviour, we don't need an extra flag
for that. We can add it back later when anything changes.

> or group, one typical use case is to support zero copy for device internal
> buffer.
> 
> Signed-off-by: Ming Lei <ming.lei@redhat.com>
> ---
>   include/linux/io_uring_types.h | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
> index 11c6726abbb9..793d5a26d9b8 100644
> --- a/include/linux/io_uring_types.h
> +++ b/include/linux/io_uring_types.h
> @@ -472,6 +472,7 @@ enum {
>   	REQ_F_BL_NO_RECYCLE_BIT,
>   	REQ_F_BUFFERS_COMMIT_BIT,
>   	REQ_F_SQE_GROUP_LEADER_BIT,
> +	REQ_F_SQE_GROUP_DEP_BIT,
>   
>   	/* not a real bit, just to check we're not overflowing the space */
>   	__REQ_F_LAST_BIT,
> @@ -554,6 +555,8 @@ enum {
>   	REQ_F_BUFFERS_COMMIT	= IO_REQ_FLAG(REQ_F_BUFFERS_COMMIT_BIT),
>   	/* sqe group lead */
>   	REQ_F_SQE_GROUP_LEADER	= IO_REQ_FLAG(REQ_F_SQE_GROUP_LEADER_BIT),
> +	/* sqe group with members depending on leader */
> +	REQ_F_SQE_GROUP_DEP	= IO_REQ_FLAG(REQ_F_SQE_GROUP_DEP_BIT),
>   };
>   
>   typedef void (*io_req_tw_func_t)(struct io_kiocb *req, struct io_tw_state *ts);
Ming Lei Oct. 6, 2024, 3:54 a.m. UTC | #2
On Fri, Oct 04, 2024 at 02:18:13PM +0100, Pavel Begunkov wrote:
> On 9/12/24 11:49, Ming Lei wrote:
> > IOSQE_SQE_GROUP just starts to queue members after the leader is completed,
> > which way is just for simplifying implementation, and this behavior is never
> > part of UAPI, and it may be relaxed and members can be queued concurrently
> > with leader in future.
> > 
> > However, some resource can't cross OPs, such as kernel buffer, otherwise
> > the buffer may be leaked easily in case that any OP failure or application
> > panic.
> > 
> > Add flag REQ_F_SQE_GROUP_DEP for allowing members to depend on group leader
> > explicitly, so that group members won't be queued until the leader request is
> > completed, the kernel resource lifetime can be aligned with group leader
> 
> That's the current and only behaviour, we don't need an extra flag
> for that. We can add it back later when anything changes.

OK.

Thanks, 
Ming

Patch

diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 11c6726abbb9..793d5a26d9b8 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -472,6 +472,7 @@  enum {
 	REQ_F_BL_NO_RECYCLE_BIT,
 	REQ_F_BUFFERS_COMMIT_BIT,
 	REQ_F_SQE_GROUP_LEADER_BIT,
+	REQ_F_SQE_GROUP_DEP_BIT,
 
 	/* not a real bit, just to check we're not overflowing the space */
 	__REQ_F_LAST_BIT,
@@ -554,6 +555,8 @@  enum {
 	REQ_F_BUFFERS_COMMIT	= IO_REQ_FLAG(REQ_F_BUFFERS_COMMIT_BIT),
 	/* sqe group lead */
 	REQ_F_SQE_GROUP_LEADER	= IO_REQ_FLAG(REQ_F_SQE_GROUP_LEADER_BIT),
+	/* sqe group with members depending on leader */
+	REQ_F_SQE_GROUP_DEP	= IO_REQ_FLAG(REQ_F_SQE_GROUP_DEP_BIT),
 };
 
 typedef void (*io_req_tw_func_t)(struct io_kiocb *req, struct io_tw_state *ts);
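
For illustration only, the bit/mask pattern used in the patch and the gating described in the commit message can be sketched in userspace C. Everything below is an assumption rather than kernel code: IO_REQ_FLAG() is a simplified stand-in for the kernel macro, and struct demo_req, its completed field, DEMO_REQ_F_LAST_BIT and demo_members_may_queue() are hypothetical names. The branch without the DEP flag models the possible future relaxation mentioned in the commit message; as noted in the review, members currently always wait for the leader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace stand-in for the kernel's IO_REQ_FLAG() macro. */
#define IO_REQ_FLAG(bitno) (1ULL << (bitno))

/* Bit indices. Only the two SQE group names appear in the patch. */
enum {
	REQ_F_SQE_GROUP_LEADER_BIT,
	REQ_F_SQE_GROUP_DEP_BIT,

	/* hypothetical sentinel: not a real bit, only used for the size check */
	DEMO_REQ_F_LAST_BIT,
};

/* Masks derived from the bit indices, as the patch does for the new flag. */
enum {
	/* sqe group lead */
	REQ_F_SQE_GROUP_LEADER	= IO_REQ_FLAG(REQ_F_SQE_GROUP_LEADER_BIT),
	/* sqe group with members depending on leader */
	REQ_F_SQE_GROUP_DEP	= IO_REQ_FLAG(REQ_F_SQE_GROUP_DEP_BIT),
};

/* Compile-time check that all bits fit into a 64-bit flags word. */
static_assert(DEMO_REQ_F_LAST_BIT <= 64, "request flags overflow 64 bits");

/* Hypothetical, heavily reduced request: a flags word and a completion mark. */
struct demo_req {
	uint64_t flags;
	bool completed;
};

/*
 * Assumed semantics, for illustration: with REQ_F_SQE_GROUP_DEP set on the
 * leader, members may be queued only once the leader has completed. Without
 * it, a relaxed implementation could queue members concurrently.
 */
static inline bool demo_members_may_queue(const struct demo_req *leader)
{
	if (leader->flags & REQ_F_SQE_GROUP_DEP)
		return leader->completed;
	return true;
}
```

A caller would set REQ_F_SQE_GROUP_LEADER | REQ_F_SQE_GROUP_DEP on the leader when the group carries a resource, such as a kernel buffer, whose lifetime must not outlive the leader, and would consult demo_members_may_queue() before issuing any member.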