[1/1] io_uring: optimise locking for local tw with submit_wait

Message ID 281fc79d98b5d91fe4778c5137a17a2ab4693e5c.1665088876.git.asml.silence@gmail.com (mailing list archive)
State New
Series [1/1] io_uring: optimise locking for local tw with submit_wait

Commit Message

Pavel Begunkov Oct. 6, 2022, 8:42 p.m. UTC
Running local task_work requires taking uring_lock. For submit + wait we
can try to run it right after submit, while we still hold the lock, and
save one lock/unlock pair. The optimisation was implemented in the first
local tw patches but got dropped for simplicity.

Suggested-by: Dylan Yudaken <dylany@fb.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/io_uring.c | 12 ++++++++++--
 io_uring/io_uring.h |  7 +++++++
 2 files changed, 17 insertions(+), 2 deletions(-)
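
The path being optimised here is the single io_uring_enter() call that
both submits and waits. A minimal userspace sketch that exercises it
(uses liburing, assumes a kernel and liburing recent enough for
IORING_SETUP_DEFER_TASKRUN; not part of this series):

#include <liburing.h>
#include <stdio.h>

int main(void)
{
	struct io_uring_params p = { 0 };
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;

	/* DEFER_TASKRUN requires a single submitting task */
	p.flags = IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
	if (io_uring_queue_init_params(8, &ring, &p) < 0)
		return 1;

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_nop(sqe);

	/*
	 * One enter with to_submit > 0 and IORING_ENTER_GETEVENTS: with
	 * this patch the kernel runs deferred local task_work right after
	 * submission, while it still holds uring_lock.
	 */
	io_uring_submit_and_wait(&ring, 1);

	if (io_uring_peek_cqe(&ring, &cqe) == 0) {
		printf("nop completed, res=%d\n", cqe->res);
		io_uring_cqe_seen(&ring, cqe);
	}

	io_uring_queue_exit(&ring);
	return 0;
}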

Comments

Jens Axboe Oct. 6, 2022, 8:59 p.m. UTC | #1
On 10/6/22 2:42 PM, Pavel Begunkov wrote:
> Running local task_work requires taking uring_lock. For submit + wait we
> can try to run it right after submit, while we still hold the lock, and
> save one lock/unlock pair. The optimisation was implemented in the first
> local tw patches but got dropped for simplicity.
> 
> Suggested-by: Dylan Yudaken <dylany@fb.com>
> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
> ---
>  io_uring/io_uring.c | 12 ++++++++++--
>  io_uring/io_uring.h |  7 +++++++
>  2 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 355fc1f3083d..b092473eca1d 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -3224,8 +3224,16 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
>  			mutex_unlock(&ctx->uring_lock);
>  			goto out;
>  		}
> -		if ((flags & IORING_ENTER_GETEVENTS) && ctx->syscall_iopoll)
> -			goto iopoll_locked;
> +		if (flags & IORING_ENTER_GETEVENTS) {
> +			if (ctx->syscall_iopoll)
> +				goto iopoll_locked;
> +			/*
> +			 * Ignore errors, we'll soon call io_cqring_wait() and
> +			 * it should handle ownership problems if any.
> +			 */
> +			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
> +				(void)io_run_local_work_locked(ctx);
> +		}
>  		mutex_unlock(&ctx->uring_lock);
>  	}
>  
> diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
> index e733d31f31d2..8504bc1f3839 100644
> --- a/io_uring/io_uring.h
> +++ b/io_uring/io_uring.h
> @@ -275,6 +275,13 @@ static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx)
>  	return ret;
>  }
>  
> +static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
> +{
> +	if (llist_empty(&ctx->work_llist))
> +		return 0;
> +	return __io_run_local_work(ctx, true);
> +}

Do you have pending patches that also use this? If not, maybe we
should just keep it in io_uring.c? If you do, then this looks fine
to me rather than needing to shuffle it later.
Pavel Begunkov Oct. 6, 2022, 9:09 p.m. UTC | #2
On 10/6/22 21:59, Jens Axboe wrote:
> On 10/6/22 2:42 PM, Pavel Begunkov wrote:
>> Running local task_work requires taking uring_lock. For submit + wait we
>> can try to run it right after submit, while we still hold the lock, and
>> save one lock/unlock pair. The optimisation was implemented in the first
>> local tw patches but got dropped for simplicity.
>>
>> Suggested-by: Dylan Yudaken <dylany@fb.com>
>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>> ---
>>   io_uring/io_uring.c | 12 ++++++++++--
>>   io_uring/io_uring.h |  7 +++++++
>>   2 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>> index 355fc1f3083d..b092473eca1d 100644
>> --- a/io_uring/io_uring.c
>> +++ b/io_uring/io_uring.c
>> @@ -3224,8 +3224,16 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
>>   			mutex_unlock(&ctx->uring_lock);
>>   			goto out;
>>   		}
>> -		if ((flags & IORING_ENTER_GETEVENTS) && ctx->syscall_iopoll)
>> -			goto iopoll_locked;
>> +		if (flags & IORING_ENTER_GETEVENTS) {
>> +			if (ctx->syscall_iopoll)
>> +				goto iopoll_locked;
>> +			/*
>> +			 * Ignore errors, we'll soon call io_cqring_wait() and
>> +			 * it should handle ownership problems if any.
>> +			 */
>> +			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
>> +				(void)io_run_local_work_locked(ctx);
>> +		}
>>   		mutex_unlock(&ctx->uring_lock);
>>   	}
>>   
>> diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
>> index e733d31f31d2..8504bc1f3839 100644
>> --- a/io_uring/io_uring.h
>> +++ b/io_uring/io_uring.h
>> @@ -275,6 +275,13 @@ static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx)
>>   	return ret;
>>   }
>>   
>> +static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
>> +{
>> +	if (llist_empty(&ctx->work_llist))
>> +		return 0;
>> +	return __io_run_local_work(ctx, true);
>> +}
> 
> Do you have pending patches that also use this? If not, maybe we
> should just keep it in io_uring.c? If you do, then this looks fine
> to me rather than needing to shuffle it later.

No, I don't. I'd argue it's better as a helper because at least it
hides the always-confusing bool argument, and we'd also need to replace
a similar one in io_iopoll_check(). And we can stick __must_hold() there
for even more clarity. But ultimately I don't care much.
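
For reference, the __must_hold() form mentioned above would be a
one-line addition to the helper (a sketch of the suggestion, not what
was applied; __must_hold is the sparse lock annotation):

static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
	__must_hold(&ctx->uring_lock)
{
	/* caller holds uring_lock, so it's safe to run local tw inline */
	if (llist_empty(&ctx->work_llist))
		return 0;
	return __io_run_local_work(ctx, true);
}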
Jens Axboe Oct. 6, 2022, 9:11 p.m. UTC | #3
On 10/6/22 3:09 PM, Pavel Begunkov wrote:
> On 10/6/22 21:59, Jens Axboe wrote:
>> On 10/6/22 2:42 PM, Pavel Begunkov wrote:
>>> Running local task_work requires taking uring_lock. For submit + wait we
>>> can try to run it right after submit, while we still hold the lock, and
>>> save one lock/unlock pair. The optimisation was implemented in the first
>>> local tw patches but got dropped for simplicity.
>>>
>>> Suggested-by: Dylan Yudaken <dylany@fb.com>
>>> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
>>> ---
>>>   io_uring/io_uring.c | 12 ++++++++++--
>>>   io_uring/io_uring.h |  7 +++++++
>>>   2 files changed, 17 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
>>> index 355fc1f3083d..b092473eca1d 100644
>>> --- a/io_uring/io_uring.c
>>> +++ b/io_uring/io_uring.c
>>> @@ -3224,8 +3224,16 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
>>>               mutex_unlock(&ctx->uring_lock);
>>>               goto out;
>>>           }
>>> -        if ((flags & IORING_ENTER_GETEVENTS) && ctx->syscall_iopoll)
>>> -            goto iopoll_locked;
>>> +        if (flags & IORING_ENTER_GETEVENTS) {
>>> +            if (ctx->syscall_iopoll)
>>> +                goto iopoll_locked;
>>> +            /*
>>> +             * Ignore errors, we'll soon call io_cqring_wait() and
>>> +             * it should handle ownership problems if any.
>>> +             */
>>> +            if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
>>> +                (void)io_run_local_work_locked(ctx);
>>> +        }
>>>           mutex_unlock(&ctx->uring_lock);
>>>       }
>>>
>>> diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
>>> index e733d31f31d2..8504bc1f3839 100644
>>> --- a/io_uring/io_uring.h
>>> +++ b/io_uring/io_uring.h
>>> @@ -275,6 +275,13 @@ static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx)
>>>       return ret;
>>>   }
>>>
>>> +static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
>>> +{
>>> +    if (llist_empty(&ctx->work_llist))
>>> +        return 0;
>>> +    return __io_run_local_work(ctx, true);
>>> +}
>>
>> Do you have pending patches that also use this? If not, maybe we
>> should just keep it in io_uring.c? If you do, then this looks fine
>> to me rather than needing to shuffle it later.
> 
> No, I don't. I'd argue it's better as a helper because at least it
> hides the always-confusing bool argument, and we'd also need to replace
> a similar one in io_iopoll_check(). And we can stick __must_hold() there
> for even more clarity. But ultimately I don't care much.

I really don't feel that strongly about it either; let's just keep
it the way it is.
Jens Axboe Oct. 6, 2022, 9:14 p.m. UTC | #4
On Thu, 6 Oct 2022 21:42:33 +0100, Pavel Begunkov wrote:
> Running local task_work requires taking uring_lock. For submit + wait we
> can try to run it right after submit, while we still hold the lock, and
> save one lock/unlock pair. The optimisation was implemented in the first
> local tw patches but got dropped for simplicity.
> 
> 

Applied, thanks!

[1/1] io_uring: optimise locking for local tw with submit_wait
      commit: a2b61c4d8fcb005007bae5b2f007d43cba89baa1

Best regards,
Jens Axboe

Patch

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 355fc1f3083d..b092473eca1d 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3224,8 +3224,16 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 			mutex_unlock(&ctx->uring_lock);
 			goto out;
 		}
-		if ((flags & IORING_ENTER_GETEVENTS) && ctx->syscall_iopoll)
-			goto iopoll_locked;
+		if (flags & IORING_ENTER_GETEVENTS) {
+			if (ctx->syscall_iopoll)
+				goto iopoll_locked;
+			/*
+			 * Ignore errors, we'll soon call io_cqring_wait() and
+			 * it should handle ownership problems if any.
+			 */
+			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+				(void)io_run_local_work_locked(ctx);
+		}
 		mutex_unlock(&ctx->uring_lock);
 	}
 
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index e733d31f31d2..8504bc1f3839 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -275,6 +275,13 @@ static inline int io_run_task_work_ctx(struct io_ring_ctx *ctx)
 	return ret;
 }
 
+static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
+{
+	if (llist_empty(&ctx->work_llist))
+		return 0;
+	return __io_run_local_work(ctx, true);
+}
+
 static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
 {
 	if (!*locked) {