[stable-6.1,1/1] io_uring: fix waiters missing wake ups

Message ID 760086647776a5aebfa77cfff728837d476a4fd8.1737718881.git.asml.silence@gmail.com
State New

Commit Message

Pavel Begunkov Jan. 24, 2025, 6:53 p.m. UTC
[ upstream commit 3181e22fb79910c7071e84a43af93ac89e8a7106 ]

There are reports of MariaDB hangs, which are caused by a missing
barrier in the waking code, resulting in waiters losing events.

The problem was introduced in a backport
3ab9326f93ec4 ("io_uring: wake up optimisations"),
and the change restores the barrier present in the original commit
3181e22fb7991 ("io_uring: wake up optimisations").

Reported-by: Xan Charbonnet <xan@charbonnet.com>
Fixes: 3ab9326f93ec4 ("io_uring: wake up optimisations")
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1093243#99
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 io_uring/io_uring.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Jens Axboe Jan. 24, 2025, 8:47 p.m. UTC | #1
On 1/24/25 11:53 AM, Pavel Begunkov wrote:
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 9b58ba4616d40..e5a8ee944ef59 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -592,8 +592,10 @@ static inline void __io_cq_unlock_post_flush(struct io_ring_ctx *ctx)
>  	io_commit_cqring(ctx);
>  	spin_unlock(&ctx->completion_lock);
>  	io_commit_cqring_flush(ctx);
> -	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))
> +	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN)) {
> +		smp_mb();
>  		__io_cqring_wake(ctx);
> +	}
>  }

We could probably just s/__io_cqring_wake/io_cqring_wake here to get
the same effect. Not that it really matters, it's just simpler.

Pavel Begunkov Jan. 24, 2025, 9:23 p.m. UTC | #2
On 1/24/25 20:47, Jens Axboe wrote:
> We could probably just s/__io_cqring_wake/io_cqring_wake here to get
> the same effect. Not that it really matters, it's just simpler.

Right, I noticed but am keeping it closer to the original
in case we'd need to port more in the future.

Jens Axboe Jan. 24, 2025, 9:23 p.m. UTC | #3
On 1/24/25 2:23 PM, Pavel Begunkov wrote:
> On 1/24/25 20:47, Jens Axboe wrote:
>> We could probably just s/__io_cqring_wake/io_cqring_wake here to get
>> the same effect. Not that it really matters, it's just simpler.
> 
> Right, I noticed but am keeping it closer to the original
> in case we'd need to port more in the future.

Yep that's fine, let's just go with this one as-is.

lizetao Jan. 25, 2025, 6:59 a.m. UTC | #4
Reviewed-by: Li Zetao <lizetao1@huawei.com>

--
Li Zetao

Greg KH Jan. 30, 2025, 8:51 a.m. UTC | #5
On Sat, Jan 25, 2025 at 06:59:06AM +0000, lizetao wrote:
> Reviewed-by: Li Zetao <lizetao1@huawei.com>

Now queued up, thanks.

greg k-h

Patch

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9b58ba4616d40..e5a8ee944ef59 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -592,8 +592,10 @@  static inline void __io_cq_unlock_post_flush(struct io_ring_ctx *ctx)
 	io_commit_cqring(ctx);
 	spin_unlock(&ctx->completion_lock);
 	io_commit_cqring_flush(ctx);
-	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN))
+	if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN)) {
+		smp_mb();
 		__io_cqring_wake(ctx);
+	}
 }
 
 void io_cq_unlock_post(struct io_ring_ctx *ctx)
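
For readers unfamiliar with why a missing barrier can lose a wake-up, the following is a minimal user-space sketch of the general "publish, full barrier, check for waiters" pattern, written with C11 atomics. It is not the kernel code: `struct ring_model`, `post_and_check_waiter()`, `register_and_check_events()` and `event_noticed()` are hypothetical names invented for illustration, and `atomic_thread_fence(memory_order_seq_cst)` merely stands in for the role `smp_mb()` plays in the patch. The assumption is that the waiter side issues its own full barrier between registering itself and re-checking for events, so the two fences pair up and at least one side always observes the other.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model, not the kernel's io_ring_ctx. */
struct ring_model {
	atomic_uint cq_tail;    /* models the published completion tail */
	atomic_bool has_waiter; /* models "the wait queue is non-empty" */
};

/* Completion side: publish the event, full barrier, then look for waiters.
 * Returns true if a waiter was seen and therefore must be woken. */
static bool post_and_check_waiter(struct ring_model *r)
{
	atomic_fetch_add_explicit(&r->cq_tail, 1, memory_order_relaxed);
	/* Stand-in for smp_mb(): orders the tail store before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&r->has_waiter, memory_order_relaxed);
}

/* Waiter side: register, full barrier, then re-check for new events.
 * Returns true if an event is already pending, so the waiter must not sleep. */
static bool register_and_check_events(struct ring_model *r, unsigned seen_tail)
{
	atomic_store_explicit(&r->has_waiter, true, memory_order_relaxed);
	/* Pairs with the fence on the completion side. */
	atomic_thread_fence(memory_order_seq_cst);
	return atomic_load_explicit(&r->cq_tail, memory_order_relaxed) != seen_tail;
}

/* Walk one of the two serial orderings and report whether the event was
 * noticed by at least one side (waker saw the waiter, or waiter saw the event). */
static bool event_noticed(bool waiter_first)
{
	struct ring_model r = { 0 };
	bool waker_saw_waiter, waiter_saw_event;

	if (waiter_first) {
		waiter_saw_event = register_and_check_events(&r, 0);
		waker_saw_waiter = post_and_check_waiter(&r);
	} else {
		waker_saw_waiter = post_and_check_waiter(&r);
		waiter_saw_event = register_and_check_events(&r, 0);
	}
	return waker_saw_waiter || waiter_saw_event;
}
```

`event_noticed()` only walks the two serial orderings, so it shows the protocol rather than reproducing the race. With both fences in place, a concurrent run cannot end with the waker seeing no waiter while the waiter sees no event. Dropping the fence in `post_and_check_waiter()` would allow the `has_waiter` load to be satisfied before the tail store becomes visible, which is the lost wake-up shape the commit message describes.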