blk-throttle: fix zero wait time for iops throttled group

Message ID 156259979778.2486.6296077059654653057.stgit@buzz (mailing list archive)
State New, archived

Commit Message

Konstantin Khlebnikov July 8, 2019, 3:29 p.m. UTC
After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops
limit is enforced") wait time could be zero even if group is throttled and
cannot issue requests right now. As a result throtl_select_dispatch() turns
into busy-loop under irq-safe queue spinlock.

Fix is simple: always round up target time to the next throttle slice.

Fixes: 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops limit is enforced")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: stable@vger.kernel.org # v4.19+
---
 block/blk-throttle.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

Comments

Liu Bo July 8, 2019, 7:08 p.m. UTC | #1
On Mon, Jul 08, 2019 at 06:29:57PM +0300, Konstantin Khlebnikov wrote:
> After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops
> limit is enforced") wait time could be zero even if group is throttled and
> cannot issue requests right now. As a result throtl_select_dispatch() turns
> into busy-loop under irq-safe queue spinlock.
> 
> Fix is simple: always round up target time to the next throttle slice.
> 
> Fixes: 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops limit is enforced")
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: stable@vger.kernel.org # v4.19+
> ---
>  block/blk-throttle.c |    9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 9ea7c0ecad10..8ab6c8153223 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -881,13 +881,10 @@ static bool tg_with_in_iops_limit(struct throtl_grp *tg, struct bio *bio,
>  	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
>  	u64 tmp;
>  
> -	jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
> -
> -	/* Slice has just started. Consider one slice interval */
> -	if (!jiffy_elapsed)
> -		jiffy_elapsed_rnd = tg->td->throtl_slice;
> +	jiffy_elapsed = jiffies - tg->slice_start[rw];
>  
> -	jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
> +	/* Round up to the next throttle slice, wait time must be nonzero */
> +	jiffy_elapsed_rnd = roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
>  
>  	/*
>  	 * jiffy_elapsed_rnd should not be a big value as minimum iops can be

Did you use a tiny iops limit to run into this?

thanks,
-liubo
Konstantin Khlebnikov July 9, 2019, 7:18 a.m. UTC | #2
On 08.07.2019 22:08, Liu Bo wrote:
> On Mon, Jul 08, 2019 at 06:29:57PM +0300, Konstantin Khlebnikov wrote:
>> After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops
>> limit is enforced") wait time could be zero even if group is throttled and
>> cannot issue requests right now. As a result throtl_select_dispatch() turns
>> into busy-loop under irq-safe queue spinlock.
>>
>> Fix is simple: always round up target time to the next throttle slice.
>>
>> Fixes: 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops limit is enforced")
>> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
>> Cc: stable@vger.kernel.org # v4.19+
>> ---
>>   block/blk-throttle.c |    9 +++------
>>   1 file changed, 3 insertions(+), 6 deletions(-)
>>
>> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
>> index 9ea7c0ecad10..8ab6c8153223 100644
>> --- a/block/blk-throttle.c
>> +++ b/block/blk-throttle.c
>> @@ -881,13 +881,10 @@ static bool tg_with_in_iops_limit(struct throtl_grp *tg, struct bio *bio,
>>   	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
>>   	u64 tmp;
>>   
>> -	jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
>> -
>> -	/* Slice has just started. Consider one slice interval */
>> -	if (!jiffy_elapsed)
>> -		jiffy_elapsed_rnd = tg->td->throtl_slice;
>> +	jiffy_elapsed = jiffies - tg->slice_start[rw];
>>   
>> -	jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
>> +	/* Round up to the next throttle slice, wait time must be nonzero */
>> +	jiffy_elapsed_rnd = roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
>>   
>>   	/*
>>   	 * jiffy_elapsed_rnd should not be a big value as minimum iops can be
> 
> Did you use a tiny iops limit to run into this?

Yep, 25 iops.

Also, the kernel was built with HZ=250, which might be related.

> 
> thanks,
> -liubo
>
Konstantin Khlebnikov July 10, 2019, 10:42 a.m. UTC | #3
On 08.07.2019 18:29, Konstantin Khlebnikov wrote:
> After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops
> limit is enforced") wait time could be zero even if group is throttled and
> cannot issue requests right now. As a result throtl_select_dispatch() turns
> into busy-loop under irq-safe queue spinlock.

To be clear: this almost instantly kills the entire machine; the other CPUs get stuck sending IPIs.

> 
> Fix is simple: always round up target time to the next throttle slice.
> 
> Fixes: 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops limit is enforced")
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: stable@vger.kernel.org # v4.19+
> ---
>   block/blk-throttle.c |    9 +++------
>   1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/block/blk-throttle.c b/block/blk-throttle.c
> index 9ea7c0ecad10..8ab6c8153223 100644
> --- a/block/blk-throttle.c
> +++ b/block/blk-throttle.c
> @@ -881,13 +881,10 @@ static bool tg_with_in_iops_limit(struct throtl_grp *tg, struct bio *bio,
>   	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
>   	u64 tmp;
>   
> -	jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
> -
> -	/* Slice has just started. Consider one slice interval */
> -	if (!jiffy_elapsed)
> -		jiffy_elapsed_rnd = tg->td->throtl_slice;
> +	jiffy_elapsed = jiffies - tg->slice_start[rw];
>   
> -	jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
> +	/* Round up to the next throttle slice, wait time must be nonzero */
> +	jiffy_elapsed_rnd = roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
>   
>   	/*
>   	 * jiffy_elapsed_rnd should not be a big value as minimum iops can be
>
Jens Axboe July 10, 2019, 2 p.m. UTC | #4
On 7/8/19 9:29 AM, Konstantin Khlebnikov wrote:
> After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops
> limit is enforced") wait time could be zero even if group is throttled and
> cannot issue requests right now. As a result throtl_select_dispatch() turns
> into busy-loop under irq-safe queue spinlock.
> 
> Fix is simple: always round up target time to the next throttle slice.

Applied, thanks. In the future, please break lines at 72 chars in
commit messages, I fixed it up.
Konstantin Khlebnikov July 10, 2019, 2:24 p.m. UTC | #5
On 10.07.2019 17:00, Jens Axboe wrote:
> On 7/8/19 9:29 AM, Konstantin Khlebnikov wrote:
>> After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops
>> limit is enforced") wait time could be zero even if group is throttled and
>> cannot issue requests right now. As a result throtl_select_dispatch() turns
>> into busy-loop under irq-safe queue spinlock.
>>
>> Fix is simple: always round up target time to the next throttle slice.
> 
> Applied, thanks. In the future, please break lines at 72 chars in
> commit messages, I fixed it up.
> 

Ok, but Documentation/process/submitting-patches.rst and
scripts/checkpatch.pl recommend 75 chars per line.
Jens Axboe July 10, 2019, 2:25 p.m. UTC | #6
On 7/10/19 8:24 AM, Konstantin Khlebnikov wrote:
> On 10.07.2019 17:00, Jens Axboe wrote:
>> On 7/8/19 9:29 AM, Konstantin Khlebnikov wrote:
>>> After commit 991f61fe7e1d ("Blk-throttle: reduce tail io latency when iops
>>> limit is enforced") wait time could be zero even if group is throttled and
>>> cannot issue requests right now. As a result throtl_select_dispatch() turns
>>> into busy-loop under irq-safe queue spinlock.
>>>
>>> Fix is simple: always round up target time to the next throttle slice.
>>
>> Applied, thanks. In the future, please break lines at 72 chars in
>> commit messages, I fixed it up.
>>
> 
> Ok, but Documentation/process/submitting-patches.rst and
> scripts/checkpatch.pl recommend 75 chars per line.

Huh, oh well. Not a big deal for me, line breaking is easily automated.

Patch

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 9ea7c0ecad10..8ab6c8153223 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -881,13 +881,10 @@  static bool tg_with_in_iops_limit(struct throtl_grp *tg, struct bio *bio,
 	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
 	u64 tmp;
 
-	jiffy_elapsed = jiffy_elapsed_rnd = jiffies - tg->slice_start[rw];
-
-	/* Slice has just started. Consider one slice interval */
-	if (!jiffy_elapsed)
-		jiffy_elapsed_rnd = tg->td->throtl_slice;
+	jiffy_elapsed = jiffies - tg->slice_start[rw];
 
-	jiffy_elapsed_rnd = roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
+	/* Round up to the next throttle slice, wait time must be nonzero */
+	jiffy_elapsed_rnd = roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
 
 	/*
 	 * jiffy_elapsed_rnd should not be a big value as minimum iops can be