[RFC] blk-mq: fixup RESTART when queue becomes idle

Message ID	1516375212.3190.4.camel@wdc.com (mailing list archive)
State	New, archived

Headers

From: Bart Van Assche <Bart.VanAssche@wdc.com>
To: "ming.lei@redhat.com" <ming.lei@redhat.com>,
	"axboe@kernel.dk" <axboe@kernel.dk>
CC: "dm-devel@redhat.com" <dm-devel@redhat.com>,
	"hch@infradead.org" <hch@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	"osandov@fb.com" <osandov@fb.com>,
	"snitzer@redhat.com" <snitzer@redhat.com>
Subject: Re: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
Thread-Topic: [RFC PATCH] blk-mq: fixup RESTART when queue becomes idle
Thread-Index: AQHTkAXlp7DN5JBZKUeyLfttG5XpOaN52OmAgAADroCAAATDAIAAE3uAgAAE1gCAABc0gIAAaoKAgAAZS4CAADjnAIAAhGAA
Date: Fri, 19 Jan 2018 15:20:13 +0000
Message-ID: <1516375212.3190.4.camel@wdc.com>
References: <20180118024124.8079-1-ming.lei@redhat.com>
	<b2e5b7e6-ce4b-6053-adae-63cc44d773af@wdc.com>
	<20180118170353.GB19734@redhat.com>
	<1516296056.2676.23.camel@wdc.com>
	<20180118183039.GA20121@redhat.com>
	<1516301278.2676.35.camel@wdc.com>
	<deeb2b2e-6d0e-a144-843d-d08626de8aea@kernel.dk>
	<20180119023212.GA25413@ming.t460p>
	<eba1191e-4d59-a763-03b2-acd2d2812ea3@kernel.dk>
	<20180119072623.GB25369@ming.t460p>
In-Reply-To: <20180119072623.GB25369@ming.t460p>
Accept-Language: en-US
Content-Language: en-US
wdcipoutbound: EOP-TRUE
spamdiagnosticoutput: 1:99
spamdiagnosticmetadata: NSPM
Content-Type: text/plain; charset="utf-8"
Content-ID: <EC07D5ADB2C027459C9C88933F8C59E3@namprd04.prod.outlook.com>
Content-Transfer-Encoding: base64
MIME-Version: 1.0
X-MS-Exchange-CrossTenant-Network-Message-Id: 6cf4dc82-3b11-4e3d-b6f9-08d55f502377
X-MS-Exchange-CrossTenant-originalarrivaltime: 19 Jan 2018 15:20:13.8613
	(UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: b61c8803-16f3-4c35-9b17-6f65f441df86
X-MS-Exchange-Transport-CrossTenantHeadersStamped: CY1PR0401MB1114
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
List-ID: <linux-block.vger.kernel.org>
X-Mailing-List: linux-block@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Commit Message

Bart Van Assche Jan. 19, 2018, 3:20 p.m. UTC

On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
> Please see queue_delayed_work_on(), hctx->run_work is shared by all
> scheduling, once blk_mq_delay_run_hw_queue(100ms) returns, no new
> scheduling can make progress during the 100ms.

How about addressing that as follows:


Bart.

Comments

Jens Axboe Jan. 19, 2018, 3:25 p.m. UTC | #1

On 1/19/18 8:20 AM, Bart Van Assche wrote:
> On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
>> Please see queue_delayed_work_on(), hctx->run_work is shared by all
>> scheduling, once blk_mq_delay_run_hw_queue(100ms) returns, no new
>> scheduling can make progress during the 100ms.
> 
> How about addressing that as follows:
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f7515dd95a36..57f8379a476d 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1403,9 +1403,9 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
>  		put_cpu();
>  	}
>  
> -	kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> -					 &hctx->run_work,
> -					 msecs_to_jiffies(msecs));
> +	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> +				    &hctx->run_work,
> +				    msecs_to_jiffies(msecs));
>  }

Exactly. That's why I said in my previous email that it was just a bug;
not honoring a newer run is just stupid. The only other thing you have to
be careful with here is the STOPPED bit.

Ming Lei Jan. 19, 2018, 3:33 p.m. UTC | #2

On Fri, Jan 19, 2018 at 03:20:13PM +0000, Bart Van Assche wrote:
> On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
> > Please see queue_delayed_work_on(), hctx->run_work is shared by all
> > scheduling, once blk_mq_delay_run_hw_queue(100ms) returns, no new
> > scheduling can make progress during the 100ms.
> 
> How about addressing that as follows:
> 
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index f7515dd95a36..57f8379a476d 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1403,9 +1403,9 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
>  		put_cpu();
>  	}
>  
> -	kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> -					 &hctx->run_work,
> -					 msecs_to_jiffies(msecs));
> +	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> +				    &hctx->run_work,
> +				    msecs_to_jiffies(msecs));
>  }
>  
>  void blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs)
> 
> Bart.

Yes, this one together with Jens's suggestion of returning
BLK_STS_NO_DEV_RESOURCE should fix this issue.

Could you cook a fix for this issue? Otherwise I am happy to do
that.

Bart Van Assche Jan. 19, 2018, 4:06 p.m. UTC | #3

On Fri, 2018-01-19 at 23:33 +0800, Ming Lei wrote:
> On Fri, Jan 19, 2018 at 03:20:13PM +0000, Bart Van Assche wrote:
> > On Fri, 2018-01-19 at 15:26 +0800, Ming Lei wrote:
> > > Please see queue_delayed_work_on(), hctx->run_work is shared by all
> > > scheduling, once blk_mq_delay_run_hw_queue(100ms) returns, no new
> > > scheduling can make progress during the 100ms.
> > 
> > How about addressing that as follows:
> > 
> > diff --git a/block/blk-mq.c b/block/blk-mq.c
> > index f7515dd95a36..57f8379a476d 100644
> > --- a/block/blk-mq.c
> > +++ b/block/blk-mq.c
> > @@ -1403,9 +1403,9 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
> >  		put_cpu();
> >  	}
> >  
> > -	kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> > -					 &hctx->run_work,
> > -					 msecs_to_jiffies(msecs));
> > +	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
> > +				    &hctx->run_work,
> > +				    msecs_to_jiffies(msecs));
> >  }
> >  
> >  void blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs)
> > 
> > Bart.
> 
> Yes, this one together with Jens's suggestion of returning
> BLK_STS_NO_DEV_RESOURCE should fix this issue.
> 
> Could you cook a fix for this issue? Otherwise I am happy to do
> that.

Hello Ming,

I will look further into this.

Bart.

Patch

diff --git a/block/blk-mq.c b/block/blk-mq.c
index f7515dd95a36..57f8379a476d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1403,9 +1403,9 @@  static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 		put_cpu();
 	}
 
-	kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
-					 &hctx->run_work,
-					 msecs_to_jiffies(msecs));
+	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
+				    &hctx->run_work,
+				    msecs_to_jiffies(msecs));
 }
 
 void blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs)
