Message ID | 20190828044020.23915-1-damien.lemoal@wdc.com (mailing list archive)
---|---
State | New, archived
Series | block: mq-deadline: Fix queue restart handling
On 2019-08-28 06:40, Damien Le Moal wrote:
> Commit 7211aef86f79 ("block: mq-deadline: Fix write completion
> handling") added a call to blk_mq_sched_mark_restart_hctx() in
> dd_dispatch_request() to make sure that write request dispatching does
> not stall when all target zones are locked. This fix left a subtle race
> when a write completion happens during a dispatch execution on another
> CPU:
>
> CPU 0: Dispatch                     CPU1: write completion
>
> dd_dispatch_request()
>     lock(&dd->lock);
>     ...
>     lock(&dd->zone_lock);           dd_finish_request()
>     rq = find request               lock(&dd->zone_lock);
>     unlock(&dd->zone_lock);
>                                     zone write unlock
>                                     unlock(&dd->zone_lock);
>                                     ...
>                                     __blk_mq_free_request
>                                         check restart flag (not set)
>                                         -> queue not run
>     ...
>     if (!rq && have writes)
>         blk_mq_sched_mark_restart_hctx()
>     unlock(&dd->lock)
>
> Since the dispatch context finishes after the write request completion
> handling, marking the queue as needing a restart is not seen from
> __blk_mq_free_request() and blk_mq_sched_restart() is not executed,
> leading to the dispatch stall under 100% write workloads.
>
> Fix this by moving the call to blk_mq_sched_mark_restart_hctx() from
> dd_dispatch_request() into dd_finish_request() under the zone lock to
> ensure full mutual exclusion between write request dispatch selection
> and zone unlock on write request completion.
>
> Fixes: 7211aef86f79 ("block: mq-deadline: Fix write completion handling")
> Cc: stable@vger.kernel.org
> Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
> ---
>  block/mq-deadline.c | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index a17466f310f4..b490f47fd553 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -377,13 +377,6 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd)
>   * hardware queue, but we may return a request that is for a
>   * different hardware queue. This is because mq-deadline has shared
>   * state for all hardware queues, in terms of sorting, FIFOs, etc.
> - *
> - * For a zoned block device, __dd_dispatch_request() may return NULL
> - * if all the queued write requests are directed at zones that are already
> - * locked due to on-going write requests. In this case, make sure to mark
> - * the queue as needing a restart to ensure that the queue is run again
> - * and the pending writes dispatched once the target zones for the ongoing
> - * write requests are unlocked in dd_finish_request().
>   */
>  static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
>  {
> @@ -392,9 +385,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
>
>  	spin_lock(&dd->lock);
>  	rq = __dd_dispatch_request(dd);
> -	if (!rq && blk_queue_is_zoned(hctx->queue) &&
> -	    !list_empty(&dd->fifo_list[WRITE]))
> -		blk_mq_sched_mark_restart_hctx(hctx);
>  	spin_unlock(&dd->lock);
>
>  	return rq;
> @@ -561,6 +551,13 @@ static void dd_prepare_request(struct request *rq, struct bio *bio)
>   * spinlock so that the zone is never unlocked while deadline_fifo_request()
>   * or deadline_next_request() are executing. This function is called for
>   * all requests, whether or not these requests complete successfully.
> + *
> + * For a zoned block device, __dd_dispatch_request() may have stopped
> + * dispatching requests if all the queued requests are write requests directed
> + * at zones that are already locked due to on-going write requests. To ensure
> + * write request dispatch progress in this case, mark the queue as needing a
> + * restart to ensure that the queue is run again after completion of the
> + * request and zones being unlocked.
>   */
>  static void dd_finish_request(struct request *rq)
>  {
> @@ -572,6 +569,8 @@ static void dd_finish_request(struct request *rq)
>
>  	spin_lock_irqsave(&dd->zone_lock, flags);
>  	blk_req_zone_write_unlock(rq);
> +	if (!list_empty(&dd->fifo_list[WRITE]))
> +		blk_mq_sched_mark_restart_hctx(rq->mq_hctx);
>  	spin_unlock_irqrestore(&dd->zone_lock, flags);
>  	}
>  }

Looks good to me.

Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com>
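The fix hinges on the SCHED_RESTART handshake between the dispatch and completion paths. As a rough sketch of the two helpers involved, paraphrased from the 5.x-era block layer (block/blk-mq-sched.c and blk-mq-sched.h) rather than copied verbatim:

/*
 * Rough sketch of the restart handshake; details paraphrased from
 * memory of the 5.x-era block layer, not a verbatim copy.
 */
void blk_mq_sched_mark_restart_hctx(struct blk_mq_hw_ctx *hctx)
{
	/* Raise the flag asking for the queue to be re-run later. */
	if (!test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state))
		set_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
}

void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx)
{
	/*
	 * Called on request completion (from the __blk_mq_free_request()
	 * path): re-run the queue only if someone raised the flag first.
	 * If the dispatcher raises the flag after this test, the re-run
	 * is lost -- the window shown in the race diagram above.
	 */
	if (test_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state)) {
		clear_bit(BLK_MQ_S_SCHED_RESTART, &hctx->state);
		blk_mq_run_hw_queue(hctx, true);
	}
}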
Hi,

[This is an automated email]

This commit has been processed because it contains a "Fixes:" tag,
fixing commit: 7211aef86f79 block: mq-deadline: Fix write completion handling.

The bot has tested the following trees: v5.2.10, v4.19.68.

v5.2.10: Build OK!
v4.19.68: Build failed! Errors:
    block/mq-deadline.c:571:39: error: ‘struct request’ has no member named ‘mq_hctx’; did you mean ‘mq_ctx’?

NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

--
Thanks,
Sasha
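The v4.19 failure is only the hctx lookup: struct request has no mq_hctx member there (as the build error shows), so a backport would need to resolve the hardware context through the request's software context instead. An untested sketch of what that could look like against v4.19, assuming its blk_mq_map_queue(q, cpu) helper:

/*
 * Untested v4.19 backport sketch: rq->mq_hctx does not exist in that
 * tree, so look up the hardware context from the software context's
 * CPU mapping instead.
 */
static void dd_finish_request(struct request *rq)
{
	struct request_queue *q = rq->q;

	if (blk_queue_is_zoned(q)) {
		struct deadline_data *dd = q->elevator->elevator_data;
		unsigned long flags;

		spin_lock_irqsave(&dd->zone_lock, flags);
		blk_req_zone_write_unlock(rq);
		if (!list_empty(&dd->fifo_list[WRITE]))
			blk_mq_sched_mark_restart_hctx(
				blk_mq_map_queue(q, rq->mq_ctx->cpu));
		spin_unlock_irqrestore(&dd->zone_lock, flags);
	}
}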
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
On 8/27/19 10:40 PM, Damien Le Moal wrote:
> Commit 7211aef86f79 ("block: mq-deadline: Fix write completion
> handling") added a call to blk_mq_sched_mark_restart_hctx() in
> dd_dispatch_request() to make sure that write request dispatching does
> not stall when all target zones are locked. This fix left a subtle race
> when a write completion happens during a dispatch execution on another
> CPU:
>
> CPU 0: Dispatch                     CPU1: write completion
>
> dd_dispatch_request()
>     lock(&dd->lock);
>     ...
>     lock(&dd->zone_lock);           dd_finish_request()
>     rq = find request               lock(&dd->zone_lock);
>     unlock(&dd->zone_lock);
>                                     zone write unlock
>                                     unlock(&dd->zone_lock);
>                                     ...
>                                     __blk_mq_free_request
>                                         check restart flag (not set)
>                                         -> queue not run
>     ...
>     if (!rq && have writes)
>         blk_mq_sched_mark_restart_hctx()
>     unlock(&dd->lock)
>
> Since the dispatch context finishes after the write request completion
> handling, marking the queue as needing a restart is not seen from
> __blk_mq_free_request() and blk_mq_sched_restart() is not executed,
> leading to the dispatch stall under 100% write workloads.
>
> Fix this by moving the call to blk_mq_sched_mark_restart_hctx() from
> dd_dispatch_request() into dd_finish_request() under the zone lock to
> ensure full mutual exclusion between write request dispatch selection
> and zone unlock on write request completion.

Applied, thanks.
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index a17466f310f4..b490f47fd553 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -377,13 +377,6 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd)
  * hardware queue, but we may return a request that is for a
  * different hardware queue. This is because mq-deadline has shared
  * state for all hardware queues, in terms of sorting, FIFOs, etc.
- *
- * For a zoned block device, __dd_dispatch_request() may return NULL
- * if all the queued write requests are directed at zones that are already
- * locked due to on-going write requests. In this case, make sure to mark
- * the queue as needing a restart to ensure that the queue is run again
- * and the pending writes dispatched once the target zones for the ongoing
- * write requests are unlocked in dd_finish_request().
  */
 static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 {
@@ -392,9 +385,6 @@ static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 
 	spin_lock(&dd->lock);
 	rq = __dd_dispatch_request(dd);
-	if (!rq && blk_queue_is_zoned(hctx->queue) &&
-	    !list_empty(&dd->fifo_list[WRITE]))
-		blk_mq_sched_mark_restart_hctx(hctx);
 	spin_unlock(&dd->lock);
 
 	return rq;
@@ -561,6 +551,13 @@ static void dd_prepare_request(struct request *rq, struct bio *bio)
  * spinlock so that the zone is never unlocked while deadline_fifo_request()
  * or deadline_next_request() are executing. This function is called for
  * all requests, whether or not these requests complete successfully.
+ *
+ * For a zoned block device, __dd_dispatch_request() may have stopped
+ * dispatching requests if all the queued requests are write requests directed
+ * at zones that are already locked due to on-going write requests. To ensure
+ * write request dispatch progress in this case, mark the queue as needing a
+ * restart to ensure that the queue is run again after completion of the
+ * request and zones being unlocked.
  */
 static void dd_finish_request(struct request *rq)
 {
@@ -572,6 +569,8 @@ static void dd_finish_request(struct request *rq)
 
 	spin_lock_irqsave(&dd->zone_lock, flags);
 	blk_req_zone_write_unlock(rq);
+	if (!list_empty(&dd->fifo_list[WRITE]))
+		blk_mq_sched_mark_restart_hctx(rq->mq_hctx);
 	spin_unlock_irqrestore(&dd->zone_lock, flags);
 	}
 }
Commit 7211aef86f79 ("block: mq-deadline: Fix write completion
handling") added a call to blk_mq_sched_mark_restart_hctx() in
dd_dispatch_request() to make sure that write request dispatching does
not stall when all target zones are locked. This fix left a subtle race
when a write completion happens during a dispatch execution on another
CPU:

CPU 0: Dispatch                     CPU1: write completion

dd_dispatch_request()
    lock(&dd->lock);
    ...
    lock(&dd->zone_lock);           dd_finish_request()
    rq = find request               lock(&dd->zone_lock);
    unlock(&dd->zone_lock);
                                    zone write unlock
                                    unlock(&dd->zone_lock);
                                    ...
                                    __blk_mq_free_request
                                        check restart flag (not set)
                                        -> queue not run
    ...
    if (!rq && have writes)
        blk_mq_sched_mark_restart_hctx()
    unlock(&dd->lock)

Since the dispatch context finishes after the write request completion
handling, marking the queue as needing a restart is not seen from
__blk_mq_free_request() and blk_mq_sched_restart() is not executed,
leading to the dispatch stall under 100% write workloads.

Fix this by moving the call to blk_mq_sched_mark_restart_hctx() from
dd_dispatch_request() into dd_finish_request() under the zone lock to
ensure full mutual exclusion between write request dispatch selection
and zone unlock on write request completion.

Fixes: 7211aef86f79 ("block: mq-deadline: Fix write completion handling")
Cc: stable@vger.kernel.org
Reported-by: Hans Holmberg <Hans.Holmberg@wdc.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 block/mq-deadline.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)
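For completeness, here is how the same interleaving plays out once the restart mark is issued from dd_finish_request() under dd->zone_lock. This is an illustration in the style of the diagram above, not part of the patch:

CPU 0: Dispatch                     CPU1: write completion

dd_dispatch_request()
    lock(&dd->lock);
    ...
    lock(&dd->zone_lock);           dd_finish_request()
    rq = find request (NULL,        lock(&dd->zone_lock);
         target zone locked)
    unlock(&dd->zone_lock);
                                    zone write unlock
                                    if (have writes)
                                        blk_mq_sched_mark_restart_hctx()
                                    unlock(&dd->zone_lock);
                                    ...
                                    __blk_mq_free_request
                                        check restart flag (set)
                                        -> queue run again
    ...
    unlock(&dd->lock)

Whichever side takes dd->zone_lock second now sees a consistent picture: either the dispatcher finds the zone already unlocked and can pick the write, or the completion path raises the restart flag before __blk_mq_free_request() tests it, so the pending write is dispatched on the re-run either way.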