Message ID | 20230515144601.52811-2-ming.lei@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | blk-mq: handle passthrough request as really passthrough | expand |
On Mon, May 15, 2023 at 10:46:00PM +0800, Ming Lei wrote: > + } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx || > + pt != blk_rq_is_passthrough(rq)) { Can your format this as: } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx || pt != blk_rq_is_passthrough(rq)) { for readability? > + /* > + * Both passthrough and flush request don't belong to > + * scheduler, but flush request won't be added to plug > + * list, so needn't handle here. > + */ > rq_list_add_tail(&requeue_lastp, rq); This comment confuses the heck out of me. The check if for passthrough vs non-passthrough and doesn't involved flush requests at all. I'd prefer to drop it, and instead comment on passthrough requests not going to the scheduled below where we actually issue other requests to the scheduler. > + if (pt) { > + spin_lock(&this_hctx->lock); > + list_splice_tail_init(&list, &this_hctx->dispatch); > + spin_unlock(&this_hctx->lock); > + blk_mq_run_hw_queue(this_hctx, from_sched); .. aka here. But why can't we just use the blk_mq_insert_requests for this case anyway? As in just doing a: - if (this_hctx->queue->elevator) { + if (this_hctx->queue->elevator && !pt) { ?
On Tue, May 16, 2023 at 08:22:21AM +0200, Christoph Hellwig wrote: > On Mon, May 15, 2023 at 10:46:00PM +0800, Ming Lei wrote: > > + } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx || > > + pt != blk_rq_is_passthrough(rq)) { > > Can your format this as: > > } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx || > pt != blk_rq_is_passthrough(rq)) { > > for readability? Do you mean indent for 'pt = blk_rq_is_passthrough(rq)' and keep 'pt' aligned with 'if' in last line? Otherwise, can't see any difference between the two, :-( > > > + /* > > + * Both passthrough and flush request don't belong to > > + * scheduler, but flush request won't be added to plug > > + * list, so needn't handle here. > > + */ > > rq_list_add_tail(&requeue_lastp, rq); > > This comment confuses the heck out of me. The check if for passthrough > vs non-passthrough and doesn't involved flush requests at all. > > I'd prefer to drop it, and instead comment on passthrough requests > not going to the scheduled below where we actually issue other requests > to the scheduler. Any request can be in plug list in theory, we just don't add flush request to plug, that is why the above comment is added. If you don't like the words for flush request, I can drop it. > > > + if (pt) { > > + spin_lock(&this_hctx->lock); > > + list_splice_tail_init(&list, &this_hctx->dispatch); > > + spin_unlock(&this_hctx->lock); > > + blk_mq_run_hw_queue(this_hctx, from_sched); > > .. aka here. But why can't we just use the blk_mq_insert_requests > for this case anyway? If the pt request is part of error recovery, it should be issued to ->dispatch list directly, so just for the sake of safety, meantime keep same behavior with blk_mq_insert_request(). Thanks, Ming
On Tue, May 16, 2023 at 04:10:01PM +0800, Ming Lei wrote: > On Tue, May 16, 2023 at 08:22:21AM +0200, Christoph Hellwig wrote: > > On Mon, May 15, 2023 at 10:46:00PM +0800, Ming Lei wrote: > > > + } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx || > > > + pt != blk_rq_is_passthrough(rq)) { > > > > Can your format this as: > > > > } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx || > > pt != blk_rq_is_passthrough(rq)) { > > > > for readability? > > Do you mean indent for 'pt = blk_rq_is_passthrough(rq)' and keep 'pt' aligned > with 'if' in last line? Yes. > > This comment confuses the heck out of me. The check if for passthrough > > vs non-passthrough and doesn't involved flush requests at all. > > > > I'd prefer to drop it, and instead comment on passthrough requests > > not going to the scheduled below where we actually issue other requests > > to the scheduler. > > Any request can be in plug list in theory, we just don't add flush request > to plug, that is why the above comment is added. If you don't like the > words for flush request, I can drop it. I just don't think it maks any sense in this context. If we want to enforce the invariant that there's no flush request I'd rather add a WARN_ON to not only talk about enforce it. I'm not sure it's really required, though. > > > + if (pt) { > > > + spin_lock(&this_hctx->lock); > > > + list_splice_tail_init(&list, &this_hctx->dispatch); > > > + spin_unlock(&this_hctx->lock); > > > + blk_mq_run_hw_queue(this_hctx, from_sched); > > > > .. aka here. But why can't we just use the blk_mq_insert_requests > > for this case anyway? > > If the pt request is part of error recovery, it should be issued to > ->dispatch list directly, so just for the sake of safety, meantime keep > same behavior with blk_mq_insert_request(). But if it is part of error recovery it won't be plugged. Please don't do weird cargo cult things here and just use the common helpers.
diff --git a/block/blk-mq.c b/block/blk-mq.c index f6dad0886a2f..b4aaf42f1125 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2711,6 +2711,7 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched) struct request *requeue_list = NULL; struct request **requeue_lastp = &requeue_list; unsigned int depth = 0; + bool pt = false; LIST_HEAD(list); do { @@ -2719,7 +2720,14 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched) if (!this_hctx) { this_hctx = rq->mq_hctx; this_ctx = rq->mq_ctx; - } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx) { + pt = blk_rq_is_passthrough(rq); + } else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx || + pt != blk_rq_is_passthrough(rq)) { + /* + * Both passthrough and flush request don't belong to + * scheduler, but flush request won't be added to plug + * list, so needn't handle here. + */ rq_list_add_tail(&requeue_lastp, rq); continue; } @@ -2731,7 +2739,13 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched) trace_block_unplug(this_hctx->queue, depth, !from_sched); percpu_ref_get(&this_hctx->queue->q_usage_counter); - if (this_hctx->queue->elevator) { + /* passthrough requests don't belong to scheduler */ + if (pt) { + spin_lock(&this_hctx->lock); + list_splice_tail_init(&list, &this_hctx->dispatch); + spin_unlock(&this_hctx->lock); + blk_mq_run_hw_queue(this_hctx, from_sched); + } else if (this_hctx->queue->elevator) { this_hctx->queue->elevator->type->ops.insert_requests(this_hctx, &list, 0); blk_mq_run_hw_queue(this_hctx, from_sched);
Passthrough(pt) request shouldn't be queued to scheduler, especially some schedulers(such as bfq) supposes that req->bio is always available, then blk-cgroup can be retrieved via bio. We never let passthrough request cross scheduler before commit 1c2d2fff6dc0 ("block: wire-up support for passthrough plugging"). Fix this issue by queuing pt request from plug list to hctx->dispatch directly. Reported-by: Guangwu Zhang <guazhang@redhat.com> Closes: https://lore.kernel.org/linux-block/CAGS2=YosaYaUTEMU3uaf+y=8MqSrhL7sYsJn8EwbaM=76p_4Qg@mail.gmail.com/ Investigated-by: Yu Kuai <yukuai1@huaweicloud.com> Fixes: 1c2d2fff6dc0 ("block: wire-up support for passthrough plugging") Signed-off-by: Ming Lei <ming.lei@redhat.com> --- block/blk-mq.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)