Message ID | 20230819031206.2744005-1-chengming.zhou@linux.dev (mailing list archive) |
---|---|
State | New, archived |
Series | blk-mq: fix mismatch between IO scheduler insert and finish |
On 8/18/23 9:12 PM, chengming.zhou@linux.dev wrote:
> From: Chengming Zhou <zhouchengming@bytedance.com>
>
> IO scheduler has requirement that one request which has been inserted
> must call finish_request() only once.
>
> Now we have three special cases to consider:
> 1. rq has not insert, has complete: e.g. empty preflush
> 2. rq has insert, has not complete: e.g. merged requests will be freed
> 3. rq has insert, has twice complete: e.g. postflushes
>
> Note case 1 which existed before, has been no problem since all the
> schedulers will check in their finish_request() if the rq has been
> inserted or not, like checking "rq->elv.priv[0]".
>
> Then case 2 and case 3 are the introduced regression, we moved the
> scheduler finish_request() from free phase to complete phase to solve
> a deadlock problem. But it caused no finish_request() for request in
> case 2, and double finish_request() for request in case 3.
>
> So we still need finish_request() in blk_mq_free_request() to cover
> case 2. And clear RQF_USE_SCHED flag to avoid double finish_request().
> It should be fine since we're freeing the request now anyway.
>
> Of course, we can also make all schedulers' finish_request() to clear
> "rq->elv.priv[0]" to avoid double finish. Or clear it in blk-mq, make
> the rq like not inserted as case 1.
>
> FYI it's easy to reproduce warning in mq-deadline using this:
> ```
> DEV=sdb
> echo mq-deadline > /sys/block/$DEV/queue/scheduler
> mkfs.ext4 /dev/$DEV
> mount /dev/$DEV /mnt
> cd /mnt
> stress-ng --symlink 4 --timeout 60
> echo none > /sys/block/$DEV/queue/scheduler
> ```
>
> Reported-by: kernel test robot <oliver.sang@intel.com>
> Closes: https://lore.kernel.org/oe-lkp/202308172100.8ce4b853-oliver.sang@intel.com
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

I folded in this one and added a link to it as well, final result is
here:

https://git.kernel.dk/cgit/linux/commit/?h=block-6.5&id=e5c0ca13659e9d18f53368d651ed7e6e433ec1cf

I'll get this sent off today.
diff --git a/block/blk-mq.c b/block/blk-mq.c
index a6d59320e034..953f08354c8c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -685,8 +685,15 @@ static void blk_mq_finish_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
 
-	if (rq->rq_flags & RQF_USE_SCHED)
+	if (rq->rq_flags & RQF_USE_SCHED) {
 		q->elevator->type->ops.finish_request(rq);
+		/*
+		 * For postflush request that may need to be
+		 * completed twice, we should clear this flag
+		 * to avoid double finish_request() on the rq.
+		 */
+		rq->rq_flags &= ~RQF_USE_SCHED;
+	}
 }
 
 static void __blk_mq_free_request(struct request *rq)
@@ -715,6 +722,8 @@ void blk_mq_free_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
 
+	blk_mq_finish_request(rq);
+
 	if (unlikely(laptop_mode && !blk_rq_is_passthrough(rq)))
 		laptop_io_completion(q->disk->bdi);
 