Message ID | 20230710064705.1847287-2-chengming.zhou@linux.dev (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [1/2] blk-flush: fix rq->flush.seq for post-flush requests | expand |
On Mon, Jul 10, 2023 at 02:47:05PM +0800, chengming.zhou@linux.dev wrote: > From: Chengming Zhou <zhouchengming@bytedance.com> > > Now we unconditionally blk_rq_init_flush() to replace rq->end_io to > make rq return twice back to the flush state machine for post-flush. > > Obviously, non post-flush requests don't need it, they don't need to > end request twice, so they don't need to replace rq->end_io callback. > And the same for requests with the FUA bit on hardware with FUA support. > > So we move blk_rq_init_flush() to REQ_FSEQ_DATA stage and only replace > rq->end_io if it needs post-flush. Otherwise, it can end like normal > request and doesn't need to return back to the flush state machine. I really like the idea behind this optimization, but I kinda hate adding more magic to the already way too magic flush sequence. I wonder if a better idea would be to kill the flush sequence entirely, and just split the flush_queue into a preflush and a postflush queue. This would remove a field from struct request and lead to more readable code as well.
On 2023/7/10 21:33, Christoph Hellwig wrote: > On Mon, Jul 10, 2023 at 02:47:05PM +0800, chengming.zhou@linux.dev wrote: >> From: Chengming Zhou <zhouchengming@bytedance.com> >> >> Now we unconditionally blk_rq_init_flush() to replace rq->end_io to >> make rq return twice back to the flush state machine for post-flush. >> >> Obviously, non post-flush requests don't need it, they don't need to >> end request twice, so they don't need to replace rq->end_io callback. >> And the same for requests with the FUA bit on hardware with FUA support. >> >> So we move blk_rq_init_flush() to REQ_FSEQ_DATA stage and only replace >> rq->end_io if it needs post-flush. Otherwise, it can end like normal >> request and doesn't need to return back to the flush state machine. > > I really like the idea behind this optimization, but I kinda hate > adding more magic to the already way too magic flush sequence. Yes, agree. > > I wonder if a better idea would be to kill the flush sequence entirely, > and just split the flush_queue into a preflush and a postflush queue. > This would remove a field from struct request and lead to more readable > code as well. I have thought about this for some time, it seems feasible. So I implement it today and test it using blktests, it works. I will send the patchset soon. Thanks.
diff --git a/block/blk-flush.c b/block/blk-flush.c index 094a6adb2718..1b92654e8757 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -93,6 +93,7 @@ enum { static void blk_kick_flush(struct request_queue *q, struct blk_flush_queue *fq, blk_opf_t flags); +static void blk_rq_init_flush(struct request *rq); static inline struct blk_flush_queue * blk_get_flush_queue(struct request_queue *q, struct blk_mq_ctx *ctx) @@ -187,7 +188,15 @@ static void blk_flush_complete_seq(struct request *rq, break; case REQ_FSEQ_DATA: - fq->flush_data_in_flight++; + /* + * Only for requests that need post-flush, + * we need to do rq->end_io replacement trick + * to return back to the flush state machine. + */ + if (!(rq->flush.seq & REQ_FSEQ_POSTFLUSH)) { + blk_rq_init_flush(rq); + fq->flush_data_in_flight++; + } spin_lock(&q->requeue_lock); list_move_tail(&rq->queuelist, &q->flush_list); spin_unlock(&q->requeue_lock); @@ -202,7 +211,13 @@ static void blk_flush_complete_seq(struct request *rq, * normal completion and end it. */ list_del_init(&rq->queuelist); - blk_flush_restore_request(rq); + /* + * Only for requests that had rq->end_io replaced, + * we need to restore rq->end_io and make it a normal + * request before the second end. + */ + if (rq->rq_flags & RQF_FLUSH_SEQ) + blk_flush_restore_request(rq); blk_mq_end_request(rq, error); break; @@ -389,7 +404,6 @@ static enum rq_end_io_ret mq_flush_data_end_io(struct request *rq, static void blk_rq_init_flush(struct request *rq) { - rq->flush.seq = 0; rq->rq_flags |= RQF_FLUSH_SEQ; rq->flush.saved_end_io = rq->end_io; /* Usually NULL */ rq->end_io = mq_flush_data_end_io; @@ -424,6 +438,7 @@ bool blk_insert_flush(struct request *rq) * the request accounting. */ rq->cmd_flags |= REQ_SYNC; + rq->flush.seq = 0; switch (policy) { case 0: @@ -458,7 +473,6 @@ bool blk_insert_flush(struct request *rq) * Mark the request as part of a flush sequence and submit it * for further processing to the flush state machine. */ - blk_rq_init_flush(rq); spin_lock_irq(&fq->mq_flush_lock); blk_flush_complete_seq(rq, fq, REQ_FSEQ_ACTIONS & ~policy, 0); spin_unlock_irq(&fq->mq_flush_lock);