Message ID | 20201126091852.8588-1-jefflexu@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v9] block: disable iopoll for split bio | expand |
Looks good,
Reviewed-by: Christoph Hellwig <hch@lst.de>
Hi Jens, How about this patch? It has been reviewed by Christoph and Ming Lei.
On 11/26/20 2:18 AM, Jeffle Xu wrote: > iopoll is initially for small size, latency sensitive IO. It doesn't > work well for big IO, especially when it needs to be split to multiple > bios. In this case, the returned cookie of __submit_bio_noacct_mq() is > indeed the cookie of the last split bio. The completion of *this* last > split bio done by iopoll doesn't mean the whole original bio has > completed. Callers of iopoll still need to wait for completion of other > split bios. > > Besides bio splitting may cause more trouble for iopoll which isn't > supposed to be used in case of big IO. > > iopoll for split bio may cause potential race if CPU migration happens > during bio submission. Since the returned cookie is that of the last > split bio, polling on the corresponding hardware queue doesn't help > complete other split bios, if these split bios are enqueued into > different hardware queues. Since interrupts are disabled for polling > queues, the completion of these other split bios depends on timeout > mechanism, thus causing a potential hang. > > iopoll for split bio may also cause hang for sync polling. Currently > both the blkdev and iomap-based fs (ext4/xfs, etc) support sync polling > in direct IO routine. These routines will submit bio without REQ_NOWAIT > flag set, and then start sync polling in current process context. The > process may hang in blk_mq_get_tag() if the submitted bio has to be > split into multiple bios and can rapidly exhaust the queue depth. The > process are waiting for the completion of the previously allocated > requests, which should be reaped by the following polling, and thus > causing a deadlock. > > To avoid these subtle trouble described above, just disable iopoll for > split bio and return BLK_QC_T_NONE in this case. The side effect is that > non-HIPRI IO also returns BLK_QC_T_NONE now. It should be acceptable > since the returned cookie is never used for non-HIPRI IO. Applied, thanks.
diff --git a/block/blk-merge.c b/block/blk-merge.c index bcf5e4580603..8a2f1fb7bb16 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -279,6 +279,14 @@ static struct bio *blk_bio_segment_split(struct request_queue *q, return NULL; split: *segs = nsegs; + + /* + * Bio splitting may cause subtle trouble such as hang when doing sync + * iopoll in direct IO routine. Given performance gain of iopoll for + * big IO can be trival, disable iopoll when split needed. + */ + bio->bi_opf &= ~REQ_HIPRI; + return bio_split(bio, sectors, GFP_NOIO, bs); } diff --git a/block/blk-mq.c b/block/blk-mq.c index 55bcee5dc032..06afd54b8949 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -2157,6 +2157,7 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio) unsigned int nr_segs; blk_qc_t cookie; blk_status_t ret; + bool hipri; blk_queue_bounce(q, &bio); __blk_queue_split(&bio, &nr_segs); @@ -2173,6 +2174,8 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio) rq_qos_throttle(q, bio); + hipri = bio->bi_opf & REQ_HIPRI; + data.cmd_flags = bio->bi_opf; rq = __blk_mq_alloc_request(&data); if (unlikely(!rq)) { @@ -2265,6 +2268,8 @@ blk_qc_t blk_mq_submit_bio(struct bio *bio) blk_mq_sched_insert_request(rq, false, true, true); } + if (!hipri) + return BLK_QC_T_NONE; return cookie; queue_exit: blk_queue_exit(q);