Message ID | 20211129012659.1553733-1-yebin10@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Series | [-next] block: Fix fsync always failed if once failed |
On Mon, Nov 29, 2021 at 09:26:59AM +0800, Ye Bin wrote:
> We do test with inject error fault base on v4.19, after test some time we found
> sync /dev/sda always failed.
> [root@localhost] sync /dev/sda
> sync: error syncing '/dev/sda': Input/output error
>
> scsi log as follows:
> [19069.812296] sd 0:0:0:0: [sda] tag#64 Send: scmd 0x00000000d03a0b6b
> [19069.812302] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
> [19069.812533] sd 0:0:0:0: [sda] tag#64 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> [19069.812536] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
> [19069.812539] sd 0:0:0:0: [sda] tag#64 scsi host busy 1 failed 0
> [19069.812542] sd 0:0:0:0: Notifying upper driver of completion (result 0)
> [19069.812546] sd 0:0:0:0: [sda] tag#64 sd_done: completed 0 of 0 bytes
> [19069.812549] sd 0:0:0:0: [sda] tag#64 0 sectors total, 0 bytes done.
> [19069.812564] print_req_error: I/O error, dev sda, sector 0
>
> ftrace log as follows:
> rep-306069 [007] .... 19654.923315: block_bio_queue: 8,0 FWS 0 + 0 [rep]
> rep-306069 [007] .... 19654.923333: block_getrq: 8,0 FWS 0 + 0 [rep]
> kworker/7:1H-250 [007] .... 19654.923352: block_rq_issue: 8,0 FF 0 () 0 + 0 [kworker/7:1H]
> <idle>-0 [007] ..s. 19654.923562: block_rq_complete: 8,0 FF () 18446744073709551615 + 0 [0]
> <idle>-0 [007] d.s. 19654.923576: block_rq_complete: 8,0 WS () 0 + 0 [-5]
>
> As 8d6996630c03 introduce 'fq->rq_status', this data only update when 'flush_rq'
> reference count isn't zero. If flush request once failed and record error code
> in 'fq->rq_status'. If there is no chance to update 'fq->rq_status',then do fsync
> will always failed.
> To address this issue reset 'fq->rq_status' after return error code to upper layer.
>
> Fixes: 8d6996630c03("block: fix null pointer dereference in blk_mq_rq_timed_out()")
> Signed-off-by: Ye Bin <yebin10@huawei.com>
> ---
>  block/blk-flush.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-flush.c b/block/blk-flush.c
> index 902e80e48e4a..8b43088d41f2 100644
> --- a/block/blk-flush.c
> +++ b/block/blk-flush.c
> @@ -242,8 +242,10 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
>  	 * avoiding use-after-free.
>  	 */
>  	WRITE_ONCE(flush_rq->state, MQ_RQ_IDLE);
> -	if (fq->rq_status != BLK_STS_OK)
> +	if (fq->rq_status != BLK_STS_OK) {
>  		error = fq->rq_status;
> +		fq->rq_status = BLK_STS_OK;
> +	}

Looks fine,

Reviewed-by: Ming Lei <ming.lei@redhat.com>

Thanks,
Ming
On Mon, 29 Nov 2021 09:26:59 +0800, Ye Bin wrote:
> We do test with inject error fault base on v4.19, after test some time we found
> sync /dev/sda always failed.
> [root@localhost] sync /dev/sda
> sync: error syncing '/dev/sda': Input/output error
>
> scsi log as follows:
> [19069.812296] sd 0:0:0:0: [sda] tag#64 Send: scmd 0x00000000d03a0b6b
> [19069.812302] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
> [19069.812533] sd 0:0:0:0: [sda] tag#64 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK
> [19069.812536] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
> [19069.812539] sd 0:0:0:0: [sda] tag#64 scsi host busy 1 failed 0
> [19069.812542] sd 0:0:0:0: Notifying upper driver of completion (result 0)
> [19069.812546] sd 0:0:0:0: [sda] tag#64 sd_done: completed 0 of 0 bytes
> [19069.812549] sd 0:0:0:0: [sda] tag#64 0 sectors total, 0 bytes done.
> [19069.812564] print_req_error: I/O error, dev sda, sector 0
>
> [...]

Applied, thanks!

[1/1] block: Fix fsync always failed if once failed
      commit: 8a7518931baa8ea023700987f3db31cb0a80610b

Best regards,
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 902e80e48e4a..8b43088d41f2 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -242,8 +242,10 @@ static void flush_end_io(struct request *flush_rq, blk_status_t error)
 	 * avoiding use-after-free.
 	 */
 	WRITE_ONCE(flush_rq->state, MQ_RQ_IDLE);
-	if (fq->rq_status != BLK_STS_OK)
+	if (fq->rq_status != BLK_STS_OK) {
 		error = fq->rq_status;
+		fq->rq_status = BLK_STS_OK;
+	}
 	if (!q->elevator) {
 		flush_rq->tag = BLK_MQ_NO_TAG;
We do test with inject error fault base on v4.19, after test some time we found
sync /dev/sda always failed.
[root@localhost] sync /dev/sda
sync: error syncing '/dev/sda': Input/output error

scsi log as follows:
[19069.812296] sd 0:0:0:0: [sda] tag#64 Send: scmd 0x00000000d03a0b6b
[19069.812302] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[19069.812533] sd 0:0:0:0: [sda] tag#64 Done: SUCCESS Result: hostbyte=DID_OK driverbyte=DRIVER_OK
[19069.812536] sd 0:0:0:0: [sda] tag#64 CDB: Synchronize Cache(10) 35 00 00 00 00 00 00 00 00 00
[19069.812539] sd 0:0:0:0: [sda] tag#64 scsi host busy 1 failed 0
[19069.812542] sd 0:0:0:0: Notifying upper driver of completion (result 0)
[19069.812546] sd 0:0:0:0: [sda] tag#64 sd_done: completed 0 of 0 bytes
[19069.812549] sd 0:0:0:0: [sda] tag#64 0 sectors total, 0 bytes done.
[19069.812564] print_req_error: I/O error, dev sda, sector 0

ftrace log as follows:
rep-306069 [007] .... 19654.923315: block_bio_queue: 8,0 FWS 0 + 0 [rep]
rep-306069 [007] .... 19654.923333: block_getrq: 8,0 FWS 0 + 0 [rep]
kworker/7:1H-250 [007] .... 19654.923352: block_rq_issue: 8,0 FF 0 () 0 + 0 [kworker/7:1H]
<idle>-0 [007] ..s. 19654.923562: block_rq_complete: 8,0 FF () 18446744073709551615 + 0 [0]
<idle>-0 [007] d.s. 19654.923576: block_rq_complete: 8,0 WS () 0 + 0 [-5]

As 8d6996630c03 introduce 'fq->rq_status', this data only update when 'flush_rq'
reference count isn't zero. If flush request once failed and record error code
in 'fq->rq_status'. If there is no chance to update 'fq->rq_status',then do fsync
will always failed.
To address this issue reset 'fq->rq_status' after return error code to upper layer.

Fixes: 8d6996630c03("block: fix null pointer dereference in blk_mq_rq_timed_out()")
Signed-off-by: Ye Bin <yebin10@huawei.com>
---
 block/blk-flush.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
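
The sticky-error behaviour described in the commit message can be illustrated with a small user-space model. This is only a sketch, not kernel code: `flush_queue_model`, `flush_end_io_model`, `count_false_failures`, and the `STS_*` constants are hypothetical names invented for the illustration, and the way the error gets recorded (a direct assignment) is an assumed simplification of the refcount path mentioned above. It only shows why a recorded status that is never cleared makes every later flush report an error, and how clearing it after reporting avoids that.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; names echo the patch but this is NOT kernel code. */
typedef int status_t;
#define STS_OK    0
#define STS_IOERR (-5)  /* same value as the [-5] in the ftrace log above */

struct flush_queue_model {
	status_t rq_status;  /* stands in for fq->rq_status */
};

/*
 * Models only the lines of flush_end_io() touched by the patch.
 * reset_after_report == false: behaviour before the patch.
 * reset_after_report == true:  behaviour after the patch.
 */
static status_t flush_end_io_model(struct flush_queue_model *fq,
				   status_t error, bool reset_after_report)
{
	if (fq->rq_status != STS_OK) {
		error = fq->rq_status;
		if (reset_after_report)
			fq->rq_status = STS_OK;
	}
	return error;
}

/*
 * Records one injected failure, reports it, then completes n flushes that
 * the device handled successfully. Returns how many of those n flushes
 * were still reported as failed.
 */
static int count_false_failures(bool reset_after_report, int n)
{
	struct flush_queue_model fq = { .rq_status = STS_OK };
	int failures = 0;

	/* Assumed simplification: the failed flush stored its error here. */
	fq.rq_status = STS_IOERR;

	/* The recorded error is reported to the caller once, as intended. */
	status_t first = flush_end_io_model(&fq, STS_OK, reset_after_report);
	assert(first == STS_IOERR);

	for (int i = 0; i < n; i++) {
		if (flush_end_io_model(&fq, STS_OK, reset_after_report) != STS_OK)
			failures++;
	}
	return failures;
}
```

In this model, `count_false_failures(false, 3)` counts every later flush as failed even though the device completed it successfully, which corresponds to the repeated `sync` error in the report, while `count_false_failures(true, 3)` counts none.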