Message ID | 1504719490-7523-1-git-send-email-idryomov@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Sep 6, 2017 at 7:38 PM, Ilya Dryomov <idryomov@gmail.com> wrote: > sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to > permit trying WRITE SAME on older SCSI devices, unless ->no_write_same > is set. This means blkdev_issue_zeroout() must cope with WRITE SAME > failing with BLK_STS_TARGET/-EREMOTEIO and explicitly write zeroes, > unless BLKDEV_ZERO_NOFALLBACK is specified. > > Commit 71027e97d796 ("block: stop using discards for zeroing") added > the following to __blkdev_issue_zeroout() comment: > > "Note that this function may fail with -EOPNOTSUPP if the driver signals > zeroing offload support, but the device fails to process the command (for > some devices there is no non-destructive way to verify whether this > operation is actually supported). In this case the caller should call > retry the call to blkdev_issue_zeroout() and the fallback path will be used." > > But __blkdev_issue_zeroout() doesn't fail in this case: if WRITE SAME > support is indicated, a REQ_OP_WRITE_ZEROES bio is built and returned > to blkdev_issue_zeroout(). -EREMOTEIO is then propagated up: > > $ fallocate -zn -l 1k /dev/sdg > fallocate: fallocate failed: Remote I/O error > $ fallocate -zn -l 1k /dev/sdg # OK > > (The second fallocate(1) succeeds because sd_done() sets ->no_write_same > in response to a sense that would become BLK_STS_TARGET.) > > Retry __blkdev_issue_zeroout() if the I/O fails with -EREMOTEIO. This > is roughly what we did until 4.12, sans BLKDEV_ZERO_NOFALLBACK knob. > > Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing") > Cc: Christoph Hellwig <hch@lst.de> > Cc: "Martin K. Petersen" <martin.petersen@oracle.com> > Cc: Hannes Reinecke <hare@suse.com> > Cc: stable@vger.kernel.org # 4.12+ > Signed-off-by: Ilya Dryomov <idryomov@gmail.com> > --- > block/blk-lib.c | 49 +++++++++++++++++++++++++++++++++---------------- > 1 file changed, 33 insertions(+), 16 deletions(-) > > diff --git a/block/blk-lib.c b/block/blk-lib.c > index 3fe0aec90597..876b5478c1a2 100644 > --- a/block/blk-lib.c > +++ b/block/blk-lib.c > @@ -287,12 +287,6 @@ static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects) > * Zero-fill a block range, either using hardware offload or by explicitly > * writing zeroes to the device. > * > - * Note that this function may fail with -EOPNOTSUPP if the driver signals > - * zeroing offload support, but the device fails to process the command (for > - * some devices there is no non-destructive way to verify whether this > - * operation is actually supported). In this case the caller should call > - * retry the call to blkdev_issue_zeroout() and the fallback path will be used. > - * > * If a device is using logical block provisioning, the underlying space will > * not be released if %flags contains BLKDEV_ZERO_NOUNMAP. > * > @@ -343,6 +337,34 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, > } > EXPORT_SYMBOL(__blkdev_issue_zeroout); > > +/* > + * This function may fail with -EREMOTEIO if the driver signals zeroing > + * offload support, but the device fails to process the command (for > + * some devices there is no non-destructive way to verify whether this > + * operation is actually supported). In this case the caller should > + * retry and the fallback path in __blkdev_issue_zeroout() will be used > + * unless %flags contains BLKDEV_ZERO_NOFALLBACK. > + */ > +static int __blkdev_issue_zeroout_wait(struct block_device *bdev, > + sector_t sector, sector_t nr_sects, > + gfp_t gfp_mask, unsigned flags) > +{ > + int ret; > + struct bio *bio = NULL; > + struct blk_plug plug; > + > + blk_start_plug(&plug); > + ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, > + &bio, flags); > + if (ret == 0 && bio) { > + ret = submit_bio_wait(bio); > + bio_put(bio); > + } > + blk_finish_plug(&plug); > + > + return ret; > +} > + > /** > * blkdev_issue_zeroout - zero-fill a block range > * @bdev: blockdev to write > @@ -360,17 +382,12 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, > sector_t nr_sects, gfp_t gfp_mask, unsigned flags) > { > int ret; > - struct bio *bio = NULL; > - struct blk_plug plug; > > - blk_start_plug(&plug); > - ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, > - &bio, flags); > - if (ret == 0 && bio) { > - ret = submit_bio_wait(bio); > - bio_put(bio); > - } > - blk_finish_plug(&plug); > + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects, > + gfp_mask, flags); > + if (ret == -EREMOTEIO) > + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects, > + gfp_mask, flags); > > return ret; > } Ping... This fixes a regression introduced in 4.12. Christoph, Martin, could you please take a look? Thanks, Ilya
On Wed, Sep 13, 2017 at 11:50 AM, Ilya Dryomov <idryomov@gmail.com> wrote: > On Wed, Sep 6, 2017 at 7:38 PM, Ilya Dryomov <idryomov@gmail.com> wrote: >> sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to >> permit trying WRITE SAME on older SCSI devices, unless ->no_write_same >> is set. This means blkdev_issue_zeroout() must cope with WRITE SAME >> failing with BLK_STS_TARGET/-EREMOTEIO and explicitly write zeroes, >> unless BLKDEV_ZERO_NOFALLBACK is specified. >> >> Commit 71027e97d796 ("block: stop using discards for zeroing") added >> the following to __blkdev_issue_zeroout() comment: >> >> "Note that this function may fail with -EOPNOTSUPP if the driver signals >> zeroing offload support, but the device fails to process the command (for >> some devices there is no non-destructive way to verify whether this >> operation is actually supported). In this case the caller should call >> retry the call to blkdev_issue_zeroout() and the fallback path will be used." >> >> But __blkdev_issue_zeroout() doesn't fail in this case: if WRITE SAME >> support is indicated, a REQ_OP_WRITE_ZEROES bio is built and returned >> to blkdev_issue_zeroout(). -EREMOTEIO is then propagated up: >> >> $ fallocate -zn -l 1k /dev/sdg >> fallocate: fallocate failed: Remote I/O error >> $ fallocate -zn -l 1k /dev/sdg # OK >> >> (The second fallocate(1) succeeds because sd_done() sets ->no_write_same >> in response to a sense that would become BLK_STS_TARGET.) >> >> Retry __blkdev_issue_zeroout() if the I/O fails with -EREMOTEIO. This >> is roughly what we did until 4.12, sans BLKDEV_ZERO_NOFALLBACK knob. >> >> Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing") >> Cc: Christoph Hellwig <hch@lst.de> >> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> >> Cc: Hannes Reinecke <hare@suse.com> >> Cc: stable@vger.kernel.org # 4.12+ >> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> >> --- >> block/blk-lib.c | 49 +++++++++++++++++++++++++++++++++---------------- >> 1 file changed, 33 insertions(+), 16 deletions(-) >> >> diff --git a/block/blk-lib.c b/block/blk-lib.c >> index 3fe0aec90597..876b5478c1a2 100644 >> --- a/block/blk-lib.c >> +++ b/block/blk-lib.c >> @@ -287,12 +287,6 @@ static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects) >> * Zero-fill a block range, either using hardware offload or by explicitly >> * writing zeroes to the device. >> * >> - * Note that this function may fail with -EOPNOTSUPP if the driver signals >> - * zeroing offload support, but the device fails to process the command (for >> - * some devices there is no non-destructive way to verify whether this >> - * operation is actually supported). In this case the caller should call >> - * retry the call to blkdev_issue_zeroout() and the fallback path will be used. >> - * >> * If a device is using logical block provisioning, the underlying space will >> * not be released if %flags contains BLKDEV_ZERO_NOUNMAP. >> * >> @@ -343,6 +337,34 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, >> } >> EXPORT_SYMBOL(__blkdev_issue_zeroout); >> >> +/* >> + * This function may fail with -EREMOTEIO if the driver signals zeroing >> + * offload support, but the device fails to process the command (for >> + * some devices there is no non-destructive way to verify whether this >> + * operation is actually supported). In this case the caller should >> + * retry and the fallback path in __blkdev_issue_zeroout() will be used >> + * unless %flags contains BLKDEV_ZERO_NOFALLBACK. >> + */ >> +static int __blkdev_issue_zeroout_wait(struct block_device *bdev, >> + sector_t sector, sector_t nr_sects, >> + gfp_t gfp_mask, unsigned flags) >> +{ >> + int ret; >> + struct bio *bio = NULL; >> + struct blk_plug plug; >> + >> + blk_start_plug(&plug); >> + ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, >> + &bio, flags); >> + if (ret == 0 && bio) { >> + ret = submit_bio_wait(bio); >> + bio_put(bio); >> + } >> + blk_finish_plug(&plug); >> + >> + return ret; >> +} >> + >> /** >> * blkdev_issue_zeroout - zero-fill a block range >> * @bdev: blockdev to write >> @@ -360,17 +382,12 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, >> sector_t nr_sects, gfp_t gfp_mask, unsigned flags) >> { >> int ret; >> - struct bio *bio = NULL; >> - struct blk_plug plug; >> >> - blk_start_plug(&plug); >> - ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, >> - &bio, flags); >> - if (ret == 0 && bio) { >> - ret = submit_bio_wait(bio); >> - bio_put(bio); >> - } >> - blk_finish_plug(&plug); >> + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects, >> + gfp_mask, flags); >> + if (ret == -EREMOTEIO) >> + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects, >> + gfp_mask, flags); >> >> return ret; >> } > > Ping... This fixes a regression introduced in 4.12. Christoph, > Martin, could you please take a look? Christoph, the regression got introduced with your "always use REQ_OP_WRITE_ZEROES for zeroing offload" series, could you please look at this? Thanks, Ilya
On Wed, Sep 06, 2017 at 07:38:10PM +0200, Ilya Dryomov wrote: > sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to > permit trying WRITE SAME on older SCSI devices, unless ->no_write_same > is set. This means blkdev_issue_zeroout() must cope with WRITE SAME > failing with BLK_STS_TARGET/-EREMOTEIO and explicitly write zeroes, > unless BLKDEV_ZERO_NOFALLBACK is specified. > > Commit 71027e97d796 ("block: stop using discards for zeroing") added > the following to __blkdev_issue_zeroout() comment: > > "Note that this function may fail with -EOPNOTSUPP if the driver signals > zeroing offload support, but the device fails to process the command (for > some devices there is no non-destructive way to verify whether this > operation is actually supported). In this case the caller should call > retry the call to blkdev_issue_zeroout() and the fallback path will be used." > > But __blkdev_issue_zeroout() doesn't fail in this case: if WRITE SAME > support is indicated, a REQ_OP_WRITE_ZEROES bio is built and returned > to blkdev_issue_zeroout(). -EREMOTEIO is then propagated up: > > $ fallocate -zn -l 1k /dev/sdg > fallocate: fallocate failed: Remote I/O error > $ fallocate -zn -l 1k /dev/sdg # OK > > (The second fallocate(1) succeeds because sd_done() sets ->no_write_same > in response to a sense that would become BLK_STS_TARGET.) > > Retry __blkdev_issue_zeroout() if the I/O fails with -EREMOTEIO. This > is roughly what we did until 4.12, sans BLKDEV_ZERO_NOFALLBACK knob. I'm really not sure we should check for -EREMOTEIO specifically, but Martin who is more familiar with the SCSI code might be able to correct me, I'd feel safer about checking for any error which is what the old code did. Except for that the patch looks fine to me.
On Tue, Sep 19, 2017 at 10:32 PM, Christoph Hellwig <hch@lst.de> wrote: > On Wed, Sep 06, 2017 at 07:38:10PM +0200, Ilya Dryomov wrote: >> sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to >> permit trying WRITE SAME on older SCSI devices, unless ->no_write_same >> is set. This means blkdev_issue_zeroout() must cope with WRITE SAME >> failing with BLK_STS_TARGET/-EREMOTEIO and explicitly write zeroes, >> unless BLKDEV_ZERO_NOFALLBACK is specified. >> >> Commit 71027e97d796 ("block: stop using discards for zeroing") added >> the following to __blkdev_issue_zeroout() comment: >> >> "Note that this function may fail with -EOPNOTSUPP if the driver signals >> zeroing offload support, but the device fails to process the command (for >> some devices there is no non-destructive way to verify whether this >> operation is actually supported). In this case the caller should call >> retry the call to blkdev_issue_zeroout() and the fallback path will be used." >> >> But __blkdev_issue_zeroout() doesn't fail in this case: if WRITE SAME >> support is indicated, a REQ_OP_WRITE_ZEROES bio is built and returned >> to blkdev_issue_zeroout(). -EREMOTEIO is then propagated up: >> >> $ fallocate -zn -l 1k /dev/sdg >> fallocate: fallocate failed: Remote I/O error >> $ fallocate -zn -l 1k /dev/sdg # OK >> >> (The second fallocate(1) succeeds because sd_done() sets ->no_write_same >> in response to a sense that would become BLK_STS_TARGET.) >> >> Retry __blkdev_issue_zeroout() if the I/O fails with -EREMOTEIO. This >> is roughly what we did until 4.12, sans BLKDEV_ZERO_NOFALLBACK knob. > > I'm really not sure we should check for -EREMOTEIO specifically, but > Martin who is more familiar with the SCSI code might be able to > correct me, I'd feel safer about checking for any error which is > what the old code did. Yeah, I pondered that. The old code did check for any error, but it wouldn't attempt another WRITE SAME after a failure. To do that here, we'd need to refactor __blkdev_issue_zeroout(); as it is, if we retry __blkdev_issue_zeroout() on a random error, it would happily go the REQ_OP_WRITE_ZEROES way again. Martin, what do you think? Thanks, Ilya
Christoph, > I'm really not sure we should check for -EREMOTEIO specifically, but > Martin who is more familiar with the SCSI code might be able to > correct me, I'd feel safer about checking for any error which is > what the old code did. > > Except for that the patch looks fine to me. We explicitly return -EREMOTEIO when the device reports ILLEGAL REQUEST. But I agree that we should fall back to manually zeroing for any error.
diff --git a/block/blk-lib.c b/block/blk-lib.c index 3fe0aec90597..876b5478c1a2 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -287,12 +287,6 @@ static unsigned int __blkdev_sectors_to_bio_pages(sector_t nr_sects) * Zero-fill a block range, either using hardware offload or by explicitly * writing zeroes to the device. * - * Note that this function may fail with -EOPNOTSUPP if the driver signals - * zeroing offload support, but the device fails to process the command (for - * some devices there is no non-destructive way to verify whether this - * operation is actually supported). In this case the caller should call - * retry the call to blkdev_issue_zeroout() and the fallback path will be used. - * * If a device is using logical block provisioning, the underlying space will * not be released if %flags contains BLKDEV_ZERO_NOUNMAP. * @@ -343,6 +337,34 @@ int __blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, } EXPORT_SYMBOL(__blkdev_issue_zeroout); +/* + * This function may fail with -EREMOTEIO if the driver signals zeroing + * offload support, but the device fails to process the command (for + * some devices there is no non-destructive way to verify whether this + * operation is actually supported). In this case the caller should + * retry and the fallback path in __blkdev_issue_zeroout() will be used + * unless %flags contains BLKDEV_ZERO_NOFALLBACK. + */ +static int __blkdev_issue_zeroout_wait(struct block_device *bdev, + sector_t sector, sector_t nr_sects, + gfp_t gfp_mask, unsigned flags) +{ + int ret; + struct bio *bio = NULL; + struct blk_plug plug; + + blk_start_plug(&plug); + ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, + &bio, flags); + if (ret == 0 && bio) { + ret = submit_bio_wait(bio); + bio_put(bio); + } + blk_finish_plug(&plug); + + return ret; +} + /** * blkdev_issue_zeroout - zero-fill a block range * @bdev: blockdev to write @@ -360,17 +382,12 @@ int blkdev_issue_zeroout(struct block_device *bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, unsigned flags) { int ret; - struct bio *bio = NULL; - struct blk_plug plug; - blk_start_plug(&plug); - ret = __blkdev_issue_zeroout(bdev, sector, nr_sects, gfp_mask, - &bio, flags); - if (ret == 0 && bio) { - ret = submit_bio_wait(bio); - bio_put(bio); - } - blk_finish_plug(&plug); + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects, + gfp_mask, flags); + if (ret == -EREMOTEIO) + ret = __blkdev_issue_zeroout_wait(bdev, sector, nr_sects, + gfp_mask, flags); return ret; }
sd_config_write_same() ignores ->max_ws_blocks == 0 and resets it to permit trying WRITE SAME on older SCSI devices, unless ->no_write_same is set. This means blkdev_issue_zeroout() must cope with WRITE SAME failing with BLK_STS_TARGET/-EREMOTEIO and explicitly write zeroes, unless BLKDEV_ZERO_NOFALLBACK is specified. Commit 71027e97d796 ("block: stop using discards for zeroing") added the following to __blkdev_issue_zeroout() comment: "Note that this function may fail with -EOPNOTSUPP if the driver signals zeroing offload support, but the device fails to process the command (for some devices there is no non-destructive way to verify whether this operation is actually supported). In this case the caller should call retry the call to blkdev_issue_zeroout() and the fallback path will be used." But __blkdev_issue_zeroout() doesn't fail in this case: if WRITE SAME support is indicated, a REQ_OP_WRITE_ZEROES bio is built and returned to blkdev_issue_zeroout(). -EREMOTEIO is then propagated up: $ fallocate -zn -l 1k /dev/sdg fallocate: fallocate failed: Remote I/O error $ fallocate -zn -l 1k /dev/sdg # OK (The second fallocate(1) succeeds because sd_done() sets ->no_write_same in response to a sense that would become BLK_STS_TARGET.) Retry __blkdev_issue_zeroout() if the I/O fails with -EREMOTEIO. This is roughly what we did until 4.12, sans BLKDEV_ZERO_NOFALLBACK knob. Fixes: c20cfc27a473 ("block: stop using blkdev_issue_write_same for zeroing") Cc: Christoph Hellwig <hch@lst.de> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Hannes Reinecke <hare@suse.com> Cc: stable@vger.kernel.org # 4.12+ Signed-off-by: Ilya Dryomov <idryomov@gmail.com> --- block/blk-lib.c | 49 +++++++++++++++++++++++++++++++++---------------- 1 file changed, 33 insertions(+), 16 deletions(-)