Message ID | 20200219063800.880834-1-damien.lemoal@wdc.com (mailing list archive) |
---|---|
State | Mainlined |
Commit | 51fdaa0490241e8cd41b40cbf43a336d1a014460 |
Headers | show |
Series | scsi: sd_sbc: Fix sd_zbc_report_zones() | expand |
On Wed, Feb 19, 2020 at 03:38:00PM +0900, Damien Le Moal wrote: > The block layer generic blk_revalidate_disk_zones() checks the validity > of zone descriptors reported by a disk using the > blk_revalidate_zone_cb() callback function executed for each zone > descriptor. If a ZBC disk reports invalid zone descriptors, > blk_revalidate_disk_zones() returns an error and sd_zbc_read_zones() > changes the disk capacity to 0, which in turn results in the gendisk > structure capacity to be set to 0. This all works well for the first > revalidate pass on a disk and the block layer detects the capactiy > change. > > On the second revalidate pass, blk_revalidate_disk_zones() is called > again and sd_zbc_report_zones() executed to check the zones a second > time. However, for this second pass, the gendisk capacity is now 0, > which results in sd_zbc_report_zones() to do nothing and to report > success and no zones. blk_revalidate_disk_zones() in turn returns > success and sets the disk queue chunk_sectors limit with zero as > no zones were checked, causing a oops to trigger on the > BUG_ON(!is_power_of_2(chunk_sectors)) in blk_queue_chunk_sectors(). > > Fix this by using the sdkp capacity field rather than the gendisk > capacity for the report zones loop in sd_zbc_report_zones(). Also add a > check to return immediately an error if the sdkp capacity is 0. > With this fix, invalid/buggy ZBC disk scan does not trigger a oops and > are exposed with a 0 capacity. This change also preserve the chance for > the disk to be correctly revalidated on the second revalidate pass as > the scsi disk structure capacity field is always set to the disk > reported value when sd_zbc_report_zones() is called. > > Fixes: d41003513e61 ("block: rework zone reporting") > Cc: Cc: <stable@vger.kernel.org> # v5.5 > Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de>
On 19/02/2020 07:38, Damien Le Moal wrote: > Fixes: d41003513e61 ("block: rework zone reporting") > Cc: Cc:<stable@vger.kernel.org> # v5.5 Double Cc: Otherwise looks good, Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Damien, > The block layer generic blk_revalidate_disk_zones() checks the > validity of zone descriptors reported by a disk using the > blk_revalidate_zone_cb() callback function executed for each zone > descriptor. If a ZBC disk reports invalid zone descriptors, > blk_revalidate_disk_zones() returns an error and sd_zbc_read_zones() > changes the disk capacity to 0, which in turn results in the gendisk > structure capacity to be set to 0. This all works well for the first > revalidate pass on a disk and the block layer detects the capactiy > change. Applied to 5.6/scsi-fixes, thanks!
diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index e4282bce5834..f45c22b09726 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -161,6 +161,7 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, unsigned int nr_zones, report_zones_cb cb, void *data) { struct scsi_disk *sdkp = scsi_disk(disk); + sector_t capacity = logical_to_sectors(sdkp->device, sdkp->capacity); unsigned int nr, i; unsigned char *buf; size_t offset, buflen = 0; @@ -171,11 +172,15 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, /* Not a zoned device */ return -EOPNOTSUPP; + if (!capacity) + /* Device gone or invalid */ + return -ENODEV; + buf = sd_zbc_alloc_report_buffer(sdkp, nr_zones, &buflen); if (!buf) return -ENOMEM; - while (zone_idx < nr_zones && sector < get_capacity(disk)) { + while (zone_idx < nr_zones && sector < capacity) { ret = sd_zbc_do_report_zones(sdkp, buf, buflen, sectors_to_logical(sdkp->device, sector), true); if (ret)
The block layer generic blk_revalidate_disk_zones() checks the validity of zone descriptors reported by a disk using the blk_revalidate_zone_cb() callback function executed for each zone descriptor. If a ZBC disk reports invalid zone descriptors, blk_revalidate_disk_zones() returns an error and sd_zbc_read_zones() changes the disk capacity to 0, which in turn results in the gendisk structure capacity to be set to 0. This all works well for the first revalidate pass on a disk and the block layer detects the capactiy change. On the second revalidate pass, blk_revalidate_disk_zones() is called again and sd_zbc_report_zones() executed to check the zones a second time. However, for this second pass, the gendisk capacity is now 0, which results in sd_zbc_report_zones() to do nothing and to report success and no zones. blk_revalidate_disk_zones() in turn returns success and sets the disk queue chunk_sectors limit with zero as no zones were checked, causing a oops to trigger on the BUG_ON(!is_power_of_2(chunk_sectors)) in blk_queue_chunk_sectors(). Fix this by using the sdkp capacity field rather than the gendisk capacity for the report zones loop in sd_zbc_report_zones(). Also add a check to return immediately an error if the sdkp capacity is 0. With this fix, invalid/buggy ZBC disk scan does not trigger a oops and are exposed with a 0 capacity. This change also preserve the chance for the disk to be correctly revalidated on the second revalidate pass as the scsi disk structure capacity field is always set to the disk reported value when sd_zbc_report_zones() is called. Fixes: d41003513e61 ("block: rework zone reporting") Cc: Cc: <stable@vger.kernel.org> # v5.5 Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> --- drivers/scsi/sd_zbc.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)