Message ID | 20220811143043.126029-14-p.raghav@samsung.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | support zoned block devices with non-power-of-2 zone sizes | expand |
On 2022/08/11 7:30, Pankaj Raghav wrote: > Only zoned devices with power-of-2(po2) number of sectors per zone(zone > size) were supported in linux but now non power-of-2(npo2) zone sizes > support has been added to the block layer. > > Filesystems such as F2FS and btrfs have support for zoned devices with > po2 zone size assumption. Before adding native support for npo2 zone > sizes, it was suggested to create a dm target for npo2 zone size device to > appear as a po2 size zoned target so that file systems can initially work > without any explicit changes by using this target. > > The design of this target is very simple: remap the device zone size to > the zone capacity and change the zone size to be the nearest power of 2 > number of sectors. > > For e.g., a device with a zone size/capacity of 3M will have an equivalent > target layout as follows: > > Device layout :- > zone capacity = 3M > zone size = 3M > > |--------------|-------------| > 0 3M 6M > > Target layout :- > zone capacity=3M > zone size = 4M > > |--------------|---|--------------|---| > 0 3M 4M 7M 8M > > The area between target's zone capacity and zone size will be emulated > in the target. > The read IOs that fall in the emulated gap area will return 0 filled > bio and all the other IOs in that area will result in an error. > If a read IO span across the emulated area boundary, then the IOs are > split across them. All other IO operations that span across the emulated > area boundary will result in an error. > > The target can be easily created as follows: > dmsetup create <label> --table '0 <size_sects> po2zone /dev/nvme<id>' > > Note that the target does not support partial mapping of the underlying > device. > > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> > Suggested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > Suggested-by: Damien Le Moal <damien.lemoal@wdc.com> > Suggested-by: Hannes Reinecke <hare@suse.de> > Reviewed-by: Joel Granados <j.granados@samsung.com> You changed this patch. So it needs review again. Please do not carry previous review tags in that case. > --- > .../admin-guide/device-mapper/dm-po2zone.rst | 71 +++++ > .../admin-guide/device-mapper/index.rst | 1 + > drivers/md/Kconfig | 10 + > drivers/md/Makefile | 2 + > drivers/md/dm-po2zone-target.c | 245 ++++++++++++++++++ > 5 files changed, 329 insertions(+) > create mode 100644 Documentation/admin-guide/device-mapper/dm-po2zone.rst > create mode 100644 drivers/md/dm-po2zone-target.c > > diff --git a/Documentation/admin-guide/device-mapper/dm-po2zone.rst b/Documentation/admin-guide/device-mapper/dm-po2zone.rst > new file mode 100644 > index 000000000000..19dc215fbcca > --- /dev/null > +++ b/Documentation/admin-guide/device-mapper/dm-po2zone.rst > @@ -0,0 +1,71 @@ > +========== > +dm-po2zone > +========== > +The dm-po2zone device mapper target exposes a zoned block device with a > +non-power-of-2(npo2) number of sectors per zone as a power-of-2(po2) > +number of sectors per zone(zone size). > +The filesystems that support zoned block devices such as F2FS and BTRFS > +assume po2 zone size as the kernel has traditionally only supported > +those devices. However, as the kernel now supports zoned block devices with > +npo2 zone sizes, the filesystems can run on top of the dm-po2zone target before > +adding native support. > + > +Partial mapping of the underlying device is not supported by this target. > + > +Algorithm > +========= > +The device mapper target maps the underlying device's zone size to the > +zone capacity and changes the zone size to the nearest po2 zone size. > +The gap between the zone capacity and the zone size is emulated in the target. > +E.g., a zoned block device with a zone size (and capacity) of 3M will have an > +equivalent target layout with mapping as follows: > + > +:: > + > + 0M 3M 4M 6M 8M > + | | | | | > + +x------------+--+x---------+--+x------- Target > + |x | |x | |x > + x x x > + x x x > + x x x > + x x x > + |x |x |x > + +x------------+x------------+x---------- Device > + | | | > + 0M 3M 6M > + > +A simple remap is performed for all the BIOs that do not cross the > +emulation gap area, i.e., the area between the zone capacity and size. > + > +If a BIO crosses the emulation gap area, the following operations are performed: > + > + Read: > + - If the BIO lies entirely in the emulation gap area, then zero out the BIO and complete it. > + - If the BIO spans the emulation gap area, split the BIO across the zone capacity boundary > + and remap only the BIO within the zone capacity boundary. The other part of the split BIO > + will be zeroed out. > + > + Other operations: > + - Return an error > + > +Table parameters > +================ > + > +:: > + > + <dev path> > + > +Mandatory parameters: > + > + <dev path>: > + Full pathname to the underlying block-device, or a > + "major:minor" device-number. > + > +Examples > +======== > + > +:: > + > + #!/bin/sh > + echo "0 `blockdev --getsz $1` po2zone $1" | dmsetup create po2z > diff --git a/Documentation/admin-guide/device-mapper/index.rst b/Documentation/admin-guide/device-mapper/index.rst > index cde52cc09645..1fd04b5b0565 100644 > --- a/Documentation/admin-guide/device-mapper/index.rst > +++ b/Documentation/admin-guide/device-mapper/index.rst > @@ -23,6 +23,7 @@ Device Mapper > dm-service-time > dm-uevent > dm-zoned > + dm-po2zone > era > kcopyd > linear > diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig > index 998a5cfdbc4e..638801b2449a 100644 > --- a/drivers/md/Kconfig > +++ b/drivers/md/Kconfig > @@ -518,6 +518,16 @@ config DM_FLAKEY > help > A target that intermittently fails I/O for debugging purposes. > > +config DM_PO2ZONE > + tristate "Zoned block devices target emulating a power-of-2 number of sectors per zone" > + depends on BLK_DEV_DM > + depends on BLK_DEV_ZONED > + help > + A target that converts a zoned block device with non-power-of-2(npo2) > + number of sectors per zone to be power-of-2(po2). Use this target for > + zoned block devices with npo2 number of sectors per zone until native > + support is added to the filesystems and applications. > + > config DM_VERITY > tristate "Verity target support" > depends on BLK_DEV_DM > diff --git a/drivers/md/Makefile b/drivers/md/Makefile > index 84291e38dca8..c23f81cc8789 100644 > --- a/drivers/md/Makefile > +++ b/drivers/md/Makefile > @@ -26,6 +26,7 @@ dm-era-y += dm-era-target.o > dm-clone-y += dm-clone-target.o dm-clone-metadata.o > dm-verity-y += dm-verity-target.o > dm-zoned-y += dm-zoned-target.o dm-zoned-metadata.o dm-zoned-reclaim.o > +dm-po2zone-y += dm-po2zone-target.o > > md-mod-y += md.o md-bitmap.o > raid456-y += raid5.o raid5-cache.o raid5-ppl.o > @@ -60,6 +61,7 @@ obj-$(CONFIG_DM_CRYPT) += dm-crypt.o > obj-$(CONFIG_DM_DELAY) += dm-delay.o > obj-$(CONFIG_DM_DUST) += dm-dust.o > obj-$(CONFIG_DM_FLAKEY) += dm-flakey.o > +obj-$(CONFIG_DM_PO2ZONE) += dm-po2zone.o > obj-$(CONFIG_DM_MULTIPATH) += dm-multipath.o dm-round-robin.o > obj-$(CONFIG_DM_MULTIPATH_QL) += dm-queue-length.o > obj-$(CONFIG_DM_MULTIPATH_ST) += dm-service-time.o > diff --git a/drivers/md/dm-po2zone-target.c b/drivers/md/dm-po2zone-target.c > new file mode 100644 > index 000000000000..d7610463245d > --- /dev/null > +++ b/drivers/md/dm-po2zone-target.c > @@ -0,0 +1,245 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2022 Samsung Electronics Co., Ltd. > + */ > + > +#include <linux/device-mapper.h> > + > +#define DM_MSG_PREFIX "po2zone" > + > +struct dm_po2z_target { > + struct dm_dev *dev; > + sector_t zone_size; /* Actual zone size of the underlying dev*/ > + sector_t zone_size_po2; /* zone_size rounded to the nearest po2 value */ > + unsigned int zone_size_po2_shift; > + sector_t zone_size_diff; /* diff between zone_size_po2 and zone_size */ > + unsigned int nr_zones; > +}; > + > +static inline unsigned int npo2_zone_no(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return div64_u64(sect, dmh->zone_size); > +} > + > +static inline unsigned int po2_zone_no(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect >> dmh->zone_size_po2_shift; > +} > + > +static inline sector_t target_to_device_sect(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect - (po2_zone_no(dmh, sect) * dmh->zone_size_diff); > +} > + > +static inline sector_t device_to_target_sect(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect + (npo2_zone_no(dmh, sect) * dmh->zone_size_diff); > +} > + > +/* > + * This target works on the complete zoned device. Partial mapping is not > + * supported. > + * Construct a zoned po2 logical device: <dev-path> > + */ > +static int dm_po2z_ctr(struct dm_target *ti, unsigned int argc, char **argv) > +{ > + struct dm_po2z_target *dmh = NULL; > + int ret; > + sector_t zone_size; > + sector_t dev_capacity; > + > + if (argc != 1) > + return -EINVAL; > + > + dmh = kmalloc(sizeof(*dmh), GFP_KERNEL); > + if (!dmh) > + return -ENOMEM; > + > + ret = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), > + &dmh->dev); > + if (ret) { > + ti->error = "Device lookup failed"; > + kfree(dmh); > + return ret; > + } > + > + if (!bdev_is_zoned(dmh->dev->bdev)) { > + DMERR("%pg is not a zoned device", dmh->dev->bdev); > + kfree(dmh); > + return -EINVAL; > + } > + > + zone_size = bdev_zone_sectors(dmh->dev->bdev); > + dev_capacity = get_capacity(dmh->dev->bdev->bd_disk); > + if (ti->len != dev_capacity || ti->begin) { > + DMERR("%pg Partial mapping of the target not supported", > + dmh->dev->bdev); > + kfree(dmh); > + return -EINVAL; > + } > + > + if (is_power_of_2(zone_size)) > + DMWARN("%pg: underlying device has a power-of-2 number of sectors per zone", > + dmh->dev->bdev); > + > + dmh->zone_size = zone_size; > + dmh->zone_size_po2 = 1 << get_count_order_long(zone_size); > + dmh->zone_size_po2_shift = ilog2(dmh->zone_size_po2); > + dmh->zone_size_diff = dmh->zone_size_po2 - dmh->zone_size; > + ti->private = dmh; > + ti->max_io_len = dmh->zone_size_po2; > + dmh->nr_zones = npo2_zone_no(dmh, ti->len); > + ti->len = dmh->zone_size_po2 * dmh->nr_zones; > + > + return 0; > +} > + > +static int dm_po2z_report_zones_cb(struct blk_zone *zone, unsigned int idx, > + void *data) > +{ > + struct dm_report_zones_args *args = data; > + struct dm_po2z_target *dmh = args->tgt->private; > + > + zone->start = device_to_target_sect(dmh, zone->start); > + zone->wp = device_to_target_sect(dmh, zone->wp); > + zone->len = dmh->zone_size_po2; > + args->next_sector = zone->start + zone->len; > + > + return args->orig_cb(zone, args->zone_idx++, args->orig_data); > +} > + > +static int dm_po2z_report_zones(struct dm_target *ti, > + struct dm_report_zones_args *args, > + unsigned int nr_zones) > +{ > + struct dm_po2z_target *dmh = ti->private; > + sector_t sect = po2_zone_no(dmh, args->next_sector) * dmh->zone_size; > + > + return blkdev_report_zones(dmh->dev->bdev, sect, nr_zones, > + dm_po2z_report_zones_cb, args); > +} > + > +static int dm_po2z_end_io(struct dm_target *ti, struct bio *bio, > + blk_status_t *error) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + if (bio->bi_status == BLK_STS_OK && bio_op(bio) == REQ_OP_ZONE_APPEND) > + bio->bi_iter.bi_sector = > + device_to_target_sect(dmh, bio->bi_iter.bi_sector); > + > + return DM_ENDIO_DONE; > +} > + > +static void dm_po2z_io_hints(struct dm_target *ti, struct queue_limits *limits) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + limits->chunk_sectors = dmh->zone_size_po2; > +} > + > +static bool bio_across_emulated_zone_area(struct dm_po2z_target *dmh, > + struct bio *bio) > +{ > + unsigned int zone_idx = po2_zone_no(dmh, bio->bi_iter.bi_sector); > + sector_t nr_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > + > + return (bio->bi_iter.bi_sector + nr_sectors) > > + (zone_idx << dmh->zone_size_po2_shift) + dmh->zone_size; > +} > + > +static int dm_po2z_map_read_emulated_area(struct dm_po2z_target *dmh, > + struct bio *bio) > +{ > + sector_t start_sect = bio->bi_iter.bi_sector; > + unsigned int zone_idx = po2_zone_no(dmh, start_sect); > + sector_t relative_sect_in_zone = start_sect - (zone_idx << dmh->zone_size_po2_shift); > + sector_t split_io_pos; > + > + /* > + * If the starting sector is in the emulated area then fill > + * all the bio with zeros. If bio is across emulated zone boundary, > + * split the bio across boundaries and fill zeros only for the > + * bio that is between the zone capacity and the zone size. > + */ > + if (relative_sect_in_zone < dmh->zone_size) { > + split_io_pos = (zone_idx * dmh->zone_size_po2) + dmh->zone_size; > + dm_accept_partial_bio(bio, split_io_pos - start_sect); > + bio->bi_iter.bi_sector = target_to_device_sect(dmh, start_sect); > + > + return DM_MAPIO_REMAPPED; > + } > + > + zero_fill_bio(bio); > + bio_endio(bio); > + return DM_MAPIO_SUBMITTED; > +} > + > +static int dm_po2z_map(struct dm_target *ti, struct bio *bio) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + bio_set_dev(bio, dmh->dev->bdev); > + if (bio_sectors(bio) || op_is_zone_mgmt(bio_op(bio))) { > + if (!bio_across_emulated_zone_area(dmh, bio)) { > + bio->bi_iter.bi_sector = target_to_device_sect(dmh, > + bio->bi_iter.bi_sector); > + return DM_MAPIO_REMAPPED; > + } > + /* > + * Read operation on the emulated zone area (between zone capacity > + * and zone size) will fill the bio with zeroes. Any other operation > + * in the emulated area should return an error. > + */ > + if (bio_op(bio) == REQ_OP_READ) > + return dm_po2z_map_read_emulated_area(dmh, bio); > + > + return DM_MAPIO_KILL; > + } > + return DM_MAPIO_REMAPPED; > +} > + > +static int dm_po2z_iterate_devices(struct dm_target *ti, > + iterate_devices_callout_fn fn, void *data) > +{ > + struct dm_po2z_target *dmh = ti->private; > + sector_t len = dmh->nr_zones * dmh->zone_size; > + > + return fn(ti, dmh->dev, 0, len, data); > +} > + > +static struct target_type dm_po2z_target = { > + .name = "po2zone", > + .version = { 1, 0, 0 }, > + .features = DM_TARGET_ZONED_HM | DM_TARGET_EMULATED_ZONES, > + .map = dm_po2z_map, > + .end_io = dm_po2z_end_io, > + .report_zones = dm_po2z_report_zones, > + .iterate_devices = dm_po2z_iterate_devices, > + .module = THIS_MODULE, > + .io_hints = dm_po2z_io_hints, > + .ctr = dm_po2z_ctr, > +}; > + > +static int __init dm_po2z_init(void) > +{ > + return dm_register_target(&dm_po2z_target); > +} > + > +static void __exit dm_po2z_exit(void) > +{ > + dm_unregister_target(&dm_po2z_target); > +} > + > +/* Module hooks */ > +module_init(dm_po2z_init); > +module_exit(dm_po2z_exit); > + > +MODULE_DESCRIPTION(DM_NAME "power-of-2 zoned target"); > +MODULE_AUTHOR("Pankaj Raghav <p.raghav@samsung.com>"); > +MODULE_LICENSE("GPL"); > +
Hi Damien, >> |--------------|---|--------------|---| >> 0 3M 4M 7M 8M >> >> The area between target's zone capacity and zone size will be emulated >> in the target. >> The read IOs that fall in the emulated gap area will return 0 filled >> bio and all the other IOs in that area will result in an error. >> If a read IO span across the emulated area boundary, then the IOs are >> split across them. All other IO operations that span across the emulated >> area boundary will result in an error. >> >> The target can be easily created as follows: >> dmsetup create <label> --table '0 <size_sects> po2zone /dev/nvme<id>' >> >> Note that the target does not support partial mapping of the underlying >> device. >> >> Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> >> Suggested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> >> Suggested-by: Damien Le Moal <damien.lemoal@wdc.com> >> Suggested-by: Hannes Reinecke <hare@suse.de> >> Reviewed-by: Joel Granados <j.granados@samsung.com> > > You changed this patch. So it needs review again. Please do not carry previous > review tags in that case. > The changes were non-functional such as changing multiplication to shifts and rephrasing error messages. While I agree with your comment not to carry forward review tags if the patch has changed, I thought it might be too much to ask a reviewer to re-review for non-functional changes such as these. But I will request Joel to do a quick re-review. Thanks.
On Thu, Aug 11, 2022 at 04:30:43PM +0200, Pankaj Raghav wrote: > Only zoned devices with power-of-2(po2) number of sectors per zone(zone > size) were supported in linux but now non power-of-2(npo2) zone sizes > support has been added to the block layer. LGTM. I believe the commit message to be more clear like this. I also looked at the other changes and it seems that they have addressed the previous comments Reviewed-by: Joel Granados <j.granados@samsung.com> > > Filesystems such as F2FS and btrfs have support for zoned devices with > po2 zone size assumption. Before adding native support for npo2 zone > sizes, it was suggested to create a dm target for npo2 zone size device to > appear as a po2 size zoned target so that file systems can initially work > without any explicit changes by using this target. > > The design of this target is very simple: remap the device zone size to > the zone capacity and change the zone size to be the nearest power of 2 > number of sectors. > > For e.g., a device with a zone size/capacity of 3M will have an equivalent > target layout as follows: > > Device layout :- > zone capacity = 3M > zone size = 3M > > |--------------|-------------| > 0 3M 6M > > Target layout :- > zone capacity=3M > zone size = 4M > > |--------------|---|--------------|---| > 0 3M 4M 7M 8M > > The area between target's zone capacity and zone size will be emulated > in the target. > The read IOs that fall in the emulated gap area will return 0 filled > bio and all the other IOs in that area will result in an error. > If a read IO span across the emulated area boundary, then the IOs are > split across them. All other IO operations that span across the emulated > area boundary will result in an error. > > The target can be easily created as follows: > dmsetup create <label> --table '0 <size_sects> po2zone /dev/nvme<id>' > > Note that the target does not support partial mapping of the underlying > device. > > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> > Suggested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > Suggested-by: Damien Le Moal <damien.lemoal@wdc.com> > Suggested-by: Hannes Reinecke <hare@suse.de> > Reviewed-by: Joel Granados <j.granados@samsung.com> > --- > .../admin-guide/device-mapper/dm-po2zone.rst | 71 +++++ > .../admin-guide/device-mapper/index.rst | 1 + > drivers/md/Kconfig | 10 + > drivers/md/Makefile | 2 + > drivers/md/dm-po2zone-target.c | 245 ++++++++++++++++++ > 5 files changed, 329 insertions(+) > create mode 100644 Documentation/admin-guide/device-mapper/dm-po2zone.rst > create mode 100644 drivers/md/dm-po2zone-target.c > > diff --git a/Documentation/admin-guide/device-mapper/dm-po2zone.rst b/Documentation/admin-guide/device-mapper/dm-po2zone.rst > new file mode 100644 > index 000000000000..19dc215fbcca > --- /dev/null > +++ b/Documentation/admin-guide/device-mapper/dm-po2zone.rst > @@ -0,0 +1,71 @@ > +========== > +dm-po2zone > +========== > +The dm-po2zone device mapper target exposes a zoned block device with a > +non-power-of-2(npo2) number of sectors per zone as a power-of-2(po2) > +number of sectors per zone(zone size). > +The filesystems that support zoned block devices such as F2FS and BTRFS > +assume po2 zone size as the kernel has traditionally only supported > +those devices. However, as the kernel now supports zoned block devices with > +npo2 zone sizes, the filesystems can run on top of the dm-po2zone target before > +adding native support. > + > +Partial mapping of the underlying device is not supported by this target. > + > +Algorithm > +========= > +The device mapper target maps the underlying device's zone size to the > +zone capacity and changes the zone size to the nearest po2 zone size. > +The gap between the zone capacity and the zone size is emulated in the target. > +E.g., a zoned block device with a zone size (and capacity) of 3M will have an > +equivalent target layout with mapping as follows: > + > +:: > + > + 0M 3M 4M 6M 8M > + | | | | | > + +x------------+--+x---------+--+x------- Target > + |x | |x | |x > + x x x > + x x x > + x x x > + x x x > + |x |x |x > + +x------------+x------------+x---------- Device > + | | | > + 0M 3M 6M > + > +A simple remap is performed for all the BIOs that do not cross the > +emulation gap area, i.e., the area between the zone capacity and size. > + > +If a BIO crosses the emulation gap area, the following operations are performed: > + > + Read: > + - If the BIO lies entirely in the emulation gap area, then zero out the BIO and complete it. > + - If the BIO spans the emulation gap area, split the BIO across the zone capacity boundary > + and remap only the BIO within the zone capacity boundary. The other part of the split BIO > + will be zeroed out. > + > + Other operations: > + - Return an error > + > +Table parameters > +================ > + > +:: > + > + <dev path> > + > +Mandatory parameters: > + > + <dev path>: > + Full pathname to the underlying block-device, or a > + "major:minor" device-number. > + > +Examples > +======== > + > +:: > + > + #!/bin/sh > + echo "0 `blockdev --getsz $1` po2zone $1" | dmsetup create po2z > diff --git a/Documentation/admin-guide/device-mapper/index.rst b/Documentation/admin-guide/device-mapper/index.rst > index cde52cc09645..1fd04b5b0565 100644 > --- a/Documentation/admin-guide/device-mapper/index.rst > +++ b/Documentation/admin-guide/device-mapper/index.rst > @@ -23,6 +23,7 @@ Device Mapper > dm-service-time > dm-uevent > dm-zoned > + dm-po2zone > era > kcopyd > linear > diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig > index 998a5cfdbc4e..638801b2449a 100644 > --- a/drivers/md/Kconfig > +++ b/drivers/md/Kconfig > @@ -518,6 +518,16 @@ config DM_FLAKEY > help > A target that intermittently fails I/O for debugging purposes. > > +config DM_PO2ZONE > + tristate "Zoned block devices target emulating a power-of-2 number of sectors per zone" > + depends on BLK_DEV_DM > + depends on BLK_DEV_ZONED > + help > + A target that converts a zoned block device with non-power-of-2(npo2) > + number of sectors per zone to be power-of-2(po2). Use this target for > + zoned block devices with npo2 number of sectors per zone until native > + support is added to the filesystems and applications. > + > config DM_VERITY > tristate "Verity target support" > depends on BLK_DEV_DM > diff --git a/drivers/md/Makefile b/drivers/md/Makefile > index 84291e38dca8..c23f81cc8789 100644 > --- a/drivers/md/Makefile > +++ b/drivers/md/Makefile > @@ -26,6 +26,7 @@ dm-era-y += dm-era-target.o > dm-clone-y += dm-clone-target.o dm-clone-metadata.o > dm-verity-y += dm-verity-target.o > dm-zoned-y += dm-zoned-target.o dm-zoned-metadata.o dm-zoned-reclaim.o > +dm-po2zone-y += dm-po2zone-target.o > > md-mod-y += md.o md-bitmap.o > raid456-y += raid5.o raid5-cache.o raid5-ppl.o > @@ -60,6 +61,7 @@ obj-$(CONFIG_DM_CRYPT) += dm-crypt.o > obj-$(CONFIG_DM_DELAY) += dm-delay.o > obj-$(CONFIG_DM_DUST) += dm-dust.o > obj-$(CONFIG_DM_FLAKEY) += dm-flakey.o > +obj-$(CONFIG_DM_PO2ZONE) += dm-po2zone.o > obj-$(CONFIG_DM_MULTIPATH) += dm-multipath.o dm-round-robin.o > obj-$(CONFIG_DM_MULTIPATH_QL) += dm-queue-length.o > obj-$(CONFIG_DM_MULTIPATH_ST) += dm-service-time.o > diff --git a/drivers/md/dm-po2zone-target.c b/drivers/md/dm-po2zone-target.c > new file mode 100644 > index 000000000000..d7610463245d > --- /dev/null > +++ b/drivers/md/dm-po2zone-target.c > @@ -0,0 +1,245 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2022 Samsung Electronics Co., Ltd. > + */ > + > +#include <linux/device-mapper.h> > + > +#define DM_MSG_PREFIX "po2zone" > + > +struct dm_po2z_target { > + struct dm_dev *dev; > + sector_t zone_size; /* Actual zone size of the underlying dev*/ > + sector_t zone_size_po2; /* zone_size rounded to the nearest po2 value */ > + unsigned int zone_size_po2_shift; > + sector_t zone_size_diff; /* diff between zone_size_po2 and zone_size */ > + unsigned int nr_zones; > +}; > + > +static inline unsigned int npo2_zone_no(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return div64_u64(sect, dmh->zone_size); > +} > + > +static inline unsigned int po2_zone_no(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect >> dmh->zone_size_po2_shift; > +} > + > +static inline sector_t target_to_device_sect(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect - (po2_zone_no(dmh, sect) * dmh->zone_size_diff); > +} > + > +static inline sector_t device_to_target_sect(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect + (npo2_zone_no(dmh, sect) * dmh->zone_size_diff); > +} > + > +/* > + * This target works on the complete zoned device. Partial mapping is not > + * supported. > + * Construct a zoned po2 logical device: <dev-path> > + */ > +static int dm_po2z_ctr(struct dm_target *ti, unsigned int argc, char **argv) > +{ > + struct dm_po2z_target *dmh = NULL; > + int ret; > + sector_t zone_size; > + sector_t dev_capacity; > + > + if (argc != 1) > + return -EINVAL; > + > + dmh = kmalloc(sizeof(*dmh), GFP_KERNEL); > + if (!dmh) > + return -ENOMEM; > + > + ret = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), > + &dmh->dev); > + if (ret) { > + ti->error = "Device lookup failed"; > + kfree(dmh); > + return ret; > + } > + > + if (!bdev_is_zoned(dmh->dev->bdev)) { > + DMERR("%pg is not a zoned device", dmh->dev->bdev); > + kfree(dmh); > + return -EINVAL; > + } > + > + zone_size = bdev_zone_sectors(dmh->dev->bdev); > + dev_capacity = get_capacity(dmh->dev->bdev->bd_disk); > + if (ti->len != dev_capacity || ti->begin) { > + DMERR("%pg Partial mapping of the target not supported", > + dmh->dev->bdev); > + kfree(dmh); > + return -EINVAL; > + } > + > + if (is_power_of_2(zone_size)) > + DMWARN("%pg: underlying device has a power-of-2 number of sectors per zone", > + dmh->dev->bdev); > + > + dmh->zone_size = zone_size; > + dmh->zone_size_po2 = 1 << get_count_order_long(zone_size); > + dmh->zone_size_po2_shift = ilog2(dmh->zone_size_po2); > + dmh->zone_size_diff = dmh->zone_size_po2 - dmh->zone_size; > + ti->private = dmh; > + ti->max_io_len = dmh->zone_size_po2; > + dmh->nr_zones = npo2_zone_no(dmh, ti->len); > + ti->len = dmh->zone_size_po2 * dmh->nr_zones; > + > + return 0; > +} > + > +static int dm_po2z_report_zones_cb(struct blk_zone *zone, unsigned int idx, > + void *data) > +{ > + struct dm_report_zones_args *args = data; > + struct dm_po2z_target *dmh = args->tgt->private; > + > + zone->start = device_to_target_sect(dmh, zone->start); > + zone->wp = device_to_target_sect(dmh, zone->wp); > + zone->len = dmh->zone_size_po2; > + args->next_sector = zone->start + zone->len; > + > + return args->orig_cb(zone, args->zone_idx++, args->orig_data); > +} > + > +static int dm_po2z_report_zones(struct dm_target *ti, > + struct dm_report_zones_args *args, > + unsigned int nr_zones) > +{ > + struct dm_po2z_target *dmh = ti->private; > + sector_t sect = po2_zone_no(dmh, args->next_sector) * dmh->zone_size; > + > + return blkdev_report_zones(dmh->dev->bdev, sect, nr_zones, > + dm_po2z_report_zones_cb, args); > +} > + > +static int dm_po2z_end_io(struct dm_target *ti, struct bio *bio, > + blk_status_t *error) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + if (bio->bi_status == BLK_STS_OK && bio_op(bio) == REQ_OP_ZONE_APPEND) > + bio->bi_iter.bi_sector = > + device_to_target_sect(dmh, bio->bi_iter.bi_sector); > + > + return DM_ENDIO_DONE; > +} > + > +static void dm_po2z_io_hints(struct dm_target *ti, struct queue_limits *limits) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + limits->chunk_sectors = dmh->zone_size_po2; > +} > + > +static bool bio_across_emulated_zone_area(struct dm_po2z_target *dmh, > + struct bio *bio) > +{ > + unsigned int zone_idx = po2_zone_no(dmh, bio->bi_iter.bi_sector); > + sector_t nr_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > + > + return (bio->bi_iter.bi_sector + nr_sectors) > > + (zone_idx << dmh->zone_size_po2_shift) + dmh->zone_size; > +} > + > +static int dm_po2z_map_read_emulated_area(struct dm_po2z_target *dmh, > + struct bio *bio) > +{ > + sector_t start_sect = bio->bi_iter.bi_sector; > + unsigned int zone_idx = po2_zone_no(dmh, start_sect); > + sector_t relative_sect_in_zone = start_sect - (zone_idx << dmh->zone_size_po2_shift); > + sector_t split_io_pos; > + > + /* > + * If the starting sector is in the emulated area then fill > + * all the bio with zeros. If bio is across emulated zone boundary, > + * split the bio across boundaries and fill zeros only for the > + * bio that is between the zone capacity and the zone size. > + */ > + if (relative_sect_in_zone < dmh->zone_size) { > + split_io_pos = (zone_idx * dmh->zone_size_po2) + dmh->zone_size; > + dm_accept_partial_bio(bio, split_io_pos - start_sect); > + bio->bi_iter.bi_sector = target_to_device_sect(dmh, start_sect); > + > + return DM_MAPIO_REMAPPED; > + } > + > + zero_fill_bio(bio); > + bio_endio(bio); > + return DM_MAPIO_SUBMITTED; > +} > + > +static int dm_po2z_map(struct dm_target *ti, struct bio *bio) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + bio_set_dev(bio, dmh->dev->bdev); > + if (bio_sectors(bio) || op_is_zone_mgmt(bio_op(bio))) { > + if (!bio_across_emulated_zone_area(dmh, bio)) { > + bio->bi_iter.bi_sector = target_to_device_sect(dmh, > + bio->bi_iter.bi_sector); > + return DM_MAPIO_REMAPPED; > + } > + /* > + * Read operation on the emulated zone area (between zone capacity > + * and zone size) will fill the bio with zeroes. Any other operation > + * in the emulated area should return an error. > + */ > + if (bio_op(bio) == REQ_OP_READ) > + return dm_po2z_map_read_emulated_area(dmh, bio); > + > + return DM_MAPIO_KILL; > + } > + return DM_MAPIO_REMAPPED; > +} > + > +static int dm_po2z_iterate_devices(struct dm_target *ti, > + iterate_devices_callout_fn fn, void *data) > +{ > + struct dm_po2z_target *dmh = ti->private; > + sector_t len = dmh->nr_zones * dmh->zone_size; > + > + return fn(ti, dmh->dev, 0, len, data); > +} > + > +static struct target_type dm_po2z_target = { > + .name = "po2zone", > + .version = { 1, 0, 0 }, > + .features = DM_TARGET_ZONED_HM | DM_TARGET_EMULATED_ZONES, > + .map = dm_po2z_map, > + .end_io = dm_po2z_end_io, > + .report_zones = dm_po2z_report_zones, > + .iterate_devices = dm_po2z_iterate_devices, > + .module = THIS_MODULE, > + .io_hints = dm_po2z_io_hints, > + .ctr = dm_po2z_ctr, > +}; > + > +static int __init dm_po2z_init(void) > +{ > + return dm_register_target(&dm_po2z_target); > +} > + > +static void __exit dm_po2z_exit(void) > +{ > + dm_unregister_target(&dm_po2z_target); > +} > + > +/* Module hooks */ > +module_init(dm_po2z_init); > +module_exit(dm_po2z_exit); > + > +MODULE_DESCRIPTION(DM_NAME "power-of-2 zoned target"); > +MODULE_AUTHOR("Pankaj Raghav <p.raghav@samsung.com>"); > +MODULE_LICENSE("GPL"); > + > -- > 2.25.1 >
On 2022/08/11 7:30, Pankaj Raghav wrote: > Only zoned devices with power-of-2(po2) number of sectors per zone(zone > size) were supported in linux but now non power-of-2(npo2) zone sizes > support has been added to the block layer. > > Filesystems such as F2FS and btrfs have support for zoned devices with > po2 zone size assumption. Before adding native support for npo2 zone > sizes, it was suggested to create a dm target for npo2 zone size device to > appear as a po2 size zoned target so that file systems can initially work > without any explicit changes by using this target. > > The design of this target is very simple: remap the device zone size to > the zone capacity and change the zone size to be the nearest power of 2 > number of sectors. > > For e.g., a device with a zone size/capacity of 3M will have an equivalent > target layout as follows: > > Device layout :- > zone capacity = 3M > zone size = 3M > > |--------------|-------------| > 0 3M 6M > > Target layout :- > zone capacity=3M > zone size = 4M > > |--------------|---|--------------|---| > 0 3M 4M 7M 8M > > The area between target's zone capacity and zone size will be emulated > in the target. > The read IOs that fall in the emulated gap area will return 0 filled > bio and all the other IOs in that area will result in an error. > If a read IO span across the emulated area boundary, then the IOs are > split across them. All other IO operations that span across the emulated > area boundary will result in an error. > > The target can be easily created as follows: > dmsetup create <label> --table '0 <size_sects> po2zone /dev/nvme<id>' > > Note that the target does not support partial mapping of the underlying > device. > > Signed-off-by: Pankaj Raghav <p.raghav@samsung.com> > Suggested-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> > Suggested-by: Damien Le Moal <damien.lemoal@wdc.com> > Suggested-by: Hannes Reinecke <hare@suse.de> > Reviewed-by: Joel Granados <j.granados@samsung.com> > --- > .../admin-guide/device-mapper/dm-po2zone.rst | 71 +++++ > .../admin-guide/device-mapper/index.rst | 1 + > drivers/md/Kconfig | 10 + > drivers/md/Makefile | 2 + > drivers/md/dm-po2zone-target.c | 245 ++++++++++++++++++ > 5 files changed, 329 insertions(+) > create mode 100644 Documentation/admin-guide/device-mapper/dm-po2zone.rst > create mode 100644 drivers/md/dm-po2zone-target.c > > diff --git a/Documentation/admin-guide/device-mapper/dm-po2zone.rst b/Documentation/admin-guide/device-mapper/dm-po2zone.rst > new file mode 100644 > index 000000000000..19dc215fbcca > --- /dev/null > +++ b/Documentation/admin-guide/device-mapper/dm-po2zone.rst > @@ -0,0 +1,71 @@ > +========== > +dm-po2zone > +========== > +The dm-po2zone device mapper target exposes a zoned block device with a > +non-power-of-2(npo2) number of sectors per zone as a power-of-2(po2) > +number of sectors per zone(zone size). > +The filesystems that support zoned block devices such as F2FS and BTRFS > +assume po2 zone size as the kernel has traditionally only supported > +those devices. However, as the kernel now supports zoned block devices with > +npo2 zone sizes, the filesystems can run on top of the dm-po2zone target before > +adding native support. > + > +Partial mapping of the underlying device is not supported by this target. > + > +Algorithm > +========= > +The device mapper target maps the underlying device's zone size to the > +zone capacity and changes the zone size to the nearest po2 zone size. > +The gap between the zone capacity and the zone size is emulated in the target. > +E.g., a zoned block device with a zone size (and capacity) of 3M will have an > +equivalent target layout with mapping as follows: > + > +:: > + > + 0M 3M 4M 6M 8M > + | | | | | > + +x------------+--+x---------+--+x------- Target > + |x | |x | |x > + x x x > + x x x > + x x x > + x x x > + |x |x |x > + +x------------+x------------+x---------- Device > + | | | > + 0M 3M 6M > + > +A simple remap is performed for all the BIOs that do not cross the > +emulation gap area, i.e., the area between the zone capacity and size. > + > +If a BIO crosses the emulation gap area, the following operations are performed: > + > + Read: > + - If the BIO lies entirely in the emulation gap area, then zero out the BIO and complete it. > + - If the BIO spans the emulation gap area, split the BIO across the zone capacity boundary > + and remap only the BIO within the zone capacity boundary. The other part of the split BIO > + will be zeroed out. > + > + Other operations: > + - Return an error > + > +Table parameters > +================ > + > +:: > + > + <dev path> > + > +Mandatory parameters: > + > + <dev path>: > + Full pathname to the underlying block-device, or a > + "major:minor" device-number. > + > +Examples > +======== > + > +:: > + > + #!/bin/sh > + echo "0 `blockdev --getsz $1` po2zone $1" | dmsetup create po2z > diff --git a/Documentation/admin-guide/device-mapper/index.rst b/Documentation/admin-guide/device-mapper/index.rst > index cde52cc09645..1fd04b5b0565 100644 > --- a/Documentation/admin-guide/device-mapper/index.rst > +++ b/Documentation/admin-guide/device-mapper/index.rst > @@ -23,6 +23,7 @@ Device Mapper > dm-service-time > dm-uevent > dm-zoned > + dm-po2zone > era > kcopyd > linear > diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig > index 998a5cfdbc4e..638801b2449a 100644 > --- a/drivers/md/Kconfig > +++ b/drivers/md/Kconfig > @@ -518,6 +518,16 @@ config DM_FLAKEY > help > A target that intermittently fails I/O for debugging purposes. > > +config DM_PO2ZONE > + tristate "Zoned block devices target emulating a power-of-2 number of sectors per zone" > + depends on BLK_DEV_DM > + depends on BLK_DEV_ZONED > + help > + A target that converts a zoned block device with non-power-of-2(npo2) > + number of sectors per zone to be power-of-2(po2). Use this target for > + zoned block devices with npo2 number of sectors per zone until native > + support is added to the filesystems and applications. > + > config DM_VERITY > tristate "Verity target support" > depends on BLK_DEV_DM > diff --git a/drivers/md/Makefile b/drivers/md/Makefile > index 84291e38dca8..c23f81cc8789 100644 > --- a/drivers/md/Makefile > +++ b/drivers/md/Makefile > @@ -26,6 +26,7 @@ dm-era-y += dm-era-target.o > dm-clone-y += dm-clone-target.o dm-clone-metadata.o > dm-verity-y += dm-verity-target.o > dm-zoned-y += dm-zoned-target.o dm-zoned-metadata.o dm-zoned-reclaim.o > +dm-po2zone-y += dm-po2zone-target.o > > md-mod-y += md.o md-bitmap.o > raid456-y += raid5.o raid5-cache.o raid5-ppl.o > @@ -60,6 +61,7 @@ obj-$(CONFIG_DM_CRYPT) += dm-crypt.o > obj-$(CONFIG_DM_DELAY) += dm-delay.o > obj-$(CONFIG_DM_DUST) += dm-dust.o > obj-$(CONFIG_DM_FLAKEY) += dm-flakey.o > +obj-$(CONFIG_DM_PO2ZONE) += dm-po2zone.o > obj-$(CONFIG_DM_MULTIPATH) += dm-multipath.o dm-round-robin.o > obj-$(CONFIG_DM_MULTIPATH_QL) += dm-queue-length.o > obj-$(CONFIG_DM_MULTIPATH_ST) += dm-service-time.o > diff --git a/drivers/md/dm-po2zone-target.c b/drivers/md/dm-po2zone-target.c > new file mode 100644 > index 000000000000..d7610463245d > --- /dev/null > +++ b/drivers/md/dm-po2zone-target.c > @@ -0,0 +1,245 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (C) 2022 Samsung Electronics Co., Ltd. > + */ > + > +#include <linux/device-mapper.h> > + > +#define DM_MSG_PREFIX "po2zone" > + > +struct dm_po2z_target { > + struct dm_dev *dev; > + sector_t zone_size; /* Actual zone size of the underlying dev*/ > + sector_t zone_size_po2; /* zone_size rounded to the nearest po2 value */ > + unsigned int zone_size_po2_shift; > + sector_t zone_size_diff; /* diff between zone_size_po2 and zone_size */ > + unsigned int nr_zones; > +}; > + > +static inline unsigned int npo2_zone_no(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return div64_u64(sect, dmh->zone_size); > +} > + > +static inline unsigned int po2_zone_no(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect >> dmh->zone_size_po2_shift; > +} > + > +static inline sector_t target_to_device_sect(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect - (po2_zone_no(dmh, sect) * dmh->zone_size_diff); > +} > + > +static inline sector_t device_to_target_sect(struct dm_po2z_target *dmh, > + sector_t sect) > +{ > + return sect + (npo2_zone_no(dmh, sect) * dmh->zone_size_diff); > +} > + > +/* > + * This target works on the complete zoned device. Partial mapping is not > + * supported. > + * Construct a zoned po2 logical device: <dev-path> > + */ > +static int dm_po2z_ctr(struct dm_target *ti, unsigned int argc, char **argv) > +{ > + struct dm_po2z_target *dmh = NULL; > + int ret; > + sector_t zone_size; > + sector_t dev_capacity; > + > + if (argc != 1) > + return -EINVAL; > + > + dmh = kmalloc(sizeof(*dmh), GFP_KERNEL); > + if (!dmh) > + return -ENOMEM; > + > + ret = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), > + &dmh->dev); > + if (ret) { > + ti->error = "Device lookup failed"; > + kfree(dmh); > + return ret; > + } > + > + if (!bdev_is_zoned(dmh->dev->bdev)) { > + DMERR("%pg is not a zoned device", dmh->dev->bdev); > + kfree(dmh); > + return -EINVAL; > + } > + > + zone_size = bdev_zone_sectors(dmh->dev->bdev); > + dev_capacity = get_capacity(dmh->dev->bdev->bd_disk); > + if (ti->len != dev_capacity || ti->begin) { > + DMERR("%pg Partial mapping of the target not supported", > + dmh->dev->bdev); > + kfree(dmh); > + return -EINVAL; > + } > + > + if (is_power_of_2(zone_size)) > + DMWARN("%pg: underlying device has a power-of-2 number of sectors per zone", > + dmh->dev->bdev); > + > + dmh->zone_size = zone_size; > + dmh->zone_size_po2 = 1 << get_count_order_long(zone_size); > + dmh->zone_size_po2_shift = ilog2(dmh->zone_size_po2); > + dmh->zone_size_diff = dmh->zone_size_po2 - dmh->zone_size; > + ti->private = dmh; > + ti->max_io_len = dmh->zone_size_po2; > + dmh->nr_zones = npo2_zone_no(dmh, ti->len); > + ti->len = dmh->zone_size_po2 * dmh->nr_zones; > + > + return 0; > +} > + > +static int dm_po2z_report_zones_cb(struct blk_zone *zone, unsigned int idx, > + void *data) > +{ > + struct dm_report_zones_args *args = data; > + struct dm_po2z_target *dmh = args->tgt->private; > + > + zone->start = device_to_target_sect(dmh, zone->start); > + zone->wp = device_to_target_sect(dmh, zone->wp); > + zone->len = dmh->zone_size_po2; > + args->next_sector = zone->start + zone->len; > + > + return args->orig_cb(zone, args->zone_idx++, args->orig_data); > +} > + > +static int dm_po2z_report_zones(struct dm_target *ti, > + struct dm_report_zones_args *args, > + unsigned int nr_zones) > +{ > + struct dm_po2z_target *dmh = ti->private; > + sector_t sect = po2_zone_no(dmh, args->next_sector) * dmh->zone_size; > + > + return blkdev_report_zones(dmh->dev->bdev, sect, nr_zones, > + dm_po2z_report_zones_cb, args); > +} > + > +static int dm_po2z_end_io(struct dm_target *ti, struct bio *bio, > + blk_status_t *error) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + if (bio->bi_status == BLK_STS_OK && bio_op(bio) == REQ_OP_ZONE_APPEND) > + bio->bi_iter.bi_sector = > + device_to_target_sect(dmh, bio->bi_iter.bi_sector); > + > + return DM_ENDIO_DONE; > +} > + > +static void dm_po2z_io_hints(struct dm_target *ti, struct queue_limits *limits) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + limits->chunk_sectors = dmh->zone_size_po2; > +} > + > +static bool bio_across_emulated_zone_area(struct dm_po2z_target *dmh, > + struct bio *bio) > +{ > + unsigned int zone_idx = po2_zone_no(dmh, bio->bi_iter.bi_sector); > + sector_t nr_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > + > + return (bio->bi_iter.bi_sector + nr_sectors) > > + (zone_idx << dmh->zone_size_po2_shift) + dmh->zone_size; > +} > + > +static int dm_po2z_map_read_emulated_area(struct dm_po2z_target *dmh, > + struct bio *bio) This really should be called dm_po2z_handle_read() since it handles both cases of read commands, with and without the accept partial. Note that this is retesting the need for a split after that was already tested in dm_po2z_map() with bio_across_emulated_zone_area(). So the code should be better organized to avoid that repetition. You can simplify the code by having bio_across_emulated_zone_area() return 0 for a bio that does not cross the zone capacity and return the number of sectors up to the zone capacity from the bio start if there is a cross. That value can then be used to call dm_accept_partial_bio() directly in dm_po2z_map() for read commands. That will make this function much simpler and essentially turn it into dm_po2z_read_zeroes(). > +{ > + sector_t start_sect = bio->bi_iter.bi_sector; > + unsigned int zone_idx = po2_zone_no(dmh, start_sect); > + sector_t relative_sect_in_zone = start_sect - (zone_idx << dmh->zone_size_po2_shift); Long line. Please split it after the "=". > + sector_t split_io_pos; > + > + /* > + * If the starting sector is in the emulated area then fill > + * all the bio with zeros. If bio is across emulated zone boundary, > + * split the bio across boundaries and fill zeros only for the > + * bio that is between the zone capacity and the zone size. > + */ > + if (relative_sect_in_zone < dmh->zone_size) { > + split_io_pos = (zone_idx * dmh->zone_size_po2) + dmh->zone_size; Use bit shift here please. > + dm_accept_partial_bio(bio, split_io_pos - start_sect); > + bio->bi_iter.bi_sector = target_to_device_sect(dmh, start_sect); > + > + return DM_MAPIO_REMAPPED; > + } > + > + zero_fill_bio(bio); > + bio_endio(bio); > + return DM_MAPIO_SUBMITTED; > +} > + > +static int dm_po2z_map(struct dm_target *ti, struct bio *bio) > +{ > + struct dm_po2z_target *dmh = ti->private; > + > + bio_set_dev(bio, dmh->dev->bdev); > + if (bio_sectors(bio) || op_is_zone_mgmt(bio_op(bio))) { > + if (!bio_across_emulated_zone_area(dmh, bio)) { > + bio->bi_iter.bi_sector = target_to_device_sect(dmh, > + bio->bi_iter.bi_sector); Long line. Split it after "=". > + return DM_MAPIO_REMAPPED; > + } > + /* > + * Read operation on the emulated zone area (between zone capacity > + * and zone size) will fill the bio with zeroes. Any other operation > + * in the emulated area should return an error. > + */ > + if (bio_op(bio) == REQ_OP_READ) > + return dm_po2z_map_read_emulated_area(dmh, bio); > + > + return DM_MAPIO_KILL; > + } > + return DM_MAPIO_REMAPPED; > +} > + > +static int dm_po2z_iterate_devices(struct dm_target *ti, > + iterate_devices_callout_fn fn, void *data) > +{ > + struct dm_po2z_target *dmh = ti->private; > + sector_t len = dmh->nr_zones * dmh->zone_size; > + > + return fn(ti, dmh->dev, 0, len, data); > +} > + > +static struct target_type dm_po2z_target = { > + .name = "po2zone", > + .version = { 1, 0, 0 }, > + .features = DM_TARGET_ZONED_HM | DM_TARGET_EMULATED_ZONES, > + .map = dm_po2z_map, > + .end_io = dm_po2z_end_io, > + .report_zones = dm_po2z_report_zones, > + .iterate_devices = dm_po2z_iterate_devices, > + .module = THIS_MODULE, > + .io_hints = dm_po2z_io_hints, > + .ctr = dm_po2z_ctr, > +}; > + > +static int __init dm_po2z_init(void) > +{ > + return dm_register_target(&dm_po2z_target); > +} > + > +static void __exit dm_po2z_exit(void) > +{ > + dm_unregister_target(&dm_po2z_target); > +} > + > +/* Module hooks */ > +module_init(dm_po2z_init); > +module_exit(dm_po2z_exit); > + > +MODULE_DESCRIPTION(DM_NAME "power-of-2 zoned target"); > +MODULE_AUTHOR("Pankaj Raghav <p.raghav@samsung.com>"); > +MODULE_LICENSE("GPL"); > +
Hi Damien, >> +static int dm_po2z_map_read_emulated_area(struct dm_po2z_target *dmh, >> + struct bio *bio) > > This really should be called dm_po2z_handle_read() since it handles both cases > of read commands, with and without the accept partial. Note that this is > retesting the need for a split after that was already tested in dm_po2z_map() > with bio_across_emulated_zone_area(). So the code should be better organized to > avoid that repetition. > > You can simplify the code by having bio_across_emulated_zone_area() return 0 for > a bio that does not cross the zone capacity and return the number of sectors up > to the zone capacity from the bio start if there is a cross. That value can then The offset will also be zero if the BIO starts at the zone capacity and dm_po2z_map will assume that the bio did not cross the emulated zone boundary. > be used to call dm_accept_partial_bio() directly in dm_po2z_map() for read > commands. That will make this function much simpler and essentially turn it into > dm_po2z_read_zeroes(). > >> +{ What about something like this? We have one function: bio_in_emulated_zone_area() that checks if a BIO is partly or completely in the emulated zone area and also returns the offset from bio to the emulated zone boundary (zone capacity of the device). /** * bio_in_emulated_zone_area - check if bio is in the emulated zone area * @dmh: pozone target data * @bio: bio * @offset: bio offset to emulated zone boundary * * Check if a @bio is partly or completely in the emulated zone area. If the * @bio is partly in the emulated zone area, @offset * can be used to split the @bio across the emulated zone boundary. @offset * will be negative if the @bio completely lies in the emulated area. * */ static bool bio_in_emulated_zone_area(struct dm_po2z_target *dmh, struct bio *bio, int *offset) { unsigned int zone_idx = po2_zone_no(dmh, bio->bi_iter.bi_sector); sector_t nr_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; sector_t relative_sect_in_zone = bio->bi_iter.bi_sector - (zone_idx << dmh->zone_size_po2_shift); *offset = dmh->zone_size - relative_sect_in_zone; return (relative_sect_in_zone + nr_sectors) > dmh->zone_size; } static int dm_po2z_read_zeroes(struct bio *bio) { zero_fill_bio(bio); bio_endio(bio); return DM_MAPIO_SUBMITTED; } static int dm_po2z_remap_sector(struct dm_po2z_target *dmh, struct bio *bio) { bio->bi_iter.bi_sector = target_to_device_sect(dmh, bio->bi_iter.bi_sector); return DM_MAPIO_REMAPPED; } static int dm_po2z_map(struct dm_target *ti, struct bio *bio) { struct dm_po2z_target *dmh = ti->private; bio_set_dev(bio, dmh->dev->bdev); if (bio_sectors(bio) || op_is_zone_mgmt(bio_op(bio))) { int split_io_pos; if (!bio_in_emulated_zone_area(dmh, bio, &split_io_pos)) { return dm_po2z_remap_sector(dmh, bio); } /* * Read operation on the emulated zone area (between zone capacity * and zone size) will fill the bio with zeroes. Any other operation * in the emulated area should return an error. */ if (bio_op(bio) == REQ_OP_READ) { if (split_io_pos > 0) { dm_accept_partial_bio(bio, split_io_pos); return dm_po2z_remap_sector(dmh, bio); } return dm_po2z_read_zeroes(bio); } return DM_MAPIO_KILL; } return DM_MAPIO_REMAPPED; } Let me know what you think.
On 2022/08/15 6:38, Pankaj Raghav wrote: > Hi Damien, > >>> +static int dm_po2z_map_read_emulated_area(struct dm_po2z_target *dmh, >>> + struct bio *bio) >> >> This really should be called dm_po2z_handle_read() since it handles both cases >> of read commands, with and without the accept partial. Note that this is >> retesting the need for a split after that was already tested in dm_po2z_map() >> with bio_across_emulated_zone_area(). So the code should be better organized to >> avoid that repetition. >> >> You can simplify the code by having bio_across_emulated_zone_area() return 0 for >> a bio that does not cross the zone capacity and return the number of sectors up >> to the zone capacity from the bio start if there is a cross. That value can then > The offset will also be zero if the BIO starts at the zone capacity and > dm_po2z_map will assume that the bio did not cross the emulated zone boundary. > >> be used to call dm_accept_partial_bio() directly in dm_po2z_map() for read >> commands. That will make this function much simpler and essentially turn it into >> dm_po2z_read_zeroes(). >> >>> +{ > > What about something like this? We have one function: > bio_in_emulated_zone_area() that checks if a BIO is partly or completely in the > emulated zone area and also returns the offset from bio to the emulated zone > boundary (zone capacity of the device). > > /** > * bio_in_emulated_zone_area - check if bio is in the emulated zone area > * @dmh: pozone target data > * @bio: bio > * @offset: bio offset to emulated zone boundary > * > * Check if a @bio is partly or completely in the emulated zone area. If the > * @bio is partly in the emulated zone area, @offset > * can be used to split the @bio across the emulated zone boundary. @offset > * will be negative if the @bio completely lies in the emulated area. > * */ > static bool bio_in_emulated_zone_area(struct dm_po2z_target *dmh, > struct bio *bio, int *offset) > { > unsigned int zone_idx = po2_zone_no(dmh, bio->bi_iter.bi_sector); > sector_t nr_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; > sector_t relative_sect_in_zone = Nit: Could simply call this sector_offset. > bio->bi_iter.bi_sector - (zone_idx << dmh->zone_size_po2_shift); > > *offset = dmh->zone_size - relative_sect_in_zone; > > return (relative_sect_in_zone + nr_sectors) > dmh->zone_size; No need for the parenthesis. > } > > static int dm_po2z_read_zeroes(struct bio *bio) Make that an inline. > { > zero_fill_bio(bio); > bio_endio(bio); > return DM_MAPIO_SUBMITTED; > } > > static int dm_po2z_remap_sector(struct dm_po2z_target *dmh, struct bio *bio) Inline this one too. > { > bio->bi_iter.bi_sector = > target_to_device_sect(dmh, bio->bi_iter.bi_sector); > return DM_MAPIO_REMAPPED; > } > > static int dm_po2z_map(struct dm_target *ti, struct bio *bio) > { > struct dm_po2z_target *dmh = ti->private; > > bio_set_dev(bio, dmh->dev->bdev); > if (bio_sectors(bio) || op_is_zone_mgmt(bio_op(bio))) { > int split_io_pos; Blank line needed here. > if (!bio_in_emulated_zone_area(dmh, bio, &split_io_pos)) { > return dm_po2z_remap_sector(dmh, bio); > } No need for the curly brackets. > /* > * Read operation on the emulated zone area (between zone capacity > * and zone size) will fill the bio with zeroes. Any other operation > * in the emulated area should return an error. > */ > if (bio_op(bio) == REQ_OP_READ) { > if (split_io_pos > 0) { > dm_accept_partial_bio(bio, split_io_pos); > return dm_po2z_remap_sector(dmh, bio); > } > return dm_po2z_read_zeroes(bio); > } > return DM_MAPIO_KILL; > } > return DM_MAPIO_REMAPPED; > } > > Let me know what you think. I think this is way better. But I would still reorganize this like this: static int dm_po2z_map(struct dm_target *ti, struct bio *bio) { struct dm_po2z_target *dmh = ti->private; int split_io_pos; bio_set_dev(bio, dmh->dev->bdev); if (op_is_zone_mgmt(bio_op(bio))) return dm_po2z_remap_sector(dmh, bio); if (!bio_sectors(bio)) return DM_MAPIO_REMAPPED; /* * Read operation on the emulated zone area (between zone capacity * and zone size) will fill the bio with zeroes. Any other operation * in the emulated area should return an error. */ if (!bio_in_emulated_zone_area(dmh, bio, &split_io_pos)) return dm_po2z_remap_sector(dmh, bio); if (bio_op(bio) == REQ_OP_READ) { if (split_io_pos > 0) { dm_accept_partial_bio(bio, split_io_pos); return dm_po2z_remap_sector(dmh, bio); } return dm_po2z_read_zeroes(bio); } return DM_MAPIO_KILL; } I find the code easier to follow this way.
> > I think this is way better. But I would still reorganize this like this: > > static int dm_po2z_map(struct dm_target *ti, struct bio *bio) > { > struct dm_po2z_target *dmh = ti->private; > int split_io_pos; > > bio_set_dev(bio, dmh->dev->bdev); > > if (op_is_zone_mgmt(bio_op(bio))) > return dm_po2z_remap_sector(dmh, bio); > > if (!bio_sectors(bio)) > return DM_MAPIO_REMAPPED; > > /* > * Read operation on the emulated zone area (between zone capacity > * and zone size) will fill the bio with zeroes. Any other operation > * in the emulated area should return an error. > */ > if (!bio_in_emulated_zone_area(dmh, bio, &split_io_pos)) > return dm_po2z_remap_sector(dmh, bio); > > if (bio_op(bio) == REQ_OP_READ) { > if (split_io_pos > 0) { > dm_accept_partial_bio(bio, split_io_pos); > return dm_po2z_remap_sector(dmh, bio); > } > return dm_po2z_read_zeroes(bio); > } > > return DM_MAPIO_KILL; > } > > I find the code easier to follow this way. > I agree. It is definitely easier to follow. I will make these changes and send the next version soon. > >
diff --git a/Documentation/admin-guide/device-mapper/dm-po2zone.rst b/Documentation/admin-guide/device-mapper/dm-po2zone.rst new file mode 100644 index 000000000000..19dc215fbcca --- /dev/null +++ b/Documentation/admin-guide/device-mapper/dm-po2zone.rst @@ -0,0 +1,71 @@ +========== +dm-po2zone +========== +The dm-po2zone device mapper target exposes a zoned block device with a +non-power-of-2(npo2) number of sectors per zone as a power-of-2(po2) +number of sectors per zone(zone size). +The filesystems that support zoned block devices such as F2FS and BTRFS +assume po2 zone size as the kernel has traditionally only supported +those devices. However, as the kernel now supports zoned block devices with +npo2 zone sizes, the filesystems can run on top of the dm-po2zone target before +adding native support. + +Partial mapping of the underlying device is not supported by this target. + +Algorithm +========= +The device mapper target maps the underlying device's zone size to the +zone capacity and changes the zone size to the nearest po2 zone size. +The gap between the zone capacity and the zone size is emulated in the target. +E.g., a zoned block device with a zone size (and capacity) of 3M will have an +equivalent target layout with mapping as follows: + +:: + + 0M 3M 4M 6M 8M + | | | | | + +x------------+--+x---------+--+x------- Target + |x | |x | |x + x x x + x x x + x x x + x x x + |x |x |x + +x------------+x------------+x---------- Device + | | | + 0M 3M 6M + +A simple remap is performed for all the BIOs that do not cross the +emulation gap area, i.e., the area between the zone capacity and size. + +If a BIO crosses the emulation gap area, the following operations are performed: + + Read: + - If the BIO lies entirely in the emulation gap area, then zero out the BIO and complete it. + - If the BIO spans the emulation gap area, split the BIO across the zone capacity boundary + and remap only the BIO within the zone capacity boundary. The other part of the split BIO + will be zeroed out. + + Other operations: + - Return an error + +Table parameters +================ + +:: + + <dev path> + +Mandatory parameters: + + <dev path>: + Full pathname to the underlying block-device, or a + "major:minor" device-number. + +Examples +======== + +:: + + #!/bin/sh + echo "0 `blockdev --getsz $1` po2zone $1" | dmsetup create po2z diff --git a/Documentation/admin-guide/device-mapper/index.rst b/Documentation/admin-guide/device-mapper/index.rst index cde52cc09645..1fd04b5b0565 100644 --- a/Documentation/admin-guide/device-mapper/index.rst +++ b/Documentation/admin-guide/device-mapper/index.rst @@ -23,6 +23,7 @@ Device Mapper dm-service-time dm-uevent dm-zoned + dm-po2zone era kcopyd linear diff --git a/drivers/md/Kconfig b/drivers/md/Kconfig index 998a5cfdbc4e..638801b2449a 100644 --- a/drivers/md/Kconfig +++ b/drivers/md/Kconfig @@ -518,6 +518,16 @@ config DM_FLAKEY help A target that intermittently fails I/O for debugging purposes. +config DM_PO2ZONE + tristate "Zoned block devices target emulating a power-of-2 number of sectors per zone" + depends on BLK_DEV_DM + depends on BLK_DEV_ZONED + help + A target that converts a zoned block device with non-power-of-2(npo2) + number of sectors per zone to be power-of-2(po2). Use this target for + zoned block devices with npo2 number of sectors per zone until native + support is added to the filesystems and applications. + config DM_VERITY tristate "Verity target support" depends on BLK_DEV_DM diff --git a/drivers/md/Makefile b/drivers/md/Makefile index 84291e38dca8..c23f81cc8789 100644 --- a/drivers/md/Makefile +++ b/drivers/md/Makefile @@ -26,6 +26,7 @@ dm-era-y += dm-era-target.o dm-clone-y += dm-clone-target.o dm-clone-metadata.o dm-verity-y += dm-verity-target.o dm-zoned-y += dm-zoned-target.o dm-zoned-metadata.o dm-zoned-reclaim.o +dm-po2zone-y += dm-po2zone-target.o md-mod-y += md.o md-bitmap.o raid456-y += raid5.o raid5-cache.o raid5-ppl.o @@ -60,6 +61,7 @@ obj-$(CONFIG_DM_CRYPT) += dm-crypt.o obj-$(CONFIG_DM_DELAY) += dm-delay.o obj-$(CONFIG_DM_DUST) += dm-dust.o obj-$(CONFIG_DM_FLAKEY) += dm-flakey.o +obj-$(CONFIG_DM_PO2ZONE) += dm-po2zone.o obj-$(CONFIG_DM_MULTIPATH) += dm-multipath.o dm-round-robin.o obj-$(CONFIG_DM_MULTIPATH_QL) += dm-queue-length.o obj-$(CONFIG_DM_MULTIPATH_ST) += dm-service-time.o diff --git a/drivers/md/dm-po2zone-target.c b/drivers/md/dm-po2zone-target.c new file mode 100644 index 000000000000..d7610463245d --- /dev/null +++ b/drivers/md/dm-po2zone-target.c @@ -0,0 +1,245 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (C) 2022 Samsung Electronics Co., Ltd. + */ + +#include <linux/device-mapper.h> + +#define DM_MSG_PREFIX "po2zone" + +struct dm_po2z_target { + struct dm_dev *dev; + sector_t zone_size; /* Actual zone size of the underlying dev*/ + sector_t zone_size_po2; /* zone_size rounded to the nearest po2 value */ + unsigned int zone_size_po2_shift; + sector_t zone_size_diff; /* diff between zone_size_po2 and zone_size */ + unsigned int nr_zones; +}; + +static inline unsigned int npo2_zone_no(struct dm_po2z_target *dmh, + sector_t sect) +{ + return div64_u64(sect, dmh->zone_size); +} + +static inline unsigned int po2_zone_no(struct dm_po2z_target *dmh, + sector_t sect) +{ + return sect >> dmh->zone_size_po2_shift; +} + +static inline sector_t target_to_device_sect(struct dm_po2z_target *dmh, + sector_t sect) +{ + return sect - (po2_zone_no(dmh, sect) * dmh->zone_size_diff); +} + +static inline sector_t device_to_target_sect(struct dm_po2z_target *dmh, + sector_t sect) +{ + return sect + (npo2_zone_no(dmh, sect) * dmh->zone_size_diff); +} + +/* + * This target works on the complete zoned device. Partial mapping is not + * supported. + * Construct a zoned po2 logical device: <dev-path> + */ +static int dm_po2z_ctr(struct dm_target *ti, unsigned int argc, char **argv) +{ + struct dm_po2z_target *dmh = NULL; + int ret; + sector_t zone_size; + sector_t dev_capacity; + + if (argc != 1) + return -EINVAL; + + dmh = kmalloc(sizeof(*dmh), GFP_KERNEL); + if (!dmh) + return -ENOMEM; + + ret = dm_get_device(ti, argv[0], dm_table_get_mode(ti->table), + &dmh->dev); + if (ret) { + ti->error = "Device lookup failed"; + kfree(dmh); + return ret; + } + + if (!bdev_is_zoned(dmh->dev->bdev)) { + DMERR("%pg is not a zoned device", dmh->dev->bdev); + kfree(dmh); + return -EINVAL; + } + + zone_size = bdev_zone_sectors(dmh->dev->bdev); + dev_capacity = get_capacity(dmh->dev->bdev->bd_disk); + if (ti->len != dev_capacity || ti->begin) { + DMERR("%pg Partial mapping of the target not supported", + dmh->dev->bdev); + kfree(dmh); + return -EINVAL; + } + + if (is_power_of_2(zone_size)) + DMWARN("%pg: underlying device has a power-of-2 number of sectors per zone", + dmh->dev->bdev); + + dmh->zone_size = zone_size; + dmh->zone_size_po2 = 1 << get_count_order_long(zone_size); + dmh->zone_size_po2_shift = ilog2(dmh->zone_size_po2); + dmh->zone_size_diff = dmh->zone_size_po2 - dmh->zone_size; + ti->private = dmh; + ti->max_io_len = dmh->zone_size_po2; + dmh->nr_zones = npo2_zone_no(dmh, ti->len); + ti->len = dmh->zone_size_po2 * dmh->nr_zones; + + return 0; +} + +static int dm_po2z_report_zones_cb(struct blk_zone *zone, unsigned int idx, + void *data) +{ + struct dm_report_zones_args *args = data; + struct dm_po2z_target *dmh = args->tgt->private; + + zone->start = device_to_target_sect(dmh, zone->start); + zone->wp = device_to_target_sect(dmh, zone->wp); + zone->len = dmh->zone_size_po2; + args->next_sector = zone->start + zone->len; + + return args->orig_cb(zone, args->zone_idx++, args->orig_data); +} + +static int dm_po2z_report_zones(struct dm_target *ti, + struct dm_report_zones_args *args, + unsigned int nr_zones) +{ + struct dm_po2z_target *dmh = ti->private; + sector_t sect = po2_zone_no(dmh, args->next_sector) * dmh->zone_size; + + return blkdev_report_zones(dmh->dev->bdev, sect, nr_zones, + dm_po2z_report_zones_cb, args); +} + +static int dm_po2z_end_io(struct dm_target *ti, struct bio *bio, + blk_status_t *error) +{ + struct dm_po2z_target *dmh = ti->private; + + if (bio->bi_status == BLK_STS_OK && bio_op(bio) == REQ_OP_ZONE_APPEND) + bio->bi_iter.bi_sector = + device_to_target_sect(dmh, bio->bi_iter.bi_sector); + + return DM_ENDIO_DONE; +} + +static void dm_po2z_io_hints(struct dm_target *ti, struct queue_limits *limits) +{ + struct dm_po2z_target *dmh = ti->private; + + limits->chunk_sectors = dmh->zone_size_po2; +} + +static bool bio_across_emulated_zone_area(struct dm_po2z_target *dmh, + struct bio *bio) +{ + unsigned int zone_idx = po2_zone_no(dmh, bio->bi_iter.bi_sector); + sector_t nr_sectors = bio->bi_iter.bi_size >> SECTOR_SHIFT; + + return (bio->bi_iter.bi_sector + nr_sectors) > + (zone_idx << dmh->zone_size_po2_shift) + dmh->zone_size; +} + +static int dm_po2z_map_read_emulated_area(struct dm_po2z_target *dmh, + struct bio *bio) +{ + sector_t start_sect = bio->bi_iter.bi_sector; + unsigned int zone_idx = po2_zone_no(dmh, start_sect); + sector_t relative_sect_in_zone = start_sect - (zone_idx << dmh->zone_size_po2_shift); + sector_t split_io_pos; + + /* + * If the starting sector is in the emulated area then fill + * all the bio with zeros. If bio is across emulated zone boundary, + * split the bio across boundaries and fill zeros only for the + * bio that is between the zone capacity and the zone size. + */ + if (relative_sect_in_zone < dmh->zone_size) { + split_io_pos = (zone_idx * dmh->zone_size_po2) + dmh->zone_size; + dm_accept_partial_bio(bio, split_io_pos - start_sect); + bio->bi_iter.bi_sector = target_to_device_sect(dmh, start_sect); + + return DM_MAPIO_REMAPPED; + } + + zero_fill_bio(bio); + bio_endio(bio); + return DM_MAPIO_SUBMITTED; +} + +static int dm_po2z_map(struct dm_target *ti, struct bio *bio) +{ + struct dm_po2z_target *dmh = ti->private; + + bio_set_dev(bio, dmh->dev->bdev); + if (bio_sectors(bio) || op_is_zone_mgmt(bio_op(bio))) { + if (!bio_across_emulated_zone_area(dmh, bio)) { + bio->bi_iter.bi_sector = target_to_device_sect(dmh, + bio->bi_iter.bi_sector); + return DM_MAPIO_REMAPPED; + } + /* + * Read operation on the emulated zone area (between zone capacity + * and zone size) will fill the bio with zeroes. Any other operation + * in the emulated area should return an error. + */ + if (bio_op(bio) == REQ_OP_READ) + return dm_po2z_map_read_emulated_area(dmh, bio); + + return DM_MAPIO_KILL; + } + return DM_MAPIO_REMAPPED; +} + +static int dm_po2z_iterate_devices(struct dm_target *ti, + iterate_devices_callout_fn fn, void *data) +{ + struct dm_po2z_target *dmh = ti->private; + sector_t len = dmh->nr_zones * dmh->zone_size; + + return fn(ti, dmh->dev, 0, len, data); +} + +static struct target_type dm_po2z_target = { + .name = "po2zone", + .version = { 1, 0, 0 }, + .features = DM_TARGET_ZONED_HM | DM_TARGET_EMULATED_ZONES, + .map = dm_po2z_map, + .end_io = dm_po2z_end_io, + .report_zones = dm_po2z_report_zones, + .iterate_devices = dm_po2z_iterate_devices, + .module = THIS_MODULE, + .io_hints = dm_po2z_io_hints, + .ctr = dm_po2z_ctr, +}; + +static int __init dm_po2z_init(void) +{ + return dm_register_target(&dm_po2z_target); +} + +static void __exit dm_po2z_exit(void) +{ + dm_unregister_target(&dm_po2z_target); +} + +/* Module hooks */ +module_init(dm_po2z_init); +module_exit(dm_po2z_exit); + +MODULE_DESCRIPTION(DM_NAME "power-of-2 zoned target"); +MODULE_AUTHOR("Pankaj Raghav <p.raghav@samsung.com>"); +MODULE_LICENSE("GPL"); +