mbox series

[v5,00/11] dm: Improve zoned block device support

Message ID 20210525212501.226888-1-damien.lemoal@wdc.com (mailing list archive)
Headers show
Series dm: Improve zoned block device support | expand

Message

Damien Le Moal May 25, 2021, 9:24 p.m. UTC
This series improve device mapper support for zoned block devices and
of targets exposing a zoned device.

The first patch improve support for user requests to reset all zones of
the target device. With the fix, such operation behave similarly to
physical block devices implementation based on the single zone reset
command with the ALL bit set.

The following 2 patches are preparatory block layer patches.

Patch 4 and 5 are 2 small fixes to DM core zoned block device support.

Patch 6 reorganizes DM core code, moving conditionally defined zoned
block device code into the new dm-zone.c file. This avoids sprinkly DM
with zone related code defined under an #ifdef CONFIG_BLK_DEV_ZONED.

Patch 7 improves DM zone report helper functions for target drivers.

Patch 8 fixes a potential problem with BIO requeue on zoned target.

Finally, patch 9 to 11 implement zone append emulation using regular
writes for target drivers that cannot natively support this BIO type.
The only target currently needing this emulation is dm-crypt. With this
change, a zoned dm-crypt device behaves exactly like a regular zoned
block device, correctly executing user zone append BIOs.

This series passes the following tests:
1) zonefs tests on top of dm-crypt with a zoned nullblk device
2) zonefs tests on top of dm-crypt+dm-linear with an SMR HDD
3) btrfs fstests on top of dm-crypt with zoned nullblk devices.

Comments are as always welcome.

Changes from v4:
* Remove useless extra space in variable initialization in patch 1
* Shorten dm_accept_partial_bio() documentation comment in patch 4
* Added reviewed-by tags

Changes from v3:
* Fixed missing variable initialization in
  blkdev_zone_reset_all_emulated() in patch 1.
* Rebased on rc3
* Added reviewed-by tags

Changes from v2:
* Replace use of spinlock to protect the zone write pointer offset
  array in patch 11 with READ_ONCE/WRITE_ONCE as suggested by Hannes.
* Added reviewed-by tags

Changes from v1:
* Use Christoph proposed approach for patch 1 (split reset all
  processing into different functions)
* Changed helpers introduced in patch 2 to remove the request_queue
  argument
* Improve patch 3 commit message as suggested by Christoph (explaining
  that the flag is a special case that cannot use a REQ_XXX flag)
* Changed DMWARN() into DMDEBUG in patch 11 as suggested by Milan
* Added reviewed-by tags

Damien Le Moal (11):
  block: improve handling of all zones reset operation
  block: introduce bio zone helpers
  block: introduce BIO_ZONE_WRITE_LOCKED bio flag
  dm: Fix dm_accept_partial_bio()
  dm: cleanup device_area_is_invalid()
  dm: move zone related code to dm-zone.c
  dm: Introduce dm_report_zones()
  dm: Forbid requeue of writes to zones
  dm: rearrange core declarations
  dm: introduce zone append emulation
  dm crypt: Fix zoned block device support

 block/blk-zoned.c             | 119 +++++--
 drivers/md/Makefile           |   4 +
 drivers/md/dm-core.h          |  65 ++++
 drivers/md/dm-crypt.c         |  31 +-
 drivers/md/dm-flakey.c        |   7 +-
 drivers/md/dm-linear.c        |   7 +-
 drivers/md/dm-table.c         |  21 +-
 drivers/md/dm-zone.c          | 654 ++++++++++++++++++++++++++++++++++
 drivers/md/dm.c               | 201 +++--------
 drivers/md/dm.h               |  30 +-
 include/linux/blk_types.h     |   1 +
 include/linux/blkdev.h        |  12 +
 include/linux/device-mapper.h |   9 +-
 13 files changed, 954 insertions(+), 207 deletions(-)
 create mode 100644 drivers/md/dm-zone.c

Comments

Damien Le Moal June 1, 2021, 10:57 p.m. UTC | #1
On 2021/05/26 6:25, Damien Le Moal wrote:
> This series improve device mapper support for zoned block devices and
> of targets exposing a zoned device.

Mike, Jens,

Any feedback regarding this series ?

> 
> The first patch improve support for user requests to reset all zones of
> the target device. With the fix, such operation behave similarly to
> physical block devices implementation based on the single zone reset
> command with the ALL bit set.
> 
> The following 2 patches are preparatory block layer patches.
> 
> Patch 4 and 5 are 2 small fixes to DM core zoned block device support.
> 
> Patch 6 reorganizes DM core code, moving conditionally defined zoned
> block device code into the new dm-zone.c file. This avoids sprinkly DM
> with zone related code defined under an #ifdef CONFIG_BLK_DEV_ZONED.
> 
> Patch 7 improves DM zone report helper functions for target drivers.
> 
> Patch 8 fixes a potential problem with BIO requeue on zoned target.
> 
> Finally, patch 9 to 11 implement zone append emulation using regular
> writes for target drivers that cannot natively support this BIO type.
> The only target currently needing this emulation is dm-crypt. With this
> change, a zoned dm-crypt device behaves exactly like a regular zoned
> block device, correctly executing user zone append BIOs.
> 
> This series passes the following tests:
> 1) zonefs tests on top of dm-crypt with a zoned nullblk device
> 2) zonefs tests on top of dm-crypt+dm-linear with an SMR HDD
> 3) btrfs fstests on top of dm-crypt with zoned nullblk devices.
> 
> Comments are as always welcome.
> 
> Changes from v4:
> * Remove useless extra space in variable initialization in patch 1
> * Shorten dm_accept_partial_bio() documentation comment in patch 4
> * Added reviewed-by tags
> 
> Changes from v3:
> * Fixed missing variable initialization in
>   blkdev_zone_reset_all_emulated() in patch 1.
> * Rebased on rc3
> * Added reviewed-by tags
> 
> Changes from v2:
> * Replace use of spinlock to protect the zone write pointer offset
>   array in patch 11 with READ_ONCE/WRITE_ONCE as suggested by Hannes.
> * Added reviewed-by tags
> 
> Changes from v1:
> * Use Christoph proposed approach for patch 1 (split reset all
>   processing into different functions)
> * Changed helpers introduced in patch 2 to remove the request_queue
>   argument
> * Improve patch 3 commit message as suggested by Christoph (explaining
>   that the flag is a special case that cannot use a REQ_XXX flag)
> * Changed DMWARN() into DMDEBUG in patch 11 as suggested by Milan
> * Added reviewed-by tags
> 
> Damien Le Moal (11):
>   block: improve handling of all zones reset operation
>   block: introduce bio zone helpers
>   block: introduce BIO_ZONE_WRITE_LOCKED bio flag
>   dm: Fix dm_accept_partial_bio()
>   dm: cleanup device_area_is_invalid()
>   dm: move zone related code to dm-zone.c
>   dm: Introduce dm_report_zones()
>   dm: Forbid requeue of writes to zones
>   dm: rearrange core declarations
>   dm: introduce zone append emulation
>   dm crypt: Fix zoned block device support
> 
>  block/blk-zoned.c             | 119 +++++--
>  drivers/md/Makefile           |   4 +
>  drivers/md/dm-core.h          |  65 ++++
>  drivers/md/dm-crypt.c         |  31 +-
>  drivers/md/dm-flakey.c        |   7 +-
>  drivers/md/dm-linear.c        |   7 +-
>  drivers/md/dm-table.c         |  21 +-
>  drivers/md/dm-zone.c          | 654 ++++++++++++++++++++++++++++++++++
>  drivers/md/dm.c               | 201 +++--------
>  drivers/md/dm.h               |  30 +-
>  include/linux/blk_types.h     |   1 +
>  include/linux/blkdev.h        |  12 +
>  include/linux/device-mapper.h |   9 +-
>  13 files changed, 954 insertions(+), 207 deletions(-)
>  create mode 100644 drivers/md/dm-zone.c
>
Mike Snitzer June 2, 2021, 6:32 p.m. UTC | #2
On Tue, Jun 01 2021 at  6:57P -0400,
Damien Le Moal <Damien.LeMoal@wdc.com> wrote:

> On 2021/05/26 6:25, Damien Le Moal wrote:
> > This series improve device mapper support for zoned block devices and
> > of targets exposing a zoned device.
> 
> Mike, Jens,
> 
> Any feedback regarding this series ?
> 
> > 
> > The first patch improve support for user requests to reset all zones of
> > the target device. With the fix, such operation behave similarly to
> > physical block devices implementation based on the single zone reset
> > command with the ALL bit set.
> > 
> > The following 2 patches are preparatory block layer patches.
> > 
> > Patch 4 and 5 are 2 small fixes to DM core zoned block device support.
> > 
> > Patch 6 reorganizes DM core code, moving conditionally defined zoned
> > block device code into the new dm-zone.c file. This avoids sprinkly DM
> > with zone related code defined under an #ifdef CONFIG_BLK_DEV_ZONED.
> > 
> > Patch 7 improves DM zone report helper functions for target drivers.
> > 
> > Patch 8 fixes a potential problem with BIO requeue on zoned target.
> > 
> > Finally, patch 9 to 11 implement zone append emulation using regular
> > writes for target drivers that cannot natively support this BIO type.
> > The only target currently needing this emulation is dm-crypt. With this
> > change, a zoned dm-crypt device behaves exactly like a regular zoned
> > block device, correctly executing user zone append BIOs.
> > 
> > This series passes the following tests:
> > 1) zonefs tests on top of dm-crypt with a zoned nullblk device
> > 2) zonefs tests on top of dm-crypt+dm-linear with an SMR HDD
> > 3) btrfs fstests on top of dm-crypt with zoned nullblk devices.
> > 
> > Comments are as always welcome.

I've picked up DM patches 4-8 because they didn't depend on the first
3 block patches.

But I'm fine with picking up 1-3 if Jens provides his Acked-by.
And then I can pickup the remaining DM patches 9-11.

Thanks,
Mike
Jens Axboe June 3, 2021, 5:46 p.m. UTC | #3
On 6/2/21 12:32 PM, Mike Snitzer wrote:
> On Tue, Jun 01 2021 at  6:57P -0400,
> Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
> 
>> On 2021/05/26 6:25, Damien Le Moal wrote:
>>> This series improve device mapper support for zoned block devices and
>>> of targets exposing a zoned device.
>>
>> Mike, Jens,
>>
>> Any feedback regarding this series ?
>>
>>>
>>> The first patch improve support for user requests to reset all zones of
>>> the target device. With the fix, such operation behave similarly to
>>> physical block devices implementation based on the single zone reset
>>> command with the ALL bit set.
>>>
>>> The following 2 patches are preparatory block layer patches.
>>>
>>> Patch 4 and 5 are 2 small fixes to DM core zoned block device support.
>>>
>>> Patch 6 reorganizes DM core code, moving conditionally defined zoned
>>> block device code into the new dm-zone.c file. This avoids sprinkly DM
>>> with zone related code defined under an #ifdef CONFIG_BLK_DEV_ZONED.
>>>
>>> Patch 7 improves DM zone report helper functions for target drivers.
>>>
>>> Patch 8 fixes a potential problem with BIO requeue on zoned target.
>>>
>>> Finally, patch 9 to 11 implement zone append emulation using regular
>>> writes for target drivers that cannot natively support this BIO type.
>>> The only target currently needing this emulation is dm-crypt. With this
>>> change, a zoned dm-crypt device behaves exactly like a regular zoned
>>> block device, correctly executing user zone append BIOs.
>>>
>>> This series passes the following tests:
>>> 1) zonefs tests on top of dm-crypt with a zoned nullblk device
>>> 2) zonefs tests on top of dm-crypt+dm-linear with an SMR HDD
>>> 3) btrfs fstests on top of dm-crypt with zoned nullblk devices.
>>>
>>> Comments are as always welcome.
> 
> I've picked up DM patches 4-8 because they didn't depend on the first
> 3 block patches.
> 
> But I'm fine with picking up 1-3 if Jens provides his Acked-by.
> And then I can pickup the remaining DM patches 9-11.

I'm fine with 1-3, you can add my Acked-by to those.
Mike Snitzer June 3, 2021, 10:16 p.m. UTC | #4
On Thu, Jun 03 2021 at  1:46P -0400,
Jens Axboe <axboe@kernel.dk> wrote:

> On 6/2/21 12:32 PM, Mike Snitzer wrote:
> > On Tue, Jun 01 2021 at  6:57P -0400,
> > Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
> > 
> >> On 2021/05/26 6:25, Damien Le Moal wrote:
> >>> This series improve device mapper support for zoned block devices and
> >>> of targets exposing a zoned device.
> >>
> >> Mike, Jens,
> >>
> >> Any feedback regarding this series ?
> >>
> >>>
> >>> The first patch improve support for user requests to reset all zones of
> >>> the target device. With the fix, such operation behave similarly to
> >>> physical block devices implementation based on the single zone reset
> >>> command with the ALL bit set.
> >>>
> >>> The following 2 patches are preparatory block layer patches.
> >>>
> >>> Patch 4 and 5 are 2 small fixes to DM core zoned block device support.
> >>>
> >>> Patch 6 reorganizes DM core code, moving conditionally defined zoned
> >>> block device code into the new dm-zone.c file. This avoids sprinkly DM
> >>> with zone related code defined under an #ifdef CONFIG_BLK_DEV_ZONED.
> >>>
> >>> Patch 7 improves DM zone report helper functions for target drivers.
> >>>
> >>> Patch 8 fixes a potential problem with BIO requeue on zoned target.
> >>>
> >>> Finally, patch 9 to 11 implement zone append emulation using regular
> >>> writes for target drivers that cannot natively support this BIO type.
> >>> The only target currently needing this emulation is dm-crypt. With this
> >>> change, a zoned dm-crypt device behaves exactly like a regular zoned
> >>> block device, correctly executing user zone append BIOs.
> >>>
> >>> This series passes the following tests:
> >>> 1) zonefs tests on top of dm-crypt with a zoned nullblk device
> >>> 2) zonefs tests on top of dm-crypt+dm-linear with an SMR HDD
> >>> 3) btrfs fstests on top of dm-crypt with zoned nullblk devices.
> >>>
> >>> Comments are as always welcome.
> > 
> > I've picked up DM patches 4-8 because they didn't depend on the first
> > 3 block patches.
> > 
> > But I'm fine with picking up 1-3 if Jens provides his Acked-by.
> > And then I can pickup the remaining DM patches 9-11.
> 
> I'm fine with 1-3, you can add my Acked-by to those.

Thanks, did so.

Damien: I've staged this patchset in linux-next via the dm-5.14 branch of linux-dm.git

Might look at optimizing the fast-path of __map_bio further, e.g. this
leaves something to be desired considering how niche this all is:

        /*
         * Check if the IO needs a special mapping due to zone append emulation
         * on zoned target. In this case, dm_zone_map_bio() calls the target
         * map operation.
         */
        if (dm_emulate_zone_append(io->md))
                r = dm_zone_map_bio(tio);
        else
                r = ti->type->map(ti, clone);

Does it make sense to split out a new CONFIG_ that encapsulates legacy
zoned devices?
Damien Le Moal June 3, 2021, 11:44 p.m. UTC | #5
On 2021/06/04 7:16, Mike Snitzer wrote:
> On Thu, Jun 03 2021 at  1:46P -0400,
> Jens Axboe <axboe@kernel.dk> wrote:
> 
>> On 6/2/21 12:32 PM, Mike Snitzer wrote:
>>> On Tue, Jun 01 2021 at  6:57P -0400,
>>> Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
>>>
>>>> On 2021/05/26 6:25, Damien Le Moal wrote:
>>>>> This series improve device mapper support for zoned block devices and
>>>>> of targets exposing a zoned device.
>>>>
>>>> Mike, Jens,
>>>>
>>>> Any feedback regarding this series ?
>>>>
>>>>>
>>>>> The first patch improve support for user requests to reset all zones of
>>>>> the target device. With the fix, such operation behave similarly to
>>>>> physical block devices implementation based on the single zone reset
>>>>> command with the ALL bit set.
>>>>>
>>>>> The following 2 patches are preparatory block layer patches.
>>>>>
>>>>> Patch 4 and 5 are 2 small fixes to DM core zoned block device support.
>>>>>
>>>>> Patch 6 reorganizes DM core code, moving conditionally defined zoned
>>>>> block device code into the new dm-zone.c file. This avoids sprinkly DM
>>>>> with zone related code defined under an #ifdef CONFIG_BLK_DEV_ZONED.
>>>>>
>>>>> Patch 7 improves DM zone report helper functions for target drivers.
>>>>>
>>>>> Patch 8 fixes a potential problem with BIO requeue on zoned target.
>>>>>
>>>>> Finally, patch 9 to 11 implement zone append emulation using regular
>>>>> writes for target drivers that cannot natively support this BIO type.
>>>>> The only target currently needing this emulation is dm-crypt. With this
>>>>> change, a zoned dm-crypt device behaves exactly like a regular zoned
>>>>> block device, correctly executing user zone append BIOs.
>>>>>
>>>>> This series passes the following tests:
>>>>> 1) zonefs tests on top of dm-crypt with a zoned nullblk device
>>>>> 2) zonefs tests on top of dm-crypt+dm-linear with an SMR HDD
>>>>> 3) btrfs fstests on top of dm-crypt with zoned nullblk devices.
>>>>>
>>>>> Comments are as always welcome.
>>>
>>> I've picked up DM patches 4-8 because they didn't depend on the first
>>> 3 block patches.
>>>
>>> But I'm fine with picking up 1-3 if Jens provides his Acked-by.
>>> And then I can pickup the remaining DM patches 9-11.
>>
>> I'm fine with 1-3, you can add my Acked-by to those.
> 
> Thanks, did so.
> 
> Damien: I've staged this patchset in linux-next via the dm-5.14 branch of linux-dm.git

Thanks !

> Might look at optimizing the fast-path of __map_bio further, e.g. this
> leaves something to be desired considering how niche this all is:
> 
>         /*
>          * Check if the IO needs a special mapping due to zone append emulation
>          * on zoned target. In this case, dm_zone_map_bio() calls the target
>          * map operation.
>          */
>         if (dm_emulate_zone_append(io->md))
>                 r = dm_zone_map_bio(tio);
>         else
>                 r = ti->type->map(ti, clone);
> 
> Does it make sense to split out a new CONFIG_ that encapsulates legacy
> zoned devices?

Well, the problem is that there are no "legacy" zoned devices: they all support
zone append commands, either natively (for zns and nullblk) or emulated in their
respective drivers (scsi sd for SMR drives). So I do not think that a new
CONFIG_XXX can be used.

What we could do is have this conditional on either:
(1) the DM targets that need it: dm-crypt only for now (CONFIG_DM_CRYPT)
(2) Zone append command users: btrfs and zonefs only for now (CONFIG_FS_BTRFS
and CONFIG_FS_ZONEFS)

(1) would be the nicest as we can keep this contained in DM and define something
in KConfig. However, given that most distros (if not all) enable dm-crypt, I am
not convinced this will do any good.

Note that for the !CONFIG_BLK_DEV_ZONED case, the "if
(dm_emulate_zone_append(io->md))" is compiled out, resulting in the same code as
without the emulation.