[00/17] ZNS Support for Btrfs

Message ID cover.1628690222.git.naohiro.aota@wdc.com (mailing list archive)

Naohiro Aota Aug. 11, 2021, 2:16 p.m. UTC
This series extends zoned support for Zoned Namespace (ZNS) SSDs [1].

[1] https://zonedstorage.io/introduction/zns/

This series is available on GitHub at
v1    https://github.com/naota/linux/tree/btrfs-zns-v1
HEAD  https://github.com/naota/linux/tree/btrfs-zns

The ZNS specification introduces the extra functionalities listed below.

- No conventional zones
- Zone Append write command
- Zone Capacity
- Active Zones

The first two functionalities are already addressed in the current
zoned support on btrfs. We do not rely on conventional zones, and we
use the zone append write command to write data IOs.

This series implements support for the other ones.

The userland tools need some tweaks (e.g. using the capacity instead of
the length) to be precise, but they still work fine as they are.

* Zone Capacity Support

A zone capacity is an additional per-zone attribute that indicates the
number of usable logical blocks within each zone, starting from the
first logical block of each zone. It is always smaller than or equal to
the zone size.

We can naturally map the capacity to the newly introduced
"zone_capacity" of a block group. Allocations are limited under the
zone capacity instead of the block group's length.
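
As a rough illustration of limiting allocations by the zone capacity,
here is a minimal userspace sketch. It is not the code from this
series: struct bg_sketch and bg_sketch_alloc() are hypothetical names,
and the block group is assumed to map to a single zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a zoned block group. */
struct bg_sketch {
	uint64_t length;        /* zone size in bytes */
	uint64_t zone_capacity; /* usable bytes, counted from the zone start */
	uint64_t alloc_offset;  /* next sequential write position */
};

/*
 * Try to allocate num_bytes sequentially. On success, store the offset
 * within the block group in *offset_ret and return true.
 */
static bool bg_sketch_alloc(struct bg_sketch *bg, uint64_t num_bytes,
			    uint64_t *offset_ret)
{
	assert(bg->zone_capacity <= bg->length);
	assert(bg->alloc_offset <= bg->zone_capacity);

	/* The limit is the zone capacity, not the block group's length. */
	if (num_bytes > bg->zone_capacity - bg->alloc_offset)
		return false;

	*offset_ret = bg->alloc_offset;
	bg->alloc_offset += num_bytes;
	return true;
}
```

With length = 1024 and zone_capacity = 768, a request that would end
beyond byte 768 is rejected even though it would still fit within the
length.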

* Active Zones Tracking

The ZNS specification defines a limit on the number of zones that can
be in the implicit open, explicit open or closed conditions. Any zone
in one of these conditions is defined as an active zone and corresponds
to a zone that is being written or that has been only partially
written. If the maximum number of active zones is reached, we must
either reset or finish some active zones before being able to choose
other zones for storing data.

In order to not exceed the number of max active zones, we need to
track which zones are active and how the active zones are related to
the block groups. We mark a block group as "active" if the
corresponding device zones are all active. Allocating an extent will
activate a block group, and allocation from an inactive block group is
prohibited. Such active block groups are tracked in a list. Once a
block group is fully written, we deactivate it and remove it from the
list.
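
The activation and deactivation rules above can be sketched in
userspace C as follows. This is only an illustration under simplifying
assumptions: all names (dev_sketch, abg_sketch, abg_activate(),
abg_finish_if_full()) are hypothetical, each block group is assumed to
sit on a single zone of a single device, and a per-device counter
stands in for the list of active block groups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-device state. */
struct dev_sketch {
	unsigned int max_active_zones; /* 0 is treated as "no limit" here */
	unsigned int nr_active;        /* zones currently counted as active */
};

/* Hypothetical block group backed by one zone of one device. */
struct abg_sketch {
	struct dev_sketch *dev;
	uint64_t zone_capacity;
	uint64_t alloc_offset;
	bool active;
};

/* Activate the block group unless the device limit is already reached. */
static bool abg_activate(struct abg_sketch *bg)
{
	struct dev_sketch *dev = bg->dev;

	if (bg->active)
		return true;
	if (dev->max_active_zones && dev->nr_active >= dev->max_active_zones)
		return false;

	dev->nr_active++;
	bg->active = true;
	return true;
}

/* Deactivate the block group once it has been fully written. */
static void abg_finish_if_full(struct abg_sketch *bg)
{
	if (!bg->active || bg->alloc_offset < bg->zone_capacity)
		return;

	assert(bg->dev->nr_active > 0);
	bg->dev->nr_active--;
	bg->active = false;
}
```

In this sketch, with max_active_zones = 1, a second block group can be
activated only after the first one has been filled up to its capacity
and finished.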
  
* Active Zone Aware Sequential Allocator

Handling the active zones makes the allocator more complex. Here is a
summary of how find_free_extent_update_loop() behaves.
  
1. If enough space is available in an active block group
   -  allocate from it (end, success)
2. If we can activate another zone on a device
   2.1 Try to allocate a new block group and activate it
   2.2 If the activation succeeds
      - allocation will be satisfied from it in the next iteration
   2.3 If the activation fails
      - Try the next cycle. Some writes may free up an active block group
3. If we cannot activate any zones
   3.1 Try to allocate with a smaller size by checking min_alloc_size
      - btrfs_reserve_extent() will halve the allocation size and
        restart the loop
   3.2 Nothing can be done anymore. Give up. ENOSPC
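
The decision order above can be condensed into a small, purely
illustrative C function. It does not reproduce
find_free_extent_update_loop(); the enum, the struct and
ffe_next_action() are hypothetical names, and the inputs are assumed
to have been computed by the caller. Steps 2.1 to 2.3 are folded into
one action because both outcomes lead to another pass of the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical actions, one per branch of the summary. */
enum ffe_action {
	FFE_ALLOCATE,        /* step 1: allocate from an active block group */
	FFE_NEW_BLOCK_GROUP, /* step 2: allocate and activate a block group */
	FFE_RETRY_SMALLER,   /* step 3.1: caller shrinks the size and retries */
	FFE_ENOSPC,          /* step 3.2: give up */
};

/* Hypothetical snapshot of what the allocator knows in one iteration. */
struct ffe_state {
	bool active_bg_has_space; /* an active block group can hold num_bytes */
	bool can_activate_zone;   /* the device is below its active zone limit */
	uint64_t num_bytes;       /* size currently being requested */
	uint64_t min_alloc_size;  /* smallest size the caller accepts */
};

static enum ffe_action ffe_next_action(const struct ffe_state *s)
{
	if (s->active_bg_has_space)
		return FFE_ALLOCATE;
	if (s->can_activate_zone)
		return FFE_NEW_BLOCK_GROUP;
	if (s->num_bytes > s->min_alloc_size)
		return FFE_RETRY_SMALLER;
	return FFE_ENOSPC;
}
```

The sketch only picks the next action; shrinking the request for
FFE_RETRY_SMALLER is left to the caller, as in step 3.1.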

* Patch series organization

Note: patches 2 and 14 are preparation patches and can be merged
independently.

Patches 1-6 implement zone capacity support.

Patch 7 implements finishing a superblock zone once there is no space
left for a new superblock.

Patches 8-13 implement the activation side of the active zone
tracking.

Patches 14 and 15 tweak the allocator to retry with a smaller size if
possible (step 3.1 in the above list).

Patches 16 and 17 implement the deactivation side of the active zone
tracking.

Naohiro Aota (17):
  btrfs: zoned: load zone capacity information from devices
  btrfs: zoned: move btrfs_free_excluded_extents out from
    btrfs_calc_zone_unusable
  btrfs: zoned: calculate free space from zone capacity
  btrfs: zoned: tweak reclaim threshold for zone capacity
  btrfs: zoned: consider zone as full when no more SB can be written
  btrfs: zoned: locate superblock position using zone capacity
  btrfs: zoned: finish superblock zone once no space left for new SB
  btrfs: zoned: load active zone information from devices
  btrfs: zoned: introduce physical_map to btrfs_block_group
  btrfs: zoned: implement active zone tracking
  btrfs: zoned: load active zone info for block group
  btrfs: zoned: activate block group on allocation
  btrfs: zoned: activate new block group
  btrfs: move ffe_ctl one level up
  btrfs: zoned: avoid chunk allocation if active block group has enough
    space
  btrfs: zoned: finish fully written block group
  btrfs: zoned: finish relocating block group

 fs/btrfs/block-group.c      |  29 ++-
 fs/btrfs/block-group.h      |   4 +
 fs/btrfs/ctree.h            |   3 +
 fs/btrfs/disk-io.c          |   6 +-
 fs/btrfs/extent-tree.c      | 204 +++++++++------
 fs/btrfs/extent_io.c        |  11 +-
 fs/btrfs/extent_io.h        |   1 +
 fs/btrfs/free-space-cache.c |  19 +-
 fs/btrfs/inode.c            |   6 +-
 fs/btrfs/relocation.c       |   4 +
 fs/btrfs/zoned.c            | 495 +++++++++++++++++++++++++++++++++---
 fs/btrfs/zoned.h            |  39 ++-
 12 files changed, 692 insertions(+), 129 deletions(-)