[V2] fs: New zonefs file system

zonefs is a very simple file system exposing each zone of a zoned
block device as a file. zonefs is in fact closer to a raw block device
access interface than to a full-featured POSIX file system.

The goal of zonefs is to simplify the implementation of raw zoned block
device access in applications by letting them use the well-known POSIX
file API instead of direct block device file ioctls and read/write
calls. For instance, zonefs greatly simplifies the implementation of LSM
(log-structured merge) tree structures (such as those used in RocksDB
and LevelDB) on zoned block devices by allowing SSTables to be stored in
zone files, much as they would be on a regular file system, which
reduces the amount of change needed in the application.

Zonefs on-disk metadata is reduced to a super block storing a magic
number, a UUID, and optional feature flags and values. On mount, zonefs
uses blkdev_report_zones() to obtain the device zone configuration and
populates the mount point with a static file tree based solely on this
information. For example, file sizes come from the zone write pointer
offset, which is managed by the device itself.
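
As an illustration of how a file size can be derived from a zone report,
here is a minimal user-space sketch in Python. It is not the kernel code:
the Zone record and the zone_file_size() helper are hypothetical names
introduced for this example, and the only assumption taken from the
kernel is that sectors are 512-byte units.

```python
from dataclasses import dataclass

SECTOR_SIZE = 512  # Linux block layer sectors are 512-byte units


@dataclass
class Zone:
    """Hypothetical, simplified view of one entry of a zone report."""
    start: int          # zone start sector
    length: int         # zone size in sectors
    write_pointer: int  # absolute sector of the zone write pointer
    sequential: bool    # False for conventional zones


def zone_file_size(zone: Zone) -> int:
    """Return the size in bytes that the zone file would report."""
    if zone.sequential:
        # Sequential zone: size is the write pointer offset in the zone.
        return (zone.write_pointer - zone.start) * SECTOR_SIZE
    # Conventional zone: size is fixed to the zone size.
    return zone.length * SECTOR_SIZE
```

With a 524288-sector zone whose write pointer has advanced by 65536
sectors, this sketch yields a 33554432-byte file.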

The zone files created on mount have the following characteristics.
1) Files representing zones of the same type are grouped together
   under a common directory:
  * For conventional zones, the directory "cnv" is used.
  * For sequential write zones, the directory "seq" is used.
  These two directories are the only directories that exist in zonefs.
  Users cannot create other directories and cannot rename nor delete
  the "cnv" and "seq" directories.
2) The name of zone files is by default the number of the file within
   the zone type directory, in order of increasing zone start sector.
3) The size of conventional zone files is fixed to the device zone size.
   Conventional zone files cannot be truncated.
4) The size of a sequential zone file represents the file zone write
   pointer position relative to the zone start sector. Truncating these
   files is allowed only down to 0, in which case the zone is reset to
   rewind the file zone write pointer position to the start of the zone.
5) Read and write operations are not allowed beyond the file zone size.
   Any access exceeding the zone size fails with the -EFBIG error.
6) Creating, deleting, renaming or modifying any attribute of files
   and directories is not allowed. The only exception is the file
   size of sequential zone files, which can be modified by write
   operations or truncation to 0.
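
The size and truncation rules above (items 3 to 6) can be summarized with
a small in-memory model. This is only a sketch in Python for
illustration, not the zonefs implementation: the ZoneFileModel class is
hypothetical, and the EPERM error used for a refused truncation is an
assumption, since only -EFBIG is named in the rules above.

```python
import errno


class ZoneFileModel:
    """In-memory model of one zone file (illustration only)."""

    def __init__(self, zone_size: int, sequential: bool):
        self.zone_size = zone_size
        self.sequential = sequential
        # Rule 3: conventional files always have the full zone size.
        # Rule 4: an empty sequential zone starts with size 0.
        self.size = 0 if sequential else zone_size

    def check_access(self, offset: int, length: int) -> None:
        """Rule 5: no access may extend beyond the zone size."""
        if offset + length > self.zone_size:
            raise OSError(errno.EFBIG, "access beyond zone size")

    def append(self, length: int) -> None:
        """Rules 4 and 6: a write advances a sequential file size."""
        if not self.sequential:
            raise ValueError("append() models sequential zone files only")
        self.check_access(self.size, length)
        self.size += length

    def truncate(self, new_size: int) -> None:
        """Rules 3 and 4: only sequential files, and only down to 0."""
        if not self.sequential or new_size != 0:
            # Assumption: the error code is not given in the text;
            # EPERM is a placeholder chosen for this sketch.
            raise OSError(errno.EPERM, "truncation not allowed")
        self.size = 0  # models the zone reset rewinding the write pointer
```

In this model, appending past the zone size raises OSError with EFBIG,
and truncate(0) on a sequential file brings its size back to 0.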

Several optional features of zonefs can be enabled at format time.
* Conventional zone aggregation: contiguous conventional zones can be
  aggregated into a single larger file instead of multiple per-zone
  files.
* File naming: the default file name (the file number) can be switched
  to the base-10 value of the file zone start sector.
* File ownership: the owner UID and GID of zone files are by default 0
  (root) but can be changed to any valid UID/GID.
* File access permissions: the default 640 access permissions can be
  changed.
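
To make the aggregation and naming options more concrete, the following
Python sketch shows one way they could be computed from a list of
conventional zones. It is an illustration under stated assumptions, not
the mkzonefs or kernel logic: the Extent type and both function names
are hypothetical, and zones are described only by their start sector and
length in sectors.

```python
from typing import List, Tuple

# Hypothetical representation: (start_sector, length_in_sectors)
Extent = Tuple[int, int]


def aggregate_conventional(zones: List[Extent]) -> List[Extent]:
    """Merge runs of contiguous zones into single larger extents."""
    merged: List[Extent] = []
    for start, length in sorted(zones):
        if merged and merged[-1][0] + merged[-1][1] == start:
            # This zone starts where the previous extent ends: extend it.
            prev_start, prev_length = merged[-1]
            merged[-1] = (prev_start, prev_length + length)
        else:
            merged.append((start, length))
    return merged


def file_names(zones: List[Extent], use_start_sector: bool = False) -> List[str]:
    """Name files by number (default) or by base-10 zone start sector."""
    ordered = sorted(zones)  # increasing zone start sector
    if use_start_sector:
        return [str(start) for start, _ in ordered]
    return [str(number) for number, _ in enumerate(ordered)]
```

For example, three zones at sectors 0, 100 and 300 (100 sectors each)
aggregate into two extents, which would be named "0" and "1" by default,
or "0" and "300" with start sector naming.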

The mkzonefs tool is used to format zonefs. This tool is available
on GitHub at: git@github.com:damien-lemoal/zonefs-tools.git.
zonefs-tools includes a simple test suite which can be run against any
zoned block device, including a null_blk block device created in zoned
mode.

Example: the following formats a host-managed SMR HDD with the
conventional zone aggregation feature enabled.

mkzonefs -o aggr_cnv /dev/sdX
mount -t zonefs /dev/sdX /mnt
ls -l /mnt/
total 0
dr-xr-xr-x 2 root root 0 Apr 11 13:00 cnv
dr-xr-xr-x 2 root root 0 Apr 11 13:00 seq

ls -l /mnt/cnv
total 137363456
-rw-rw---- 1 root root 140660178944 Apr 11 13:00 0

ls -Fal -v /mnt/seq
total 14511243264
dr-xr-xr-x 2 root root 15942528 Jul 10 11:53 ./
drwxr-xr-x 4 root root     1152 Jul 10 11:53 ../
-rw-r----- 1 root root        0 Jul 10 11:53 0
-rw-r----- 1 root root 33554432 Jul 10 13:43 1
-rw-r----- 1 root root        0 Jul 10 11:53 2
-rw-r----- 1 root root        0 Jul 10 11:53 3
...
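
The sizes in these listings can be cross-checked with a little
arithmetic, sketched here in Python. The 256 MiB zone size is an
assumption (the listing does not state the zone size of this drive), as
is the reading of the "total" line as a count of 1024-byte blocks.

```python
SECTOR_SIZE = 512                # bytes per sector
ZONE_SIZE = 256 * 1024 * 1024    # assumption: 256 MiB zones

cnv_size = 140660178944          # size of /mnt/cnv/0 in the listing
seq1_size = 33554432             # size of /mnt/seq/1 in the listing

# Number of conventional zones aggregated into /mnt/cnv/0.
aggregated_zones, remainder = divmod(cnv_size, ZONE_SIZE)

# Write pointer offset of the zone behind /mnt/seq/1, in sectors.
written_sectors = seq1_size // SECTOR_SIZE

# Size of /mnt/cnv/0 in 1024-byte blocks, to compare with "total".
cnv_blocks = cnv_size // 1024

print(aggregated_zones, remainder, written_sectors, cnv_blocks)
```

Under these assumptions, /mnt/cnv/0 covers exactly 524 zones, the write
pointer of the zone behind /mnt/seq/1 sits 65536 sectors (32 MiB) past
the zone start, and the block count matches the 137363456 total shown
for /mnt/cnv.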

The aggregated conventional zone file can be used as a regular file.
Operations such as the following work.

mkfs.ext4 /mnt/cnv/0
mount -o loop /mnt/cnv/0 /data

Contains contributions from Johannes Thumshirn <jthumshirn@suse.de>
and Christoph Hellwig <hch@lst.de>.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---

Changes from v1:
* Rebased on latest iomap branch iomap-5.4-merge of XFS tree at
  git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git
* Addressed all comments from Dave Chinner and others

 MAINTAINERS                |   10 +
 fs/Kconfig                 |    2 +
 fs/Makefile                |    1 +
 fs/zonefs/Kconfig          |    9 +
 fs/zonefs/Makefile         |    4 +
 fs/zonefs/super.c          | 1049 ++++++++++++++++++++++++++++++++++++
 fs/zonefs/zonefs.h         |  171 ++++++
 include/uapi/linux/magic.h |    1 +
 8 files changed, 1247 insertions(+)
 create mode 100644 fs/zonefs/Kconfig
 create mode 100644 fs/zonefs/Makefile
 create mode 100644 fs/zonefs/super.c
 create mode 100644 fs/zonefs/zonefs.h

Message ID	20190820081249.27353-1-damien.lemoal@wdc.com (mailing list archive)
State	Superseded, archived
From	Damien Le Moal <damien.lemoal@wdc.com>
To	linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, Christoph Hellwig <hch@lst.de>, Johannes Thumshirn <jthumshirn@suse.de>, Dave Chinner <david@fromorbit.com>, "Darrick J . Wong" <darrick.wong@oracle.com>
Cc	Hannes Reinecke <hare@suse.de>, Matias Bjorling <matias.bjorling@wdc.com>
Subject	[PATCH V2] fs: New zonefs file system
Date	Tue, 20 Aug 2019 17:12:49 +0900