[RFC] fs: New zonefs file system

zonefs is a very simple file system exposing each zone of a zoned
block device as a file. It is intended to simplify adding raw zoned
block device support to applications by letting them use the well-known
POSIX file API instead of direct block device file ioctls and
read/write. For instance, zonefs greatly simplifies the implementation
of LSM (log-structured merge) tree structures (such as used in RocksDB
and LevelDB) on zoned block devices: SSTables can be stored in zone
files much as they are stored in files on a regular file system, which
reduces the amount of change needed in the application.
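
As an illustration of the POSIX-only access model described above, here
is a minimal user space sketch of appending data to a sequential zone
file. The helper name zone_file_append() is hypothetical and not part of
this patch. The sketch assumes that the size returned by fstat() is the
zone write pointer offset, so the write is issued exactly at that
offset. Any alignment or direct I/O requirements are not covered.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Append len bytes to a sequential zone file (hypothetical helper).
 * Assumption: the file size reported by fstat() is the zone write
 * pointer offset, so the next write must start exactly there.
 * Returns the number of bytes written, or -1 on error (errno is set).
 */
static ssize_t zone_file_append(const char *path, const void *buf, size_t len)
{
	struct stat st;
	ssize_t ret;
	int saved_errno;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -1;

	if (fstat(fd, &st) < 0) {
		saved_errno = errno;
		close(fd);
		errno = saved_errno;
		return -1;
	}

	/* Write at the current end of file, i.e. at the zone write pointer. */
	ret = pwrite(fd, buf, len, st.st_size);

	/* Preserve the pwrite() error code across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return ret;
}
```

A caller would use it as zone_file_append("/mnt/seq/1", buf, len), where
the path is only an example. No ioctl is involved at any point.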

Due to its simplicity, zonefs is in fact closer to a raw block device
access interface than to a full-featured POSIX file system. This RFC
implementation uses Christoph's work on a generic iomap writepage
implementation and is based on Christoph's gfs2 tree, available at:
http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/gfs2-iomap

Zonefs on-disk metadata is reduced to a super block storing a magic
number, a UUID and optional feature flags and values. On mount, zonefs
uses blkdev_report_zones() to obtain the device zone configuration and
populates the mount point with a static file tree based solely on this
information. For example, file sizes come from the zone write pointer
offset, which is managed by the device itself.
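
The derivation of a file size from a zone report can be sketched as
follows. This is not the code from fs/zonefs/super.c: struct zone_info,
enum zone_kind and zone_file_size() are hypothetical, simplified names
used for illustration only, and the sketch assumes that zone reports
express positions in 512-byte sectors.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* assumption: zone reports use 512-byte sectors */

/* Hypothetical, simplified view of one zone report entry. */
enum zone_kind { ZONE_CONVENTIONAL, ZONE_SEQUENTIAL };

struct zone_info {
	uint64_t start;		/* zone start sector */
	uint64_t len;		/* zone length in sectors */
	uint64_t wp;		/* write pointer sector (sequential zones) */
	enum zone_kind kind;
};

/* File size in bytes that a mount-time scan would assign to a zone file. */
static uint64_t zone_file_size(const struct zone_info *z)
{
	/* Conventional zones have no write pointer: size is the zone size. */
	if (z->kind == ZONE_CONVENTIONAL)
		return z->len << SECTOR_SHIFT;

	/* Sequential zones: bytes written so far since the zone start. */
	return (z->wp - z->start) << SECTOR_SHIFT;
}
```

With these assumptions, a sequential zone whose write pointer sits 65536
sectors past its start sector yields a file size of 33554432 bytes, and
an empty sequential zone yields a file size of 0.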

The zone files created on mount have the following characteristics.
1) Files representing zones of the same type are grouped together
   under a common directory:
  * For conventional zones, the directory "cnv" is used.
  * For sequential write zones, the directory "seq" is used.
  These two directories are the only directories that exist in zonefs.
  Users cannot create other directories and cannot rename nor delete
  the "cnv" and "seq" directories.
2) The name of zone files is by default the number of the file within
   the zone type directory, in order of increasing zone start sector.
3) The size of conventional zone files is fixed to the device zone size.
   Conventional zone files cannot be truncated.
4) The size of sequential zone files represents the file zone write
   pointer position relative to the zone start sector. Truncating these
   files is allowed only down to 0, in which case the zone is reset to
   rewind the file zone write pointer position to the start of the zone.
5) Read and write operations to files are not allowed beyond the file
   zone size. Any access exceeding the zone size fails with the -EFBIG
   error.
6) Creating, deleting, renaming or modifying any attribute of files
   and directories is not allowed. The only exception is the file
   size of sequential zone files, which can be modified by write
   operations or truncation to 0.
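
The access rules 3), 4) and 5) above can be summarized by the following
sketch. The function names are hypothetical and this is not the patch
code. The -EFBIG value comes from rule 5); the -EPERM value returned for
a refused truncation is an assumption, since the error code for that
case is not stated here.

```c
#include <errno.h>
#include <stdint.h>

/* Rule 5: reject any read or write range extending past the zone size. */
static int zone_file_check_io(uint64_t zone_size, uint64_t pos, uint64_t len)
{
	/* Written this way to avoid overflow in pos + len. */
	if (pos > zone_size || len > zone_size - pos)
		return -EFBIG;
	return 0;
}

/* Rules 3 and 4: only sequential zone files can be truncated, only to 0. */
static int zone_file_check_truncate(int is_sequential, uint64_t new_size)
{
	if (!is_sequential || new_size != 0)
		return -EPERM;	/* assumption: error code not stated above */
	return 0;
}
```

A truncation that passes the second check is the case where the zone is
reset and the file size goes back to 0.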

Several optional features of zonefs can be enabled at format time.
* Conventional zone aggregation: contiguous conventional zones can be
  aggregated into a single larger file instead of multiple per-zone
  files.
* File naming: instead of the default file number, the file name can
  be set to the base-10 value of the file zone start sector.
* File ownership: the owner UID and GID of zone files are by default 0
  (root) but can be changed to any valid UID/GID.
* File access permissions: the default 640 access permissions can be
  changed.
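
The two file naming schemes can be illustrated with the short sketch
below. zone_file_name() and its name_by_sector argument are hypothetical
and only mirror the description of the file naming feature; they do not
reflect the actual super block flag layout.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Build a zone file name: the file number within its directory by
 * default, or the zone start sector in base 10 when name_by_sector is
 * set. Returns the name length, as snprintf() does.
 */
static int zone_file_name(char *buf, size_t size, uint64_t file_no,
			  uint64_t start_sector, int name_by_sector)
{
	uint64_t v = name_by_sector ? start_sector : file_no;

	return snprintf(buf, size, "%" PRIu64, v);
}
```

For the second file of a directory backed by a zone starting at sector
524288, this gives the name "1" by default and "524288" with start
sector naming.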

The mkzonefs tool is used to format a device for zonefs. This tool will
be available on GitHub at:

https://github.com/westerndigitalcorporation/zonefs-tools

Example: the following formats a host-managed SMR HDD with the
conventional zone aggregation feature enabled.

mkzonefs -o aggr_cnv /dev/sdX
mount -t zonefs /dev/sdX /mnt
ls -l /mnt/
total 0
dr-xr-xr-x 2 root root 0 Apr 11 13:00 cnv
dr-xr-xr-x 2 root root 0 Apr 11 13:00 seq

ls -l /mnt/cnv
total 137363456
-rw-rw---- 1 root root 140660178944 Apr 11 13:00 0

ls -Fal -v /mnt/seq
total 14511243264
dr-xr-xr-x 2 root root 15942528 Jul 10 11:53 ./
drwxr-xr-x 4 root root     1152 Jul 10 11:53 ../
-rw-r----- 1 root root        0 Jul 10 11:53 0
-rw-r----- 1 root root 33554432 Jul 10 13:43 1
-rw-r----- 1 root root        0 Jul 10 11:53 2
-rw-r----- 1 root root        0 Jul 10 11:53 3
...

The aggregated conventional zone file can be used as a regular file.
Operations such as the following work.

mkfs.ext4 /mnt/cnv/0
mount -o loop /mnt/cnv/0 /data

Contains contributions from Johannes Thumshirn <jthumshirn@suse.de>
and Christoph Hellwig <hch@lst.de>.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
---
 fs/Kconfig                 |    2 +
 fs/Makefile                |    1 +
 fs/zonefs/Kconfig          |    9 +
 fs/zonefs/Makefile         |    4 +
 fs/zonefs/super.c          | 1004 ++++++++++++++++++++++++++++++++++++
 fs/zonefs/zonefs.h         |  190 +++++++
 include/uapi/linux/magic.h |    1 +
 7 files changed, 1211 insertions(+)
 create mode 100644 fs/zonefs/Kconfig
 create mode 100644 fs/zonefs/Makefile
 create mode 100644 fs/zonefs/super.c
 create mode 100644 fs/zonefs/zonefs.h

From: Damien Le Moal <damien.lemoal@wdc.com>
To: linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org, Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <jthumshirn@suse.de>, Hannes Reinecke <hare@suse.de>
Subject: [PATCH RFC] fs: New zonefs file system
Date: Fri, 12 Jul 2019 12:00:17 +0900
Message-Id: <20190712030017.14321-1-damien.lemoal@wdc.com>
State: Superseded, archived