[v6] virtio-fs: add virtiofs filesystem

From: Stefan Hajnoczi <stefanha@redhat.com>

Michael,

Here's a v6 of the virtiofs code (fuse.git#virtiofs-v6).  I think we've
addressed all your comments.

Would you mind giving it another look and, if you're satisfied, acking this
patch?

Thanks,
Miklos

----
Add a basic file system module for virtio-fs.  This does not yet contain
shared data support between host and guest or metadata coherency speedups.
However, it is already significantly faster than virtio-9p.

Design Overview
===============

With the goal of better performance and local file system semantics, the
following ideas were proposed:

 - Use the FUSE protocol (instead of 9p) for communication between guest
   and host.  The guest kernel is the FUSE client and a FUSE server runs
   on the host to serve its requests.

 - For data access inside the guest, mmap a portion of the file into the
   QEMU address space and let the guest access this memory using DAX.
   That way the guest page cache is bypassed and there is only one copy
   of the data (on the host).  This will also enable mmap(MAP_SHARED)
   between guests.

 - For metadata coherency, use a shared memory region that contains a
   version number associated with the metadata.  Any guest changing
   metadata updates the version number, and other guests refresh their
   metadata on next access.  This is yet to be implemented.
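
The "one copy of data" idea can be illustrated in plain userspace C.  This
is only a sketch, not code from the patch: it assumes a POSIX system, uses a
temporary file in place of a host file, and uses two mappings in one process
in place of QEMU and the guest.  The names map_window() and
shared_window_is_coherent() are made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define WINDOW_SIZE 4096

/* Map the first WINDOW_SIZE bytes of fd as a shared, writable window. */
static char *map_window(int fd)
{
    void *p = mmap(NULL, WINDOW_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Returns 1 if a store through one mapping is visible through a second
 * mapping of the same file, 0 if it is not, and -1 on a setup error.
 */
int shared_window_is_coherent(void)
{
    char path[] = "/tmp/dax-sketch-XXXXXX";
    int fd = mkstemp(path);
    int ok = -1;

    if (fd < 0)
        return -1;
    unlink(path);                       /* file lives until fd is closed */

    if (ftruncate(fd, WINDOW_SIZE) == 0) {
        char *a = map_window(fd);       /* stands in for one guest */
        char *b = map_window(fd);       /* stands in for another guest */

        if (a && b) {
            assert(a != b);             /* two distinct virtual windows */
            strcpy(a, "hello");         /* store through the first one */
            ok = strcmp(b, "hello") == 0;
        }
        if (a)
            munmap(a, WINDOW_SIZE);
        if (b)
            munmap(b, WINDOW_SIZE);
    }
    close(fd);
    return ok;
}
```

Both mappings are backed by the same page cache pages, so no copy and no
explicit message is needed for the second mapping to observe the store.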

How virtio-fs differs from existing approaches
==============================================

The unique idea behind virtio-fs is to take advantage of the co-location of
the virtual machine and hypervisor to avoid communication (vmexits).

DAX allows file contents to be accessed without communication with the
hypervisor.  The shared memory region for metadata avoids communication in
the common case where metadata is unchanged.

By replacing expensive communication with cheaper shared memory accesses,
we expect to achieve better performance than approaches based on network
file system protocols.  This also makes it easier to achieve local file
system semantics (coherency).

These techniques are not applicable to network file system protocols since
the communications channel is bypassed by taking advantage of shared memory
on a local machine.  This is why we decided to build virtio-fs rather than
focus on 9P or NFS.

Caching Modes
=============

Like virtio-9p, virtio-fs supports different caching modes, which also
determine the coherency level.  The “cache=FOO” and “writeback” options
control the level of coherence between the guest and host filesystems.

 - cache=none
   metadata, data and pathname lookup are not cached in guest.  They are
   always fetched from host and any changes are immediately pushed to host.

 - cache=always
   metadata, data and pathname lookup are cached in guest and never expire.

 - cache=auto
   metadata and pathname lookup cache expires after a configured amount of
   time (default is 1 second).  Data is cached while the file is open
   (close-to-open consistency).

 - writeback/no_writeback
   These options control the writeback strategy.  If writeback is disabled,
   then normal writes will immediately be synchronized with the host fs.
   If writeback is enabled, then writes may be cached in the guest until
   the file is closed or an fsync(2) is performed.  This option has no
   effect on mmap-ed writes or writes going through the DAX mechanism.
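
The cache= modes above boil down to a small policy decision for metadata
and pathname lookups.  The sketch below is an illustration in standalone C,
not the option handling from this patch or from any FUSE server; the names
cache_mode, parse_cache_mode(), cached_entry_usable() and
AUTO_TIMEOUT_DEFAULT are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* The three cache= modes described in the text. */
enum cache_mode { CACHE_NONE, CACHE_AUTO, CACHE_ALWAYS };

/* Default expiry for cache=auto, in seconds, as stated in the text. */
#define AUTO_TIMEOUT_DEFAULT 1.0

/* Parse the value of a cache= option; 0 on success, -1 if unknown. */
int parse_cache_mode(const char *value, enum cache_mode *out)
{
    if (strcmp(value, "none") == 0)
        *out = CACHE_NONE;
    else if (strcmp(value, "auto") == 0)
        *out = CACHE_AUTO;
    else if (strcmp(value, "always") == 0)
        *out = CACHE_ALWAYS;
    else
        return -1;
    return 0;
}

/*
 * Decide whether a cached metadata or lookup entry of the given age (in
 * seconds) may be used without asking the host again.
 */
int cached_entry_usable(enum cache_mode mode, double age,
                        double auto_timeout)
{
    assert(age >= 0.0);
    switch (mode) {
    case CACHE_NONE:
        return 0;                   /* always fetch from the host */
    case CACHE_ALWAYS:
        return 1;                   /* never expires */
    case CACHE_AUTO:
        return age < auto_timeout;  /* expires after the timeout */
    }
    return 0;
}
```

Data caching and the writeback option are separate from this decision and
are not modelled here.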

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/fuse/Kconfig                 |   11 +
 fs/fuse/Makefile                |    1 +
 fs/fuse/fuse_i.h                |    9 +
 fs/fuse/inode.c                 |    4 +
 fs/fuse/virtio_fs.c             | 1195 +++++++++++++++++++++++++++++++
 include/uapi/linux/virtio_fs.h  |   19 +
 include/uapi/linux/virtio_ids.h |    1 +
 7 files changed, 1240 insertions(+)
 create mode 100644 fs/fuse/virtio_fs.c
 create mode 100644 include/uapi/linux/virtio_fs.h

Message ID	20190912141931.30819-1-mszeredi@redhat.com (mailing list archive)
State	New, archived
Headers	From: Miklos Szeredi <mszeredi@redhat.com>
	To: virtualization@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org
	Cc: Stefan Hajnoczi <stefanha@redhat.com>, linux-kernel@vger.kernel.org, "Michael S. Tsirkin" <mst@redhat.com>, Vivek Goyal <vgoyal@redhat.com>, "Dr. David Alan Gilbert" <dgilbert@redhat.com>
	Subject: [PATCH v6] virtio-fs: add virtiofs filesystem
	Date: Thu, 12 Sep 2019 16:19:31 +0200