Message ID | 20181210171318.16998-1-vgoyal@redhat.com (mailing list archive) |
---|---|
Series | virtio-fs: shared file system for virtual machines |

On Mon, Dec 10, 2018 at 12:12:26PM -0500, Vivek Goyal wrote:
> Hi,
>
> Here are RFC patches for virtio-fs. Looking for feedback on this approach.
>
> These patches should apply on top of 4.20-rc5. We have also put code for
> various components here.
>
> https://gitlab.com/virtio-fs

A draft specification for the virtio-fs device is available here:

https://stefanha.github.io/virtio/virtio-fs.html#x1-38800010 (HTML)
https://github.com/stefanha/virtio/commit/e1cac3777ef03bc9c5c8ee91bcc6ba478272e6b6

Stefan

On Mon, Dec 10, 2018 at 12:12:26PM -0500, Vivek Goyal wrote:
> Hi,
>
> Here are RFC patches for virtio-fs. Looking for feedback on this approach.
>
> These patches should apply on top of 4.20-rc5. We have also put code for
> various components here.
>
> https://gitlab.com/virtio-fs
>
> Problem Description
> ===================
> We want to be able to take a directory tree on the host and share it with
> guest[s]. Our goal is to be able to do it in a fast, consistent and secure
> manner. Our primary use case is kata containers, but it should be usable in
> other scenarios as well.
>
> Containers may rely on local file system semantics for shared volumes,
> read-write mounts that multiple containers access simultaneously. File
> system changes must be visible to other containers with the same consistency
> expected of a local file system, including mmap MAP_SHARED.
>
> Existing Solutions
> ==================
> We looked at existing solutions and virtio-9p already provides basic shared
> file system functionality although does not offer local file system semantics,
> causing some workloads and test suites to fail. In addition, virtio-9p
> performance has been an issue for Kata Containers and we believe this cannot
> be alleviated without major changes that do not fit into the 9P protocol.
>
> Design Overview
> ===============
> With the goal of designing something with better performance and local file
> system semantics, a bunch of ideas were proposed.
>
> - Use fuse protocol (instead of 9p) for communication between guest
>   and host. Guest kernel will be fuse client and a fuse server will
>   run on host to serve the requests. Benchmark results (see below) are
>   encouraging and show this approach performs well (2x to 8x improvement
>   depending on test being run).
>
> - For data access inside guest, mmap portion of file in QEMU address
>   space and guest accesses this memory using dax. That way guest page
>   cache is bypassed and there is only one copy of data (on host). This
>   will also enable mmap(MAP_SHARED) between guests.
>
> - For metadata coherency, there is a shared memory region which contains
>   version number associated with metadata and any guest changing metadata
>   updates version number and other guests refresh metadata on next
>   access. This is still experimental and implementation is not complete.

What about Windows guests or BSD ones? Is there a plan to make that work with
them as well?

What about the Virtio spec? Plans to make changes there as well?
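
The metadata coherency bullet quoted above (a shared version number that writers bump and readers check on next access) can be illustrated with a small sketch. This is not taken from the virtio-fs patches, and the proposal itself is described as experimental and incomplete. The class and method names (`SharedVersionTable`, `Guest`, `getattr`, `setattr`) are hypothetical, a plain dictionary stands in for the shared memory region, and the atomicity and ordering a real implementation would need are ignored.

```python
from dataclasses import dataclass, field


@dataclass
class SharedVersionTable:
    """Hypothetical stand-in for the shared memory region:
    one version counter per inode number."""
    versions: dict = field(default_factory=dict)

    def read(self, inode):
        return self.versions.get(inode, 0)

    def bump(self, inode):
        self.versions[inode] = self.read(inode) + 1
        return self.versions[inode]


class Guest:
    """Hypothetical guest that caches metadata tagged with the version it saw."""

    def __init__(self, table, backing):
        self.table = table
        self.backing = backing   # host-side metadata, e.g. {inode: {"size": 0}}
        self.cache = {}          # inode -> (version, copy of metadata)
        self.refreshes = 0       # how often we had to go back to the host

    def getattr(self, inode):
        current = self.table.read(inode)
        cached = self.cache.get(inode)
        if cached is None or cached[0] != current:
            # Cache is missing or stale: refetch metadata from the host.
            self.cache[inode] = (current, dict(self.backing[inode]))
            self.refreshes += 1
        return self.cache[inode][1]

    def setattr(self, inode, **changes):
        # Change metadata on the host, then bump the shared version so that
        # other guests refresh on their next access.
        self.backing[inode].update(changes)
        version = self.table.bump(inode)
        self.cache[inode] = (version, dict(self.backing[inode]))


def demo():
    table = SharedVersionTable()
    backing = {1: {"size": 0}}
    a, b = Guest(table, backing), Guest(table, backing)
    first = b.getattr(1)["size"]    # b fetches and caches the metadata
    a.setattr(1, size=4096)         # a changes metadata and bumps the version
    second = b.getattr(1)["size"]   # b sees a new version and refreshes
    third = b.getattr(1)["size"]    # version unchanged: served from b's cache
    return first, second, third, b.refreshes
```

In this sketch a guest only goes back to the host when the version it cached differs from the shared one, which is the property the bullet describes.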

On Wed, Dec 12, 2018 at 03:30:49PM -0500, Konrad Rzeszutek Wilk wrote:
> On Mon, Dec 10, 2018 at 12:12:26PM -0500, Vivek Goyal wrote:
> > Hi,
> >
> > Here are RFC patches for virtio-fs. Looking for feedback on this approach.
> >
> > These patches should apply on top of 4.20-rc5. We have also put code for
> > various components here.
> >
> > https://gitlab.com/virtio-fs
> >
> > Problem Description
> > ===================
> > We want to be able to take a directory tree on the host and share it with
> > guest[s]. Our goal is to be able to do it in a fast, consistent and secure
> > manner. Our primary use case is kata containers, but it should be usable in
> > other scenarios as well.
> >
> > Containers may rely on local file system semantics for shared volumes,
> > read-write mounts that multiple containers access simultaneously. File
> > system changes must be visible to other containers with the same consistency
> > expected of a local file system, including mmap MAP_SHARED.
> >
> > Existing Solutions
> > ==================
> > We looked at existing solutions and virtio-9p already provides basic shared
> > file system functionality although does not offer local file system semantics,
> > causing some workloads and test suites to fail. In addition, virtio-9p
> > performance has been an issue for Kata Containers and we believe this cannot
> > be alleviated without major changes that do not fit into the 9P protocol.
> >
> > Design Overview
> > ===============
> > With the goal of designing something with better performance and local file
> > system semantics, a bunch of ideas were proposed.
> >
> > - Use fuse protocol (instead of 9p) for communication between guest
> >   and host. Guest kernel will be fuse client and a fuse server will
> >   run on host to serve the requests. Benchmark results (see below) are
> >   encouraging and show this approach performs well (2x to 8x improvement
> >   depending on test being run).
> >
> > - For data access inside guest, mmap portion of file in QEMU address
> >   space and guest accesses this memory using dax. That way guest page
> >   cache is bypassed and there is only one copy of data (on host). This
> >   will also enable mmap(MAP_SHARED) between guests.
> >
> > - For metadata coherency, there is a shared memory region which contains
> >   version number associated with metadata and any guest changing metadata
> >   updates version number and other guests refresh metadata on next
> >   access. This is still experimental and implementation is not complete.
>
> What about Windows guests or BSD ones? Is there a plan to make that work with
> them as well?

Hi Konrad,

I have not thought much about making it work on Windows or BSD yet. Does FUSE
work with Windows? I am assuming it does with BSD. As long as FUSE works, I am
assuming that at least the basic mode can be made to work.

> What about the Virtio spec? Plans to make changes there as well?

There are plans to change that. Stefan posted a proposal here:

https://lists.oasis-open.org/archives/virtio-dev/201812/msg00073.html

Thanks
Vivek

Vivek Goyal <vgoyal@redhat.com> writes:

> Hi,
>
> Here are RFC patches for virtio-fs. Looking for feedback on this approach.
>
> These patches should apply on top of 4.20-rc5. We have also put code for
> various components here.
>
> https://gitlab.com/virtio-fs
>
> Problem Description
> ===================
> We want to be able to take a directory tree on the host and share it with
> guest[s]. Our goal is to be able to do it in a fast, consistent and secure
> manner. Our primary use case is kata containers, but it should be usable in
> other scenarios as well.
>
> Containers may rely on local file system semantics for shared volumes,
> read-write mounts that multiple containers access simultaneously. File
> system changes must be visible to other containers with the same consistency
> expected of a local file system, including mmap MAP_SHARED.
>
> Existing Solutions
> ==================
> We looked at existing solutions and virtio-9p already provides basic shared
> file system functionality although does not offer local file system semantics,
> causing some workloads and test suites to fail.

Can you elaborate on this? Is this with 9p2000.L? We did quite a lot of
work to make sure the POSIX test suite passes on the 9p file system. Also,
was the mount option set to cache=loose?

-aneesh

On Tue, Feb 12, 2019 at 09:26:48PM +0530, Aneesh Kumar K.V wrote:
> Vivek Goyal <vgoyal@redhat.com> writes:
>
> > Hi,
> >
> > Here are RFC patches for virtio-fs. Looking for feedback on this approach.
> >
> > These patches should apply on top of 4.20-rc5. We have also put code for
> > various components here.
> >
> > https://gitlab.com/virtio-fs
> >
> > Problem Description
> > ===================
> > We want to be able to take a directory tree on the host and share it with
> > guest[s]. Our goal is to be able to do it in a fast, consistent and secure
> > manner. Our primary use case is kata containers, but it should be usable in
> > other scenarios as well.
> >
> > Containers may rely on local file system semantics for shared volumes,
> > read-write mounts that multiple containers access simultaneously. File
> > system changes must be visible to other containers with the same consistency
> > expected of a local file system, including mmap MAP_SHARED.
> >
> > Existing Solutions
> > ==================
> > We looked at existing solutions and virtio-9p already provides basic shared
> > file system functionality although does not offer local file system semantics,
> > causing some workloads and test suites to fail.
>
> Can you elaborate on this? Is this with 9p2000.L? We did quite a lot of
> work to make sure the POSIX test suite passes on the 9p file system. Also,
> was the mount option set to cache=loose?

Hi Aneesh,

Yes, this is with 9p2000.L and cache=loose. I used the following mount options:

mount -t 9p -o trans=virtio hostShared /mnt/virtio-9p/ -oversion=9p2000.L,posixacl,cache=loose

We noticed primarily two issues.

- I ran pjdfstests and a lot of them are failing. I think the Kata Containers
  folks also experienced pjdfstests failures. I have not looked into the
  details of why they fail.

- We thought mmap(MAP_SHARED) will not work with virtio-9p when two clients
  are running in two different VMs and have mapped the same file with
  MAP_SHARED.

Having said that, the biggest concern with virtio-9p seems to be performance.
We are looking for ways to improve performance with virtio-fs. We are hoping
DAX can provide faster data access, and the FUSE protocol itself seems to be
faster (in preliminary testing results).

Thanks
Vivek
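
For reference, the local file system behavior behind the mmap(MAP_SHARED) point above can be shown on a single host. This is a minimal sketch, assuming a Unix-like system; the function name `shared_mapping_demo` is hypothetical, and nothing here exercises virtio-9p or virtio-fs. It only shows what containers expect locally: a write through one shared mapping is visible through another mapping of the same file.

```python
import mmap
import os
import tempfile


def shared_mapping_demo():
    """Map the same file twice with MAP_SHARED and check that a write made
    through one mapping is visible through the other."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, mmap.PAGESIZE)      # give the file one page of data
        fd2 = os.open(path, os.O_RDWR)       # second, independent descriptor
        try:
            writer = mmap.mmap(fd, mmap.PAGESIZE, flags=mmap.MAP_SHARED)
            reader = mmap.mmap(fd2, mmap.PAGESIZE, flags=mmap.MAP_SHARED)
            writer[0:5] = b"hello"           # store through the first mapping
            seen = bytes(reader[0:5])        # load through the second mapping
            writer.close()
            reader.close()
            return seen
        finally:
            os.close(fd2)
    finally:
        os.close(fd)
        os.unlink(path)
```

The concern raised in the thread is that this guarantee may not hold when the two mappings live in different VMs sharing a file over virtio-9p.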