Message ID: 20190219115136.29952-1-boaz@plexistor.com
Series:     zuf: ZUFS Zero-copy User-mode FileSystem
On Tue, Feb 19, 2019 at 01:51:19PM +0200, Boaz Harrosh wrote:
> Please see first patch for License of this project
>
> Current status: There are a couple of trivial open-source filesystem
> implementations and a full blown proprietary implementation from Netapp.

I regard this patchset as being an attempt to avoid your obligations
under the GPL. As such, I will not be reviewing this code and I oppose
its inclusion.
On 19/02/19 14:15, Matthew Wilcox wrote:
> On Tue, Feb 19, 2019 at 01:51:19PM +0200, Boaz Harrosh wrote:
>> Please see first patch for License of this project
>>
>> Current status: There are a couple of trivial open-source filesystem
>> implementations and a full blown proprietary implementation from Netapp.
>
> I regard this patchset as being an attempt to avoid your obligations
> under the GPL. As such, I will not be reviewing this code and I oppose
> its inclusion.

Dearest Matthew,

One day we'll sit over a beer and you'll explain it to me. I do trust
your opinion, but I do not understand. Specifically, the "full blown
proprietary implementation from Netapp" above does not break the GPL at
all. Parts of it are written in languages alien to the Kernel, and parts
use user-mode libs and code IP that cannot live in the Kernel.

At the beginning we had code to inject the FS into the application of
choice via ld.so, so that only selected apps, like a DB, would have a
view of the filesystem. But you can imagine what a nightmare that is for
IT. Being POSIX under the Kernel means so much less reinventing the
wheel: backup, disaster-recovery, cloud ...

Now, if you actually look at the code submitted, you will see that we
use very little of the Kernel. For comparison, FUSE uses the Kernel much
more heavily: page-cache, Kernel reclaimers, smart write-back, the lot.
In ZUFS we take the uppermost interfaces and send them downstream as is.
Wherever there is depth of stack, we take the topmost level and push it
to the server as is, completely synchronously with the app threads.

The only real novelty in this project, something completely new in this
submission, is the RPC we invented here, which uses per-CPU techniques
to reach a kind of performance never seen before between two processes.

You are a Kernel contributor; you have IP in the Kernel. Your opinion is
very important to me and to Netapp. Please point me to the areas where
you feel I have stepped on your IP and have not respected the GPL, and I
will very much want to fix it.

Or maybe my sin is that I am too successful? Is the GPL guarded by
speed? I mean, FUSE is already committing all these sins, as are other
subsystems that bridge Kernel functionality to user-mode. There are
user-mode "drivers" all over the place, but they are all so slow that a
serious FS or server needs to sit in the Kernel. With zufs we can now
delegate to user-mode: the kernel becomes a micro-kernel, a very fast
bridge, and moves out of the way, creating space for serious servers to
sit in userland.

To summarize: I take your statement very seriously. Please state what
service of the GPLed Kernel I am exposing and circumventing, and I will
want to fix it ASAP. My philosophy was to take the POSIX interfaces as
high as possible and shove them to userland, in a very fast RPC manner
that I invented. If there are areas where I am not doing so, please show
me.

Best regards
Boaz
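To make the per-CPU RPC idea above more concrete, here is a toy
user-space sketch of the hand-off pattern being described: one channel
per CPU, a server thread pinned to each CPU, and a fully synchronous
call from the application thread. Everything here (zt_channel, zt_call,
zt_server) is a hypothetical illustration of the concept, not the actual
ZUFS ABI or code; the real implementation lives in the kernel and uses
its own primitives.

/* Toy model of a per-CPU synchronous request channel (cc -pthread). */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdbool.h>
#include <stdio.h>

struct zt_channel {              /* one per CPU: no cross-CPU contention */
	pthread_mutex_t lock;
	pthread_cond_t  wake;    /* app -> server: request posted */
	pthread_cond_t  done;    /* server -> app: reply ready */
	bool has_req;
	int  opcode;             /* stand-in for the real request payload */
	int  result;
};

static struct zt_channel channels[64];   /* NR_CPUS stand-in; toy only */

/* Application side: post a request on this CPU's channel and wait. */
static int zt_call(int opcode)
{
	struct zt_channel *ch = &channels[sched_getcpu()];
	int result;

	pthread_mutex_lock(&ch->lock);
	ch->opcode = opcode;
	ch->has_req = true;
	pthread_cond_signal(&ch->wake);          /* hand off to the server */
	while (ch->has_req)
		pthread_cond_wait(&ch->done, &ch->lock);  /* synchronous */
	result = ch->result;
	pthread_mutex_unlock(&ch->lock);
	return result;
}

/* Server side: one thread per CPU, pinned, serving only its channel. */
static void *zt_server(void *arg)
{
	struct zt_channel *ch = arg;
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET((int)(ch - channels), &set);
	pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

	pthread_mutex_lock(&ch->lock);
	for (;;) {
		while (!ch->has_req)
			pthread_cond_wait(&ch->wake, &ch->lock);
		ch->result = ch->opcode;         /* "execute" the request */
		ch->has_req = false;
		pthread_cond_signal(&ch->done);
	}
	return NULL;
}

int main(void)
{
	cpu_set_t set;
	pthread_t tid;
	int i;

	for (i = 0; i < 64; i++) {
		pthread_mutex_init(&channels[i].lock, NULL);
		pthread_cond_init(&channels[i].wake, NULL);
		pthread_cond_init(&channels[i].done, NULL);
	}
	CPU_ZERO(&set);
	CPU_SET(0, &set);        /* keep this demo on CPU 0 end to end */
	pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
	pthread_create(&tid, NULL, zt_server, &channels[0]);
	printf("reply: %d\n", zt_call(42));
	return 0;
}

The point of the pattern is that, with one channel and one pinned server
thread per CPU, a request never migrates between CPUs and never contends
with other CPUs' traffic.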
From: Boaz Harrosh <boazh@netapp.com>

I would like to present the ZUFS file system, and the Kernel part of the
code, in this patchset.

The Kernel code presented here can be found at:
  https://github.com/NetApp/zufs-zuf
And the User-mode Server + example FSs here:
  https://github.com/NetApp/zufs-zus

ZUFS stands for Zero-copy User-mode FS:
* It is geared towards true zero-copy, end to end, of both data and
  meta data.
* It is geared towards very *low latency*, very high CPU locality, and
  lock-less parallelism.
* Synchronous operations (for low latency)
* NUMA awareness

Short description:
ZUFS is a from-scratch implementation of a filesystem-in-user-space
which tries to address the above goals. From the get-go it is aimed at
pmem-based FSs, but it can easily support other types of FSs that can
utilize the ~10x latency and parallelism improvements. The novelty of
this project is that the interface is designed, down to the ABI, with a
modern multi-core NUMA machine in mind, so as to reach these goals.

Please see the first patch for the License of this project.

Current status: There are a couple of trivial open-source filesystem
implementations and a full-blown proprietary implementation from
Netapp. Together with the Kernel module submitted here, the
User-mode-Server, and the zusFS User-mode plugins, this code passes
Netapp QA, including xfstests + internal QA tests, and was released to
customers as Maxdata 1.2. So it is very stable.

In the git repository above there is also a backport for RHEL 7.6,
including rpm packages for the Kernel and Server components.
(Evaluation licenses of Maxdata 1.2 are also available for developers;
please contact Amit Golander <Amit.Golander@netapp.com> if you need
one.)

Just to get some points across: as I said, this project is all about
performance and low latency. Below are some results I have run:

[fuse]
threads  wr_iops  wr_bw    wr_lat
1        33606    134424   26.53226
2        57056    228224   30.38476
3        73142    292571   35.75727
4        88667    354668   40.12783
5        102280   409122   42.13261
6        110122   440488   48.29697
7        116561   466245   53.98572
8        129134   516539   55.6134

[fuse-splice]
threads  wr_iops  wr_bw    wr_lat
1        39670    158682   21.8399
2        51100    204400   34.63294
3        62385    249542   39.28847
4        75220    300882   47.42344
5        84522    338088   52.97299
6        93042    372168   57.40804
7        97706    390825   63.04435
8        98034    392137   73.24263

[xfs-dax]
threads  wr_iops  wr_bw    wr_lat
1        19449    77799    48.03282
2        37704    150819   37.2343
3        55415    221663   30.59375
4        72285    289142   26.08636
5        90348    361392   23.89037
6        103696   414787   22.38045
7        120638   482552   21.38869
8        134157   536630   21.1426

[Maxdata-1.2-zufs] [*1]
threads  wr_iops  wr_bw    wr_lat
1        57506    230026   14.387113
2        98624    394498   16.790232
3        142276   569106   17.344622
4        187984   751936   17.527123
5        190304   761219   19.504314
6        221407   885628   20.862000
7        211579   846316   23.262040
8        246029   984116   24.630604

[*1] These good results are with an mm patch applied which introduces a
VM_LOCAL_CPU flag that eliminates zap_vma_ptes() from scheduling on all
CPUs when creating a per-CPU VMA. This patch was not accepted by the
Linux Kernel community and is not presented in this patchset. (The
patch is available for review on demand.) But a few weeks from now I
will submit some incremental changes to the code which will return the
numbers to the above, and even better for some benchmarks, without the
mm patch.

I used an 8-way KVM-qemu guest with 2 NUMA nodes, running fio with 4k
random writes, O_DIRECT | O_SYNC, to a DRAM-simulated pmem
(memmap=! at grub). The fuse-fs was a memcpy of the same 4k, a null-FS.
fio was then run with more and more threads (see the threads column) to
test for scalability.
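As a hedged sketch of what the zero-copy hand-off and the [*1] note are
about: conceptually, the kernel inserts the application's pages into the
server's per-CPU VMA for the duration of one synchronous operation and
zaps them afterwards. vm_insert_page() and zap_vma_ptes() are the real
kernel helpers; the zt_* names and structure below are illustrative
only, not the actual zuf-core code.

#include <linux/mm.h>

struct zt_thread {                  /* hypothetical: one server thread per CPU */
	struct vm_area_struct *vma; /* server mapping, private to this CPU */
};

/* Map the application's pages into the server's per-CPU VMA. */
static int zt_map_app_pages(struct zt_thread *zt,
			    struct page **pages, unsigned int nr)
{
	unsigned long addr = zt->vma->vm_start;
	unsigned int i;
	int err;

	for (i = 0; i < nr; ++i, addr += PAGE_SIZE) {
		err = vm_insert_page(zt->vma, addr, pages[i]);
		if (unlikely(err))
			return err;
	}
	return 0;
}

/* Tear the mapping down once the server has replied. */
static void zt_unmap_app_pages(struct zt_thread *zt, unsigned int nr)
{
	/*
	 * Because this VMA is only ever touched by the one server thread
	 * pinned to this CPU, a CPU-local TLB invalidate would suffice;
	 * stock zap_vma_ptes() still performs the global shootdown, which
	 * is the cost the rejected VM_LOCAL_CPU patch set out to avoid.
	 */
	zap_vma_ptes(zt->vma, zt->vma->vm_start, nr * PAGE_SIZE);
}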
We are still >2x slower than I would like (compared to an in-kernel
pmem-based FS). But I believe I can shave off another 1-2 us by further
optimizing the app-to-server thread switch, by developing a new
scheduler-object so as to avoid going through the scheduler altogether
(and its locks) when switching VMs. (Currently using a couple of
wait_queue_head_t with wait_event() calls; see relay.h in the patches,
and the simplified sketch after the diffstat below.)

Please review and ask any question, big or trivial. I would love to
iron out this code and submit it upstream.

Thank you for reading
Boaz

~~~~~~~~~~~~~~~~~~

Boaz Harrosh (17):
  fs: Add the ZUF filesystem to the build + License
  zuf: Preliminary Documentation
  zuf: zuf-rootfs
  zuf: zuf-core The ZTs
  zuf: Multy Devices
  zuf: mounting
  zuf: Namei and directory operations
  zuf: readdir operation
  zuf: symlink
  zuf: More file operation
  zuf: Write/Read implementation
  zuf: mmap & sync
  zuf: ioctl implementation
  zuf: xattr implementation
  zuf: ACL support
  zuf: Special IOCTL fadvise (TODO)
  zuf: Support for dynamic-debug of zusFSs

 Documentation/filesystems/zufs.txt |  351 ++++
 fs/Kconfig                         |    1 +
 fs/Makefile                        |    1 +
 fs/zuf/Kconfig                     |   23 +
 fs/zuf/Makefile                    |   23 +
 fs/zuf/_extern.h                   |  166 ++++
 fs/zuf/_pr.h                       |   62 ++
 fs/zuf/acl.c                       |  281 +++++++
 fs/zuf/directory.c                 |  167 ++++
 fs/zuf/file.c                      |  527 ++++++++++
 fs/zuf/inode.c                     |  648 ++++++++++++
 fs/zuf/ioctl.c                     |  306 +++++++
 fs/zuf/md.c                        |  761 +++++++++++++++
 fs/zuf/md.h                        |  318 +++++++
 fs/zuf/md_def.h                    |  145 ++++
 fs/zuf/mmap.c                      |  336 ++++++++
 fs/zuf/module.c                    |   28 +
 fs/zuf/namei.c                     |  435 ++++++++++
 fs/zuf/relay.h                     |   88 ++
 fs/zuf/rw.c                        |  705 ++++++++++++++
 fs/zuf/super.c                     |  771 +++++++++++++++
 fs/zuf/symlink.c                   |   74 ++
 fs/zuf/t1.c                        |  138 +++
 fs/zuf/t2.c                        |  375 +++++++++
 fs/zuf/t2.h                        |   68 ++
 fs/zuf/xattr.c                     |  310 +++++++
 fs/zuf/zuf-core.c                  | 1257 ++++++++++++++++++++
 fs/zuf/zuf-root.c                  |  431 ++++++++++
 fs/zuf/zuf.h                       |  414 +++++++++
 fs/zuf/zus_api.h                   |  869 ++++++++++++++++
 30 files changed, 10079 insertions(+)
 create mode 100644 Documentation/filesystems/zufs.txt
 create mode 100644 fs/zuf/Kconfig
 create mode 100644 fs/zuf/Makefile
 create mode 100644 fs/zuf/_extern.h
 create mode 100644 fs/zuf/_pr.h
 create mode 100644 fs/zuf/acl.c
 create mode 100644 fs/zuf/directory.c
 create mode 100644 fs/zuf/file.c
 create mode 100644 fs/zuf/inode.c
 create mode 100644 fs/zuf/ioctl.c
 create mode 100644 fs/zuf/md.c
 create mode 100644 fs/zuf/md.h
 create mode 100644 fs/zuf/md_def.h
 create mode 100644 fs/zuf/mmap.c
 create mode 100644 fs/zuf/module.c
 create mode 100644 fs/zuf/namei.c
 create mode 100644 fs/zuf/relay.h
 create mode 100644 fs/zuf/rw.c
 create mode 100644 fs/zuf/super.c
 create mode 100644 fs/zuf/symlink.c
 create mode 100644 fs/zuf/t1.c
 create mode 100644 fs/zuf/t2.c
 create mode 100644 fs/zuf/t2.h
 create mode 100644 fs/zuf/xattr.c
 create mode 100644 fs/zuf/zuf-core.c
 create mode 100644 fs/zuf/zuf-root.c
 create mode 100644 fs/zuf/zuf.h
 create mode 100644 fs/zuf/zus_api.h
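For reference, the relay mentioned above (a couple of wait_queue_head_t
with wait_event() calls) boils down to the following simplified sketch.
Names and details are illustrative; the real fs/zuf/relay.h in the
series also handles the memory-ordering and teardown corner cases that
this sketch omits.

#include <linux/wait.h>
#include <linux/types.h>

/* Simplified app<->server relay built from two wait queues. */
struct relay {
	wait_queue_head_t app_wq;  /* application sleeps here for the reply */
	wait_queue_head_t srv_wq;  /* server sleeps here for the next request */
	bool app_waiting;
	bool srv_waiting;
};

static inline void relay_init(struct relay *r)
{
	init_waitqueue_head(&r->app_wq);
	init_waitqueue_head(&r->srv_wq);
}

/* Application side: wake the server thread, sleep until it is done. */
static inline void relay_app_call(struct relay *r)
{
	r->app_waiting = true;
	r->srv_waiting = false;
	wake_up(&r->srv_wq);
	wait_event(r->app_wq, !r->app_waiting);
}

/* Server side: complete the current request, sleep for the next one. */
static inline void relay_srv_done_and_wait(struct relay *r)
{
	r->srv_waiting = true;
	r->app_waiting = false;
	wake_up(&r->app_wq);
	wait_event(r->srv_wq, !r->srv_waiting);
}

Going through wait_event()/wake_up() still takes the scheduler's locks
on every hand-off; the scheduler-object idea above aims to cut exactly
that cost out of the switch.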