From patchwork Sun Feb 4 23:05:30 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dan Williams X-Patchwork-Id: 10199541 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id E827E60318 for ; Sun, 4 Feb 2018 23:14:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DE39528437 for ; Sun, 4 Feb 2018 23:14:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D0EB128451; Sun, 4 Feb 2018 23:14:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.1 Received: from ml01.01.org (ml01.01.org [198.145.21.10]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 5A42A28437 for ; Sun, 4 Feb 2018 23:14:38 +0000 (UTC) Received: from [127.0.0.1] (localhost [IPv6:::1]) by ml01.01.org (Postfix) with ESMTP id 1AF8D223AF83F; Sun, 4 Feb 2018 15:08:57 -0800 (PST) X-Original-To: linux-nvdimm@lists.01.org Delivered-To: linux-nvdimm@lists.01.org Received-SPF: Pass (sender SPF authorized) identity=mailfrom; client-ip=134.134.136.100; helo=mga07.intel.com; envelope-from=dan.j.williams@intel.com; receiver=linux-nvdimm@lists.01.org Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ml01.01.org (Postfix) with ESMTPS id 5827C2237A4DE for ; Sun, 4 Feb 2018 15:08:55 -0800 (PST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 04 Feb 2018 15:14:36 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.46,462,1511856000"; d="scan'208";a="31943670" Received: from dwillia2-desk3.jf.intel.com (HELO dwillia2-desk3.amr.corp.intel.com) ([10.54.39.16]) by orsmga002.jf.intel.com with ESMTP; 04 Feb 2018 15:14:36 -0800 Subject: [PATCH 3/3] vfio: disable filesystem-dax page pinning From: Dan Williams To: alex.williamson@redhat.com Date: Sun, 04 Feb 2018 15:05:30 -0800 Message-ID: <151778553083.7139.6601964812589807125.stgit@dwillia2-desk3.amr.corp.intel.com> In-Reply-To: <151778551496.7139.17808629759104553625.stgit@dwillia2-desk3.amr.corp.intel.com> References: <151778551496.7139.17808629759104553625.stgit@dwillia2-desk3.amr.corp.intel.com> User-Agent: StGit/0.17.1-9-g687f MIME-Version: 1.0 X-BeenThere: linux-nvdimm@lists.01.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: "Linux-nvdimm developer list." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michal Hocko , jack@suse.cz, kvm@vger.kernel.org, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, linux-fsdevel@vger.kernel.org, hch@lst.de Errors-To: linux-nvdimm-bounces@lists.01.org Sender: "Linux-nvdimm" X-Virus-Scanned: ClamAV using ClamSMTP Filesystem-DAX is incompatible with 'longterm' page pinning. Without page cache indirection a DAX mapping maps filesystem blocks directly. This means that the filesystem must not modify a file's block map while any page in a mapping is pinned. In order to prevent the situation of userspace holding of filesystem operations indefinitely, disallow 'longterm' Filesystem-DAX mappings. RDMA has the same conflict and the plan there is to add a 'with lease' mechanism to allow the kernel to notify userspace that the mapping is being torn down for block-map maintenance. Perhaps something similar can be put in place for vfio. Note that xfs and ext4 still report: "DAX enabled. Warning: EXPERIMENTAL, use at your own risk" ...at mount time, and resolving the dax-dma-vs-truncate problem is one of the last hurdles to remove that designation. Cc: Alex Williamson Cc: Michal Hocko Cc: Christoph Hellwig Cc: kvm@vger.kernel.org Cc: Reported-by: Haozhong Zhang Fixes: d475c6346a38 ("dax,ext2: replace XIP read and write with DAX I/O") Signed-off-by: Dan Williams Acked-by: Alex Williamson --- drivers/vfio/vfio_iommu_type1.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index e30e29ae4819..45657e2b1ff7 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -338,11 +338,12 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr, { struct page *page[1]; struct vm_area_struct *vma; + struct vm_area_struct *vmas[1]; int ret; if (mm == current->mm) { - ret = get_user_pages_fast(vaddr, 1, !!(prot & IOMMU_WRITE), - page); + ret = get_user_pages_longterm(vaddr, 1, !!(prot & IOMMU_WRITE), + page, vmas); } else { unsigned int flags = 0; @@ -351,7 +352,18 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr, down_read(&mm->mmap_sem); ret = get_user_pages_remote(NULL, mm, vaddr, 1, flags, page, - NULL, NULL); + vmas, NULL); + /* + * The lifetime of a vaddr_get_pfn() page pin is + * userspace-controlled. In the fs-dax case this could + * lead to indefinite stalls in filesystem operations. + * Disallow attempts to pin fs-dax pages via this + * interface. + */ + if (ret > 0 && vma_is_fsdax(vmas[0])) { + ret = -EOPNOTSUPP; + put_page(page[0]); + } up_read(&mm->mmap_sem); }