From patchwork Fri Jan 28 00:25:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727598 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4F41C43219 for ; Fri, 28 Jan 2022 00:26:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344531AbiA1A02 (ORCPT ); Thu, 27 Jan 2022 19:26:28 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46948 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236199AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=V/05GzOrgOl/UyUBKWfOm8imWYPgACNjFeMcbOGPBpc=; b=tLr1nf2CgmrXo3UylgDfejDLSA e7xAEnjM8MryfrJln0g42DeGCO4Tau2qcnIyG0UAgpBgHzPO155HM9LjSV55iVUuzjEEw/Aw0kTPf VoJL03pfwmN1Z9/T5DG4l3j86mVC9N/D0lVn2KL/Fre6jUPLw23Ngw4A2wNNmC+Cj7XRA3I29ze9G +/3EKIHzq2e29X3Gi5t4Ar5qNunlgnF/Jn2SL3QEX2T8gRsF9UlXfYMUHtDay/KcBT+wL1lQ/ToF1 Zkql/xSSCKKCXzk8vdecuhOKVjmehApGrLxNwOwyAj8sVjRx6Ycgw1RCx9yNmydD9HwLubB9aHPm/ h739BC5Q==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5p-005OcX-U8; Thu, 27 Jan 2022 17:26:23 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5m-0001cD-7l; Thu, 27 Jan 2022 17:26:18 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Alex Sierra , Theodore Ts'o , "Darrick J . Wong" Date: Thu, 27 Jan 2022 17:25:51 -0700 Message-Id: <20220128002614.6136-2-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, christian.koenig@amd.com, alex.sierra@amd.com, tytso@mit.edu, helgaas@kernel.org, djwong@kernel.org X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 01/24] ext4/xfs: add page refcount helper X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org From: Ralph Campbell There are several places where ZONE_DEVICE struct pages assume a reference count == 1 means the page is idle and free. Instead of open coding this, add a helper function to hide this detail. Signed-off-by: Ralph Campbell Signed-off-by: Alex Sierra Reviewed-by: Christoph Hellwig Acked-by: Theodore Ts'o Acked-by: Darrick J. Wong Reviewed-by: Chaitanya Kulkarni --- fs/dax.c | 4 ++-- fs/ext4/inode.c | 5 +---- fs/fuse/dax.c | 4 +--- fs/xfs/xfs_file.c | 4 +--- include/linux/dax.h | 10 ++++++++++ 5 files changed, 15 insertions(+), 12 deletions(-) diff --git a/fs/dax.c b/fs/dax.c index cd03485867a7..d9b856cf6436 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -369,7 +369,7 @@ static void dax_disassociate_entry(void *entry, struct address_space *mapping, for_each_mapped_pfn(entry, pfn) { struct page *page = pfn_to_page(pfn); - WARN_ON_ONCE(trunc && page_ref_count(page) > 1); + WARN_ON_ONCE(trunc && !dax_page_unused(page)); WARN_ON_ONCE(page->mapping && page->mapping != mapping); page->mapping = NULL; page->index = 0; @@ -383,7 +383,7 @@ static struct page *dax_busy_page(void *entry) for_each_mapped_pfn(entry, pfn) { struct page *page = pfn_to_page(pfn); - if (page_ref_count(page) > 1) + if (!dax_page_unused(page)) return page; } return NULL; diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 5f79d265d06a..99e3ed7bcd06 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3899,10 +3899,7 @@ int ext4_break_layouts(struct inode *inode) if (!page) return 0; - error = ___wait_var_event(&page->_refcount, - atomic_read(&page->_refcount) == 1, - TASK_INTERRUPTIBLE, 0, 0, - ext4_wait_dax_page(inode)); + error = dax_wait_page(inode, page, ext4_wait_dax_page); } while (error == 0); return error; diff --git a/fs/fuse/dax.c b/fs/fuse/dax.c index 182b24a14804..f8b3f30d2303 100644 --- a/fs/fuse/dax.c +++ b/fs/fuse/dax.c @@ -676,9 +676,7 @@ static int __fuse_dax_break_layouts(struct inode *inode, bool *retry, return 0; *retry = true; - return ___wait_var_event(&page->_refcount, - atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE, - 0, 0, fuse_wait_dax_page(inode)); + return dax_wait_page(inode, page, fuse_wait_dax_page); } /* dmap_end == 0 leads to unmapping of whole file */ diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 22ad207bedf4..57a440666935 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -859,9 +859,7 @@ xfs_break_dax_layouts( return 0; *retry = true; - return ___wait_var_event(&page->_refcount, - atomic_read(&page->_refcount) == 1, TASK_INTERRUPTIBLE, - 0, 0, xfs_wait_dax_page(inode)); + return dax_wait_page(inode, page, xfs_wait_dax_page); } int diff --git a/include/linux/dax.h b/include/linux/dax.h index 9fc5f99a0ae2..479939b3be40 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -201,6 +201,16 @@ static inline bool dax_mapping(struct address_space *mapping) return mapping->host && IS_DAX(mapping->host); } +static inline bool dax_page_unused(struct page *page) +{ + return page_ref_count(page) == 1; +} + +#define dax_wait_page(_inode, _page, _wait_cb) \ + ___wait_var_event(&(_page)->_refcount, \ + dax_page_unused(_page), \ + TASK_INTERRUPTIBLE, 0, 0, _wait_cb(_inode)) + #ifdef CONFIG_DEV_DAX_HMEM_DEVICES void hmem_register_device(int target_nid, struct resource *r); #else From patchwork Fri Jan 28 00:25:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727600 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EF0B7C433F5 for ; Fri, 28 Jan 2022 00:26:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344535AbiA1A03 (ORCPT ); Thu, 27 Jan 2022 19:26:29 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46946 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235879AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=U7vSOPFFw/mnzHvzDSg8Uaihku8ItnIpN1mcud0fVL4=; b=m3q8Q5z4Xfvua2WxNsEGOK+1JI QwaCMvNl25/IOvUykUm57bn+9Lq6ORFqZJcaNAkukdG+ggqCVFPR09gylNfwcRDykqwWUtunjmObu r2MGL1jyePn0GNfXQly6yyKpqlyfIHS32BJ0BEMHM609iFZaZM6CsccvOjH3JhYW+4S7qnFB7nYXK rPJp0+muqlB8ev8vXJKaRVsk6FT0vMb/D9qPH1OjZAs3orJoNHtxO91tspMfRHFFjaZU34dpNHpde jhKdxf0ZMTQtMn5bCxhnrScpuR6gKQY6qiJv5IY3kW8gIOO/m0CCH7ZKNzERQJJONeP6KSWQZF9Vj 1wTkHRng==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5p-005OcY-Lh; Thu, 27 Jan 2022 17:26:23 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5m-0001cF-CW; Thu, 27 Jan 2022 17:26:18 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Alex Sierra , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:25:52 -0700 Message-Id: <20220128002614.6136-3-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, christian.koenig@amd.com, alex.sierra@amd.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 02/24] mm: remove extra ZONE_DEVICE struct page refcount X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org From: Ralph Campbell ZONE_DEVICE struct pages have an extra reference count that complicates the code for put_page() and several places in the kernel that need to check the reference count to see that a page is not being used (gup, compaction, migration, etc.). Clean up the code so the reference count doesn't need to be treated specially for ZONE_DEVICE. [logang: dropped no longer used section from mm.h including page_is_devmap_managed, rebased on v5.17-rc1 (possibly poorly)] Signed-off-by: Ralph Campbell Signed-off-by: Alex Sierra Signed-off-by: Logan Gunthorpe Reviewed-by: Christoph Hellwig --- arch/powerpc/kvm/book3s_hv_uvmem.c | 2 +- drivers/gpu/drm/nouveau/nouveau_dmem.c | 2 +- fs/dax.c | 4 +- include/linux/dax.h | 2 +- include/linux/memremap.h | 7 +-- include/linux/mm.h | 44 ---------------- lib/test_hmm.c | 2 +- mm/internal.h | 8 +++ mm/memcontrol.c | 6 +-- mm/memremap.c | 70 +++++++------------------- mm/migrate.c | 5 -- mm/page_alloc.c | 3 ++ mm/swap.c | 45 ++--------------- 13 files changed, 46 insertions(+), 154 deletions(-) diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c index e414ca44839f..ec9b6f08943b 100644 --- a/arch/powerpc/kvm/book3s_hv_uvmem.c +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c @@ -712,7 +712,7 @@ static struct page *kvmppc_uvmem_get_page(unsigned long gpa, struct kvm *kvm) dpage = pfn_to_page(uvmem_pfn); dpage->zone_device_data = pvt; - get_page(dpage); + init_page_count(dpage); lock_page(dpage); return dpage; out_clear: diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c index 3828aafd3ac4..24cae839e5a4 100644 --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c @@ -324,7 +324,7 @@ nouveau_dmem_page_alloc_locked(struct nouveau_drm *drm) return NULL; } - get_page(page); + init_page_count(page); lock_page(page); return page; } diff --git a/fs/dax.c b/fs/dax.c index d9b856cf6436..565ebff24e6e 100644 --- a/fs/dax.c +++ b/fs/dax.c @@ -572,14 +572,14 @@ static void *grab_mapping_entry(struct xa_state *xas, /** * dax_layout_busy_page_range - find first pinned page in @mapping - * @mapping: address space to scan for a page with ref count > 1 + * @mapping: address space to scan for a page with ref count > 0 * @start: Starting offset. Page containing 'start' is included. * @end: End offset. Page containing 'end' is included. If 'end' is LLONG_MAX, * pages from 'start' till the end of file are included. * * DAX requires ZONE_DEVICE mapped pages. These pages are never * 'onlined' to the page allocator so they are considered idle when - * page->count == 1. A filesystem uses this interface to determine if + * page->count == 0. A filesystem uses this interface to determine if * any page in the mapping is busy, i.e. for DMA, or other * get_user_pages() usages. * diff --git a/include/linux/dax.h b/include/linux/dax.h index 479939b3be40..5e80f3092d72 100644 --- a/include/linux/dax.h +++ b/include/linux/dax.h @@ -203,7 +203,7 @@ static inline bool dax_mapping(struct address_space *mapping) static inline bool dax_page_unused(struct page *page) { - return page_ref_count(page) == 1; + return page_ref_count(page) == 0; } #define dax_wait_page(_inode, _page, _wait_cb) \ diff --git a/include/linux/memremap.h b/include/linux/memremap.h index 1fafcc38acba..9965f6c6282a 100644 --- a/include/linux/memremap.h +++ b/include/linux/memremap.h @@ -66,9 +66,10 @@ enum memory_type { struct dev_pagemap_ops { /* - * Called once the page refcount reaches 1. (ZONE_DEVICE pages never - * reach 0 refcount unless there is a refcount bug. This allows the - * device driver to implement its own memory management.) + * Called once the page refcount reaches 0. The reference count + * should be reset to one with init_page_count(page) before reusing + * the page. This allows the device driver to implement its own + * memory management. */ void (*page_free)(struct page *page); diff --git a/include/linux/mm.h b/include/linux/mm.h index e1a84b1e6787..1ab20ed73678 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1094,39 +1094,6 @@ static inline bool is_zone_movable_page(const struct page *page) return page_zonenum(page) == ZONE_MOVABLE; } -#ifdef CONFIG_DEV_PAGEMAP_OPS -void free_devmap_managed_page(struct page *page); -DECLARE_STATIC_KEY_FALSE(devmap_managed_key); - -static inline bool page_is_devmap_managed(struct page *page) -{ - if (!static_branch_unlikely(&devmap_managed_key)) - return false; - if (!is_zone_device_page(page)) - return false; - switch (page->pgmap->type) { - case MEMORY_DEVICE_PRIVATE: - case MEMORY_DEVICE_FS_DAX: - return true; - default: - break; - } - return false; -} - -void put_devmap_managed_page(struct page *page); - -#else /* CONFIG_DEV_PAGEMAP_OPS */ -static inline bool page_is_devmap_managed(struct page *page) -{ - return false; -} - -static inline void put_devmap_managed_page(struct page *page) -{ -} -#endif /* CONFIG_DEV_PAGEMAP_OPS */ - static inline bool is_device_private_page(const struct page *page) { return IS_ENABLED(CONFIG_DEV_PAGEMAP_OPS) && @@ -1223,17 +1190,6 @@ static inline void put_page(struct page *page) { struct folio *folio = page_folio(page); - /* - * For devmap managed pages we need to catch refcount transition from - * 2 to 1, when refcount reach one it means the page is free and we - * need to inform the device driver through callback. See - * include/linux/memremap.h and HMM for details. - */ - if (page_is_devmap_managed(&folio->page)) { - put_devmap_managed_page(&folio->page); - return; - } - folio_put(folio); } diff --git a/lib/test_hmm.c b/lib/test_hmm.c index 767538089a62..e75e34bc16b8 100644 --- a/lib/test_hmm.c +++ b/lib/test_hmm.c @@ -563,7 +563,7 @@ static struct page *dmirror_devmem_alloc_page(struct dmirror_device *mdevice) } dpage->zone_device_data = rpage; - get_page(dpage); + init_page_count(dpage); lock_page(dpage); return dpage; diff --git a/mm/internal.h b/mm/internal.h index d80300392a19..05614c3571ad 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -718,4 +718,12 @@ void vunmap_range_noflush(unsigned long start, unsigned long end); int numa_migrate_prep(struct page *page, struct vm_area_struct *vma, unsigned long addr, int page_nid, int *flags); +#ifdef CONFIG_DEV_PAGEMAP_OPS +void free_zone_device_page(struct page *page); +#else +static inline void free_zone_device_page(struct page *page) +{ +} +#endif + #endif /* __MM_INTERNAL_H */ diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 09d342c7cbd0..8da3f1b23382 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5509,11 +5509,7 @@ static struct page *mc_handle_swap_pte(struct vm_area_struct *vma, */ if (is_device_private_entry(ent)) { page = pfn_swap_entry_to_page(ent); - /* - * MEMORY_DEVICE_PRIVATE means ZONE_DEVICE page and which have - * a refcount of 1 when free (unlike normal page) - */ - if (!page_ref_add_unless(page, 1, 1)) + if (!get_page_unless_zero(page)) return NULL; return page; } diff --git a/mm/memremap.c b/mm/memremap.c index 6aa5f0c2d11f..0fc8b85d792e 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -12,6 +12,7 @@ #include #include #include +#include "internal.h" static DEFINE_XARRAY(pgmap_array); @@ -37,32 +38,6 @@ unsigned long memremap_compat_align(void) EXPORT_SYMBOL_GPL(memremap_compat_align); #endif -#ifdef CONFIG_DEV_PAGEMAP_OPS -DEFINE_STATIC_KEY_FALSE(devmap_managed_key); -EXPORT_SYMBOL(devmap_managed_key); - -static void devmap_managed_enable_put(struct dev_pagemap *pgmap) -{ - if (pgmap->type == MEMORY_DEVICE_PRIVATE || - pgmap->type == MEMORY_DEVICE_FS_DAX) - static_branch_dec(&devmap_managed_key); -} - -static void devmap_managed_enable_get(struct dev_pagemap *pgmap) -{ - if (pgmap->type == MEMORY_DEVICE_PRIVATE || - pgmap->type == MEMORY_DEVICE_FS_DAX) - static_branch_inc(&devmap_managed_key); -} -#else -static void devmap_managed_enable_get(struct dev_pagemap *pgmap) -{ -} -static void devmap_managed_enable_put(struct dev_pagemap *pgmap) -{ -} -#endif /* CONFIG_DEV_PAGEMAP_OPS */ - static void pgmap_array_delete(struct range *range) { xa_store_range(&pgmap_array, PHYS_PFN(range->start), PHYS_PFN(range->end), @@ -102,23 +77,12 @@ static unsigned long pfn_end(struct dev_pagemap *pgmap, int range_id) return (range->start + range_len(range)) >> PAGE_SHIFT; } -static unsigned long pfn_next(struct dev_pagemap *pgmap, unsigned long pfn) -{ - if (pfn % (1024 << pgmap->vmemmap_shift)) - cond_resched(); - return pfn + pgmap_vmemmap_nr(pgmap); -} - static unsigned long pfn_len(struct dev_pagemap *pgmap, unsigned long range_id) { return (pfn_end(pgmap, range_id) - pfn_first(pgmap, range_id)) >> pgmap->vmemmap_shift; } -#define for_each_device_pfn(pfn, map, i) \ - for (pfn = pfn_first(map, i); pfn < pfn_end(map, i); \ - pfn = pfn_next(map, pfn)) - static void pageunmap_range(struct dev_pagemap *pgmap, int range_id) { struct range *range = &pgmap->ranges[range_id]; @@ -147,13 +111,12 @@ static void pageunmap_range(struct dev_pagemap *pgmap, int range_id) void memunmap_pages(struct dev_pagemap *pgmap) { - unsigned long pfn; int i; percpu_ref_kill(&pgmap->ref); for (i = 0; i < pgmap->nr_range; i++) - for_each_device_pfn(pfn, pgmap, i) - put_page(pfn_to_page(pfn)); + percpu_ref_put_many(&pgmap->ref, pfn_end(pgmap, i) - + pfn_first(pgmap, i)); wait_for_completion(&pgmap->done); percpu_ref_exit(&pgmap->ref); @@ -161,7 +124,6 @@ void memunmap_pages(struct dev_pagemap *pgmap) pageunmap_range(pgmap, i); WARN_ONCE(pgmap->altmap.alloc, "failed to free all reserved pages\n"); - devmap_managed_enable_put(pgmap); } EXPORT_SYMBOL_GPL(memunmap_pages); @@ -350,8 +312,6 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid) if (error) return ERR_PTR(error); - devmap_managed_enable_get(pgmap); - /* * Clear the pgmap nr_range as it will be incremented for each * successfully processed range. This communicates how many @@ -466,16 +426,9 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn, EXPORT_SYMBOL_GPL(get_dev_pagemap); #ifdef CONFIG_DEV_PAGEMAP_OPS -void free_devmap_managed_page(struct page *page) +static void free_device_page(struct page *page) { - /* notify page idle for dax */ - if (!is_device_private_page(page)) { - wake_up_var(&page->_refcount); - return; - } - __ClearPageWaiters(page); - mem_cgroup_uncharge(page_folio(page)); /* @@ -502,4 +455,19 @@ void free_devmap_managed_page(struct page *page) page->mapping = NULL; page->pgmap->ops->page_free(page); } + +void free_zone_device_page(struct page *page) +{ + switch (page->pgmap->type) { + case MEMORY_DEVICE_PRIVATE: + free_device_page(page); + return; + case MEMORY_DEVICE_FS_DAX: + /* notify page idle */ + wake_up_var(&page->_refcount); + return; + default: + return; + } +} #endif /* CONFIG_DEV_PAGEMAP_OPS */ diff --git a/mm/migrate.c b/mm/migrate.c index c7da064b4781..359a698b6b0a 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -341,11 +341,6 @@ static int expected_page_refs(struct address_space *mapping, struct page *page) { int expected_count = 1; - /* - * Device private pages have an extra refcount as they are - * ZONE_DEVICE pages. - */ - expected_count += is_device_private_page(page); if (mapping) expected_count += compound_nr(page) + page_has_private(page); diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 3589febc6d31..c6a766916137 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6589,6 +6589,9 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn, __init_single_page(page, pfn, zone_idx, nid); + /* ZONE_DEVICE pages start with a zero reference count. */ + set_page_count(page, 0); + /* * Mark page reserved as it will need to wait for onlining * phase for it to be fully associated with a zone. diff --git a/mm/swap.c b/mm/swap.c index bcf3ac288b56..f5d53ed7e303 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -115,12 +115,11 @@ static void __put_compound_page(struct page *page) void __put_page(struct page *page) { if (is_zone_device_page(page)) { - put_dev_pagemap(page->pgmap); - /* * The page belongs to the device that created pgmap. Do * not return it to page allocator. */ + free_zone_device_page(page); return; } @@ -925,29 +924,18 @@ void release_pages(struct page **pages, int nr) if (is_huge_zero_page(page)) continue; + if (!put_page_testzero(page)) + continue; + if (is_zone_device_page(page)) { if (lruvec) { unlock_page_lruvec_irqrestore(lruvec, flags); lruvec = NULL; } - /* - * ZONE_DEVICE pages that return 'false' from - * page_is_devmap_managed() do not require special - * processing, and instead, expect a call to - * put_page_testzero(). - */ - if (page_is_devmap_managed(page)) { - put_devmap_managed_page(page); - continue; - } - if (put_page_testzero(page)) - put_dev_pagemap(page->pgmap); + free_zone_device_page(page); continue; } - if (!put_page_testzero(page)) - continue; - if (PageCompound(page)) { if (lruvec) { unlock_page_lruvec_irqrestore(lruvec, flags); @@ -1153,26 +1141,3 @@ void __init swap_setup(void) * _really_ don't want to cluster much more */ } - -#ifdef CONFIG_DEV_PAGEMAP_OPS -void put_devmap_managed_page(struct page *page) -{ - int count; - - if (WARN_ON_ONCE(!page_is_devmap_managed(page))) - return; - - count = page_ref_dec_return(page); - - /* - * devmap page refcounts are 1-based, rather than 0-based: if - * refcount is 1, then the page is free and the refcount is - * stable because nobody holds a reference on the page. - */ - if (count == 1) - free_devmap_managed_page(page); - else if (!count) - __put_page(page); -} -EXPORT_SYMBOL(put_devmap_managed_page); -#endif From patchwork Fri Jan 28 00:25:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727602 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A95C4C433EF for ; Fri, 28 Jan 2022 00:26:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344554AbiA1A0a (ORCPT ); Thu, 27 Jan 2022 19:26:30 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46960 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232332AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=rs0zdLb19gC/hi9gcePRI1HlpCpDH9cKr/rV+O/g6PI=; b=BqqxQzy6I+a6VISPJzx9XqEL9k OP7HDA2YovJANZLSQ7Z4l/US+dxVgLkCa1dQQeu3MorRsjxynTmtGjQ+FWTR/w9EtwT+sW087npWh RpeKkJGFVw0CcB7A8t++0OeVZaCisfZQ6XfWk9NiBLaGP3aUtBLMp4CQrnvredRgsIgP4DWQMIOjN b/f49PK9y6KBoGB36zSvStK9i1ho40q4bBxV65AzIK8GfZ5ftCI5Il/FNIsgj9WJdsjg4lK5Kl3ee KeFEkjYMxkz8zvkGXIn4EqwCUotBUKsZWjOo2psRUDI+OfTVcJANPDDUt5de7J+FW8LquV1P0IGS9 c8eNbSpg==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5p-005OcZ-LW; Thu, 27 Jan 2022 17:26:23 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5m-0001cI-IR; Thu, 27 Jan 2022 17:26:18 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Chaitanya Kulkarni Date: Thu, 27 Jan 2022 17:25:53 -0700 Message-Id: <20220128002614.6136-4-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, jhubbard@nvidia.com, rcampbell@nvidia.com, kch@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 03/24] lib/scatterlist: add flag for indicating P2PDMA segments in an SGL X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Make use of the third free LSB in scatterlist's page_link on 64bit systems. The extra bit will be used by dma_[un]map_sg_p2pdma() to determine when a given SGL segments dma_address points to a PCI bus address. dma_unmap_sg_p2pdma() will need to perform different cleanup when a segment is marked as a bus address. Create a CONFIG_NEED_SG_DMA_BUS_ADDR_FLAG bool which depends on CONFIG_64BIT (so there is space in the page link for the new flag). CONFIG_PCI_P2PDMA will then depend on this so this means PCI P2PDMA will require CONFIG_64BIT. This should be acceptable as the majority of P2PDMA use cases are restricted to newer root complexes and roughly require the extra address space for memory BARs used in the transactions. Signed-off-by: Logan Gunthorpe Reviewed-by: Chaitanya Kulkarni --- drivers/pci/Kconfig | 5 +++++ include/linux/scatterlist.h | 44 ++++++++++++++++++++++++++++++++++++- 2 files changed, 48 insertions(+), 1 deletion(-) diff --git a/drivers/pci/Kconfig b/drivers/pci/Kconfig index d98fafdd0f99..3e837d9e1600 100644 --- a/drivers/pci/Kconfig +++ b/drivers/pci/Kconfig @@ -164,6 +164,11 @@ config PCI_PASID config PCI_P2PDMA bool "PCI peer-to-peer transfer support" depends on ZONE_DEVICE + # + # The need for the scatterlist DMA bus address flag means PCI P2PDMA + # requires 64bit + # + depends on 64BIT select GENERIC_ALLOCATOR help Enableѕ drivers to do PCI peer-to-peer transactions to and from diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h index 7ff9d6386c12..6561ca8aead8 100644 --- a/include/linux/scatterlist.h +++ b/include/linux/scatterlist.h @@ -64,12 +64,24 @@ struct sg_append_table { #define SG_CHAIN 0x01UL #define SG_END 0x02UL +/* + * bit 2 is the third free bit in the page_link on 64bit systems which + * is used by dma_unmap_sg() to determine if the dma_address is a + * bus address when doing P2PDMA. + */ +#ifdef CONFIG_PCI_P2PDMA +#define SG_DMA_BUS_ADDRESS 0x04UL +static_assert(__alignof__(struct page) >= 8); +#else +#define SG_DMA_BUS_ADDRESS 0x00UL +#endif + /* * We overload the LSB of the page pointer to indicate whether it's * a valid sg entry, or whether it points to the start of a new scatterlist. * Those low bits are there for everyone! (thanks mason :-) */ -#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END) +#define SG_PAGE_LINK_MASK (SG_CHAIN | SG_END | SG_DMA_BUS_ADDRESS) static inline unsigned int __sg_flags(struct scatterlist *sg) { @@ -91,6 +103,11 @@ static inline bool sg_is_last(struct scatterlist *sg) return __sg_flags(sg) & SG_END; } +static inline bool sg_is_dma_bus_address(struct scatterlist *sg) +{ + return __sg_flags(sg) & SG_DMA_BUS_ADDRESS; +} + /** * sg_assign_page - Assign a given page to an SG entry * @sg: SG entry @@ -245,6 +262,31 @@ static inline void sg_unmark_end(struct scatterlist *sg) sg->page_link &= ~SG_END; } +/** + * sg_dma_mark_bus address - Mark the scatterlist entry as a bus address + * @sg: SG entryScatterlist + * + * Description: + * Marks the passed in sg entry to indicate that the dma_address is + * a bus address and doesn't need to be unmapped. + **/ +static inline void sg_dma_mark_bus_address(struct scatterlist *sg) +{ + sg->page_link |= SG_DMA_BUS_ADDRESS; +} + +/** + * sg_unmark_pci_p2pdma - Unmark the scatterlist entry as a bus address + * @sg: SG entryScatterlist + * + * Description: + * Clears the bus address mark. + **/ +static inline void sg_dma_unmark_bus_address(struct scatterlist *sg) +{ + sg->page_link &= ~SG_DMA_BUS_ADDRESS; +} + /** * sg_phys - Return physical address of an sg entry * @sg: SG entry From patchwork Fri Jan 28 00:25:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727604 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E0550C4707E for ; Fri, 28 Jan 2022 00:26:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344563AbiA1A0a (ORCPT ); Thu, 27 Jan 2022 19:26:30 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46938 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236048AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=HHTYT5k3QiiDV6LMv4vKtzyhJ9FCFmq5gXJCns7NBlU=; b=nH+QGcWcTfLDJqbFBTkackrqrG IP6cWyAeLl9RuOzP5vD7BPyQU58XFn47tuDx4R7Thwosc+4soLlQVecHvAgk4qyxAjLmiqxSW3I5g SqIvy9WWdkwSYLgfzKnqG+dBpmOVLy4K6320JbcNr95+oWJo7UZV/NITfylMgQ8/RTfCcifnSfZBX r41I8tSoanWnwJpvJUpUw/NlhzAT4jhPRHUIZoBCgMZi//CDOJpWdTYlF2dkrkYW2sa/+iJk+Oy+2 a2caIhrPLspGcwjE+BtlAnMEszIzOsYbKM+4zKJzw5grV9y8OoAAbzhfGXuxs5Cs9EpHAAqLDRLRB NXs+TrNA==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5p-005Oca-Lc; Thu, 27 Jan 2022 17:26:22 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5m-0001cL-MP; Thu, 27 Jan 2022 17:26:18 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Bjorn Helgaas , Chaitanya Kulkarni Date: Thu, 27 Jan 2022 17:25:54 -0700 Message-Id: <20220128002614.6136-5-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, bhelgaas@google.com, jhubbard@nvidia.com, rcampbell@nvidia.com, kch@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 04/24] PCI/P2PDMA: Attempt to set map_type if it has not been set X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Attempt to find the mapping type for P2PDMA pages on the first DMA map attempt if it has not been done ahead of time. Previously, the mapping type was expected to be calculated ahead of time, but if pages are to come from userspace then there's no way to ensure the path was checked ahead of time. This change will calculate the mapping type if it hasn't pre-calculated so it is no longer invalid to call pci_p2pdma_map_sg() before the mapping type is calculated, so drop the WARN_ON when that is he case. Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas Reviewed-by: Chaitanya Kulkarni --- drivers/pci/p2pdma.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 1015274bd2fe..bb8900ce073a 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -848,6 +848,7 @@ static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap, struct pci_dev *provider = to_p2p_pgmap(pgmap)->provider; struct pci_dev *client; struct pci_p2pdma *p2pdma; + int dist; if (!provider->p2pdma) return PCI_P2PDMA_MAP_NOT_SUPPORTED; @@ -864,6 +865,10 @@ static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap, type = xa_to_value(xa_load(&p2pdma->map_types, map_types_idx(client))); rcu_read_unlock(); + + if (type == PCI_P2PDMA_MAP_UNKNOWN) + return calc_map_type_and_dist(provider, client, &dist, true); + return type; } @@ -906,7 +911,7 @@ int pci_p2pdma_map_sg_attrs(struct device *dev, struct scatterlist *sg, case PCI_P2PDMA_MAP_BUS_ADDR: return __pci_p2pdma_map_sg(p2p_pgmap, dev, sg, nents); default: - WARN_ON_ONCE(1); + /* Mapping is not Supported */ return 0; } } From patchwork Fri Jan 28 00:25:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727608 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 33A6EC4321E for ; Fri, 28 Jan 2022 00:26:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235879AbiA1A0l (ORCPT ); Thu, 27 Jan 2022 19:26:41 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47234 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344636AbiA1A0g (ORCPT ); Thu, 27 Jan 2022 19:26:36 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=HHp/udGn7DOd58DraTUExtOtELgyrWSCODGG+didgEo=; b=REQ1q6keF1k3wmIvHrJ2N1tUZF /8akmrb1y4+4Z+n8KDFTApTCuHTzYuTKoSOQ1Hz0hLwVn+cEXd021htW4RxREBjpYLCk2osB2ZkeQ +XpTeIAMs56GK/2+jr7+6JvI1fUbDZTIw2R2iFwKGVs7XFKLRPq56p2eCCIRr6D8Ntmq9vk0nUH55 btpQhL1YhY4uqH2fP0BOBi/X50wctXiZPq6lMW5hWLTy6kOdr2fr/KwgXPBCIG2vM1AwmUaDa/eZR biVZtYLIJTN10PnA5CSNOnpgt61NBiVZBYgA1T+TT8zWfdfCj2i1lSz2xfnXrJtHs/zmzrEsdi9rZ D4pR5lGA==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5w-005Oca-5a; Thu, 27 Jan 2022 17:26:34 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5m-0001cO-QF; Thu, 27 Jan 2022 17:26:18 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Bjorn Helgaas , Jason Gunthorpe , Chaitanya Kulkarni Date: Thu, 27 Jan 2022 17:25:55 -0700 Message-Id: <20220128002614.6136-6-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, bhelgaas@google.com, jhubbard@nvidia.com, rcampbell@nvidia.com, jgg@nvidia.com, kch@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 05/24] PCI/P2PDMA: Expose pci_p2pdma_map_type() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org pci_p2pdma_map_type() will be needed by the dma-iommu map_sg implementation because it will need to determine the mapping type ahead of actually doing the mapping to create the actual IOMMU mapping. Prototypes for this helper are added to dma-map-ops.h as they are only useful to dma map implementations and don't need to pollute the public pci-p2pdma header Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas Reviewed-by: Jason Gunthorpe Reviewed-by: Chaitanya Kulkarni --- drivers/pci/p2pdma.c | 25 +++++++++++++-------- include/linux/dma-map-ops.h | 45 +++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+), 9 deletions(-) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index bb8900ce073a..751d682e66ae 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -10,6 +10,7 @@ #define pr_fmt(fmt) "pci-p2pdma: " fmt #include +#include #include #include #include @@ -20,13 +21,6 @@ #include #include -enum pci_p2pdma_map_type { - PCI_P2PDMA_MAP_UNKNOWN = 0, - PCI_P2PDMA_MAP_NOT_SUPPORTED, - PCI_P2PDMA_MAP_BUS_ADDR, - PCI_P2PDMA_MAP_THRU_HOST_BRIDGE, -}; - struct pci_p2pdma { struct gen_pool *pool; bool p2pmem_published; @@ -841,8 +835,21 @@ void pci_p2pmem_publish(struct pci_dev *pdev, bool publish) } EXPORT_SYMBOL_GPL(pci_p2pmem_publish); -static enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap, - struct device *dev) +/** + * pci_p2pdma_map_type - return the type of mapping that should be used for + * a given device and pgmap + * @pgmap: the pagemap of a page to determine the mapping type for + * @dev: device that is mapping the page + * + * Returns one of: + * PCI_P2PDMA_MAP_NOT_SUPPORTED - The mapping should not be done + * PCI_P2PDMA_MAP_BUS_ADDR - The mapping should use the PCI bus address + * PCI_P2PDMA_MAP_THRU_HOST_BRIDGE - The mapping should be done normally + * using the CPU physical address (in dma-direct) or an IOVA + * mapping for the IOMMU. + */ +enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap, + struct device *dev) { enum pci_p2pdma_map_type type = PCI_P2PDMA_MAP_NOT_SUPPORTED; struct pci_dev *provider = to_p2p_pgmap(pgmap)->provider; diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h index 0d5b06b3a4a6..d693a0e33bac 100644 --- a/include/linux/dma-map-ops.h +++ b/include/linux/dma-map-ops.h @@ -379,4 +379,49 @@ static inline void debug_dma_dump_mappings(struct device *dev) extern const struct dma_map_ops dma_dummy_ops; +enum pci_p2pdma_map_type { + /* + * PCI_P2PDMA_MAP_UNKNOWN: Used internally for indicating the mapping + * type hasn't been calculated yet. Functions that return this enum + * never return this value. + */ + PCI_P2PDMA_MAP_UNKNOWN = 0, + + /* + * PCI_P2PDMA_MAP_NOT_SUPPORTED: Indicates the transaction will + * traverse the host bridge and the host bridge is not in the + * allowlist. DMA Mapping routines should return an error when + * this is returned. + */ + PCI_P2PDMA_MAP_NOT_SUPPORTED, + + /* + * PCI_P2PDMA_BUS_ADDR: Indicates that two devices can talk to + * each other directly through a PCI switch and the transaction will + * not traverse the host bridge. Such a mapping should program + * the DMA engine with PCI bus addresses. + */ + PCI_P2PDMA_MAP_BUS_ADDR, + + /* + * PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: Indicates two devices can talk + * to each other, but the transaction traverses a host bridge on the + * allowlist. In this case, a normal mapping either with CPU physical + * addresses (in the case of dma-direct) or IOVA addresses (in the + * case of IOMMUs) should be used to program the DMA engine. + */ + PCI_P2PDMA_MAP_THRU_HOST_BRIDGE, +}; + +#ifdef CONFIG_PCI_P2PDMA +enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap, + struct device *dev); +#else /* CONFIG_PCI_P2PDMA */ +static inline enum pci_p2pdma_map_type +pci_p2pdma_map_type(struct dev_pagemap *pgmap, struct device *dev) +{ + return PCI_P2PDMA_MAP_NOT_SUPPORTED; +} +#endif /* CONFIG_PCI_P2PDMA */ + #endif /* _LINUX_DMA_MAP_OPS_H */ From patchwork Fri Jan 28 00:25:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727619 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DA11C35273 for ; Fri, 28 Jan 2022 00:26:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344839AbiA1A0z (ORCPT ); Thu, 27 Jan 2022 19:26:55 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47164 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344605AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=9y7uxxCijIQhRgYYWdLYi/EHC9No4LME+9YvR+dIhu0=; b=l4d3EpTtcxL3EaodXSNshKzMis 1/jQH/RITkY0nTE3stsu/LrD8K8tHp5sBnCU4mUgyGT2sccFE9pG+vOSGhS2KQPG9z4RWATIO//8o 7Wii1JtGU5ngbOHFd+Ug+xwKAa767JeAdM9RRtkkBgdq5Qp1nsV4zhTumRTg+ZmuDXJFqi7VAU8ge 0M4tD1wbs1NmBbQ4Y5wufnnmwq7NwOABY7OOiKuVwr2f8JFg1Y0Zt2GyhZouLVvYzcaEHa8YIFQ8g HBJp1H7DxAZwwhQ9WbP6mDyRl30Jkciv8xdaY6XSsOozDBMNYVVORQRpeptIiUeDHoVy0GM40wX8z ZWJkcFhg==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5v-005OcY-KP; Thu, 27 Jan 2022 17:26:28 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5m-0001cR-UW; Thu, 27 Jan 2022 17:26:18 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Bjorn Helgaas Date: Thu, 27 Jan 2022 17:25:56 -0700 Message-Id: <20220128002614.6136-7-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com, bhelgaas@google.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 06/24] PCI/P2PDMA: Introduce helpers for dma_map_sg implementations X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Add pci_p2pdma_map_segment() as a helper for simple dma_map_sg() implementations. It takes an scatterlist segment that must point to a pci_p2pdma struct page and will map it if the mapping requires a bus address. The return value indicates whether the mapping required a bus address or whether the caller still needs to map the segment normally. If the segment should not be mapped, -EREMOTEIO is returned. This helper uses a state structure to track the changes to the pgmap across calls and avoid needing to lookup into the xarray for every page. Also add pci_p2pdma_map_bus_segment() which is useful for IOMMU dma_map_sg() implementations where the sg segment containing the page differs from the sg segment containing the DMA address. Prototypes for these helpers are added to dma-map-ops.h as they are only useful to dma map implementations and don't need to pollute the public pci-p2pdma header. Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas --- drivers/pci/p2pdma.c | 59 +++++++++++++++++++++++++++++++++++++ include/linux/dma-map-ops.h | 21 +++++++++++++ 2 files changed, 80 insertions(+) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 751d682e66ae..2b2cf00664e7 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -945,6 +945,65 @@ void pci_p2pdma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg, } EXPORT_SYMBOL_GPL(pci_p2pdma_unmap_sg_attrs); +/** + * pci_p2pdma_map_segment - map an sg segment determining the mapping type + * @state: State structure that should be declared outside of the for_each_sg() + * loop and initialized to zero. + * @dev: DMA device that's doing the mapping operation + * @sg: scatterlist segment to map + * + * This is a helper to be used by non-IOMMU dma_map_sg() implementations where + * the sg segment is the same for the page_link and the dma_address. + * + * Attempt to map a single segment in an SGL with the PCI bus address. + * The segment must point to a PCI P2PDMA page and thus must be + * wrapped in a is_pci_p2pdma_page(sg_page(sg)) check. + * + * Returns the type of mapping used and maps the page if the type is + * PCI_P2PDMA_MAP_BUS_ADDR. + */ +enum pci_p2pdma_map_type +pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev, + struct scatterlist *sg) +{ + if (state->pgmap != sg_page(sg)->pgmap) { + state->pgmap = sg_page(sg)->pgmap; + state->map = pci_p2pdma_map_type(state->pgmap, dev); + state->bus_off = to_p2p_pgmap(state->pgmap)->bus_offset; + } + + if (state->map == PCI_P2PDMA_MAP_BUS_ADDR) { + sg->dma_address = sg_phys(sg) + state->bus_off; + sg_dma_len(sg) = sg->length; + sg_dma_mark_bus_address(sg); + } + + return state->map; +} + +/** + * pci_p2pdma_map_bus_segment - map an sg segment pre determined to + * be mapped with PCI_P2PDMA_MAP_BUS_ADDR + * @pg_sg: scatterlist segment with the page to map + * @dma_sg: scatterlist segment to assign a DMA address to + * + * This is a helper for iommu dma_map_sg() implementations when the + * segment for the DMA address differs from the segment containing the + * source page. + * + * pci_p2pdma_map_type() must have already been called on the pg_sg and + * returned PCI_P2PDMA_MAP_BUS_ADDR. + */ +void pci_p2pdma_map_bus_segment(struct scatterlist *pg_sg, + struct scatterlist *dma_sg) +{ + struct pci_p2pdma_pagemap *pgmap = to_p2p_pgmap(sg_page(pg_sg)->pgmap); + + dma_sg->dma_address = sg_phys(pg_sg) + pgmap->bus_offset; + sg_dma_len(dma_sg) = pg_sg->length; + sg_dma_mark_bus_address(dma_sg); +} + /** * pci_p2pdma_enable_store - parse a configfs/sysfs attribute store * to enable p2pdma diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h index d693a0e33bac..752f91e5eb5d 100644 --- a/include/linux/dma-map-ops.h +++ b/include/linux/dma-map-ops.h @@ -413,15 +413,36 @@ enum pci_p2pdma_map_type { PCI_P2PDMA_MAP_THRU_HOST_BRIDGE, }; +struct pci_p2pdma_map_state { + struct dev_pagemap *pgmap; + int map; + u64 bus_off; +}; + #ifdef CONFIG_PCI_P2PDMA enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap, struct device *dev); +enum pci_p2pdma_map_type +pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev, + struct scatterlist *sg); +void pci_p2pdma_map_bus_segment(struct scatterlist *pg_sg, + struct scatterlist *dma_sg); #else /* CONFIG_PCI_P2PDMA */ static inline enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap, struct device *dev) { return PCI_P2PDMA_MAP_NOT_SUPPORTED; } +static inline enum pci_p2pdma_map_type +pci_p2pdma_map_segment(struct pci_p2pdma_map_state *state, struct device *dev, + struct scatterlist *sg) +{ + return PCI_P2PDMA_MAP_NOT_SUPPORTED; +} +static inline void pci_p2pdma_map_bus_segment(struct scatterlist *pg_sg, + struct scatterlist *dma_sg) +{ +} #endif /* CONFIG_PCI_P2PDMA */ #endif /* _LINUX_DMA_MAP_OPS_H */ From patchwork Fri Jan 28 00:25:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727618 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FE4FC35271 for ; Fri, 28 Jan 2022 00:26:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344843AbiA1A0z (ORCPT ); Thu, 27 Jan 2022 19:26:55 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47180 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344613AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=3aZ2AYK60CEaIE70gWhhj4/fir+nnPw5quNT7e3qCog=; b=TbtXdvF1neCjSLnA8Y5arFrbEW 7X/js5p1SwkTImoZEymFLHDsRMoXgV9VaQUQR/6s0f1UsJzlIF7Z8FqTJ1ZK2mfP93+kmYBdBDHHa XI4wTZ33KDZyUDOwz+wnhjRlfReo0msjPUyHb8cL/3TGCM79MP8K7cSrxbGpamidLLtozYRagtUFq xaDJeZpSuxVZr2tuGN9hdddXdaGaYTZmt0HgULTITqjAX1SFjLRBwuLjMeylPa1W4OTEmecbmOJsr ChjJuWMVLjqUvwVOpkz2YQvSyyjicCkT5tp4WIw8QaSxoMVqHCWoV0M2o4CqG9SETfDvdeYQ7EdRj f0xTlhuQ==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5v-005OcX-Kr; Thu, 27 Jan 2022 17:26:31 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5n-0001cU-2X; Thu, 27 Jan 2022 17:26:19 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Jason Gunthorpe Date: Thu, 27 Jan 2022 17:25:57 -0700 Message-Id: <20220128002614.6136-8-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, jhubbard@nvidia.com, rcampbell@nvidia.com, jgg@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 07/24] dma-mapping: allow EREMOTEIO return code for P2PDMA transfers X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Add EREMOTEIO error return to dma_map_sgtable() which will be used by .map_sg() implementations that detect P2PDMA pages that the underlying DMA device cannot access. Signed-off-by: Logan Gunthorpe Reviewed-by: Jason Gunthorpe --- kernel/dma/mapping.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c index 9478eccd1c8e..c056a1468189 100644 --- a/kernel/dma/mapping.c +++ b/kernel/dma/mapping.c @@ -197,7 +197,7 @@ static int __dma_map_sg_attrs(struct device *dev, struct scatterlist *sg, if (ents > 0) debug_dma_map_sg(dev, sg, nents, ents, dir, attrs); else if (WARN_ON_ONCE(ents != -EINVAL && ents != -ENOMEM && - ents != -EIO)) + ents != -EIO && ents != -EREMOTEIO)) return -EIO; return ents; @@ -255,6 +255,8 @@ EXPORT_SYMBOL(dma_map_sg_attrs); * complete the mapping. Should succeed if retried later. * -EIO Legacy error code with an unknown meaning. eg. this is * returned if a lower level call returned DMA_MAPPING_ERROR. + * -EREMOTEIO The DMA device cannot access P2PDMA memory specified in + * the sg_table. This will not succeed if retried. */ int dma_map_sgtable(struct device *dev, struct sg_table *sgt, enum dma_data_direction dir, unsigned long attrs) From patchwork Fri Jan 28 00:25:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727616 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 73018C3527D for ; Fri, 28 Jan 2022 00:26:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344644AbiA1A0y (ORCPT ); Thu, 27 Jan 2022 19:26:54 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47184 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344611AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=bX2MPLdONywt9Jg2kBZ4+U7MTlH6jUj7zAM9CGcnTno=; b=niKQOBObQMpHq2+bvomGg6XFgp e1uacA/QTmWjbufbUaQHCaMQmWbnjcZWnSLHUq9QeghzBCGHaPx51JjPpjwQi2gEklYFI7GgCyw7D CNTSkJaGcq3FrfD8eWZC0hG8DPiRZcD756F/qbelT7Bc7Em5rX/s1g1pxSLMW+CG3ACNgvR2otysh F77hKp129+rLZEG054Gymw8ud2/4rdBTZ+Df4k6e3BoqJJsvNquMm7fZqXD1kgwn3dlcYGF44MSu2 CftfO2vzYEunYEwZP5js2Dgy6yn2OaLmG3u/saGDENWjr2+DYN6dHFVEwVmIUmAzYzaPXBBo8q/A0 wXF9HkkA==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5v-005OcZ-0x; Thu, 27 Jan 2022 17:26:28 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5n-0001cX-6K; Thu, 27 Jan 2022 17:26:19 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:25:58 -0700 Message-Id: <20220128002614.6136-9-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 08/24] dma-direct: support PCI P2PDMA pages in dma-direct map_sg X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Add PCI P2PDMA support for dma_direct_map_sg() so that it can map PCI P2PDMA pages directly without a hack in the callers. This allows for heterogeneous SGLs that contain both P2PDMA and regular pages. A P2PDMA page may have three possible outcomes when being mapped: 1) If the data path between the two devices doesn't go through the root port, then it should be mapped with a PCI bus address 2) If the data path goes through the host bridge, it should be mapped normally, as though it were a CPU physical address 3) It is not possible for the two devices to communicate and thus the mapping operation should fail (and it will return -EREMOTEIO). SGL segments that contain PCI bus addresses are marked with sg_dma_mark_pci_p2pdma() and are ignored when unmapped. P2PDMA mappings are also failed if swiotlb needs to be used on the mapping. Signed-off-by: Logan Gunthorpe --- kernel/dma/direct.c | 43 +++++++++++++++++++++++++++++++++++++------ kernel/dma/direct.h | 7 ++++++- 2 files changed, 43 insertions(+), 7 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 50f48e9e4598..975df5f3aaf9 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/direct.c @@ -461,29 +461,60 @@ void dma_direct_sync_sg_for_cpu(struct device *dev, arch_sync_dma_for_cpu_all(); } +/* + * Unmaps segments, except for ones marked as pci_p2pdma which do not + * require any further action as they contain a bus address. + */ void dma_direct_unmap_sg(struct device *dev, struct scatterlist *sgl, int nents, enum dma_data_direction dir, unsigned long attrs) { struct scatterlist *sg; int i; - for_each_sg(sgl, sg, nents, i) - dma_direct_unmap_page(dev, sg->dma_address, sg_dma_len(sg), dir, - attrs); + for_each_sg(sgl, sg, nents, i) { + if (sg_is_dma_bus_address(sg)) + sg_dma_unmark_bus_address(sg); + else + dma_direct_unmap_page(dev, sg->dma_address, + sg_dma_len(sg), dir, attrs); + } } #endif int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents, enum dma_data_direction dir, unsigned long attrs) { - int i; + struct pci_p2pdma_map_state p2pdma_state = {}; + enum pci_p2pdma_map_type map; struct scatterlist *sg; + int i, ret; for_each_sg(sgl, sg, nents, i) { + if (is_pci_p2pdma_page(sg_page(sg))) { + map = pci_p2pdma_map_segment(&p2pdma_state, dev, sg); + switch (map) { + case PCI_P2PDMA_MAP_BUS_ADDR: + continue; + case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: + /* + * Any P2P mapping that traverses the PCI + * host bridge must be mapped with CPU physical + * address and not PCI bus addresses. This is + * done with dma_direct_map_page() below. + */ + break; + default: + ret = -EREMOTEIO; + goto out_unmap; + } + } + sg->dma_address = dma_direct_map_page(dev, sg_page(sg), sg->offset, sg->length, dir, attrs); - if (sg->dma_address == DMA_MAPPING_ERROR) + if (sg->dma_address == DMA_MAPPING_ERROR) { + ret = -EIO; goto out_unmap; + } sg_dma_len(sg) = sg->length; } @@ -491,7 +522,7 @@ int dma_direct_map_sg(struct device *dev, struct scatterlist *sgl, int nents, out_unmap: dma_direct_unmap_sg(dev, sgl, i, dir, attrs | DMA_ATTR_SKIP_CPU_SYNC); - return -EIO; + return ret; } dma_addr_t dma_direct_map_resource(struct device *dev, phys_addr_t paddr, diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h index 4632b0f4f72e..a33152d79069 100644 --- a/kernel/dma/direct.h +++ b/kernel/dma/direct.h @@ -87,10 +87,15 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev, phys_addr_t phys = page_to_phys(page) + offset; dma_addr_t dma_addr = phys_to_dma(dev, phys); - if (is_swiotlb_force_bounce(dev)) + if (is_swiotlb_force_bounce(dev)) { + if (is_pci_p2pdma_page(page)) + return DMA_MAPPING_ERROR; return swiotlb_map(dev, phys, size, dir, attrs); + } if (unlikely(!dma_capable(dev, dma_addr, size, true))) { + if (is_pci_p2pdma_page(page)) + return DMA_MAPPING_ERROR; if (swiotlb_force != SWIOTLB_NO_FORCE) return swiotlb_map(dev, phys, size, dir, attrs); From patchwork Fri Jan 28 00:25:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727611 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2C636C433F5 for ; Fri, 28 Jan 2022 00:26:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344816AbiA1A0w (ORCPT ); Thu, 27 Jan 2022 19:26:52 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47192 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344609AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=lVFs012FiYfAIteDPcswFBDVkI4GWpBL3gjXvYfdIeM=; b=FjLyPEO07VIEgdk1M1VHg4N/Lw TiuHtSuS8jaAFifUe0IoNiB+0FF8DOWx6dkN/cNU8hNzU4r5utojED5jH+50Km0hrjnzDl6gtCHnO LZC4XIFiTY8O3bwJMdZud/U0XQcXnotiqL7RmeFLKoKE6yo6fskQml7+4vY4gFGNSxzT6tnMyJwFt qiUtcGd7ed4gkX6193Aa1nJwX1BqZKHJ3SipaYK+EvHqe7aeZGDcO5TmqV3CQc11P8Q7fAfNwcRQu NiE+0B6xXbsbWls/loZxm0fCYPoPfaMdlDvtVBEewdYOmAkL7x5rk0p0T114v0/4w+SUuBcM7BMCT IPQDBXXg==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5u-005Oca-UW; Thu, 27 Jan 2022 17:26:27 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5n-0001cd-Ek; Thu, 27 Jan 2022 17:26:19 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Jason Gunthorpe Date: Thu, 27 Jan 2022 17:25:59 -0700 Message-Id: <20220128002614.6136-10-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, jhubbard@nvidia.com, rcampbell@nvidia.com, jgg@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 09/24] dma-mapping: add flags to dma_map_ops to indicate PCI P2PDMA support X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Add a flags member to the dma_map_ops structure with one flag to indicate support for PCI P2PDMA. Also, add a helper to check if a device supports PCI P2PDMA. Signed-off-by: Logan Gunthorpe Reviewed-by: Jason Gunthorpe --- include/linux/dma-map-ops.h | 10 ++++++++++ include/linux/dma-mapping.h | 5 +++++ kernel/dma/mapping.c | 18 ++++++++++++++++++ 3 files changed, 33 insertions(+) diff --git a/include/linux/dma-map-ops.h b/include/linux/dma-map-ops.h index 752f91e5eb5d..4d4161d58ce0 100644 --- a/include/linux/dma-map-ops.h +++ b/include/linux/dma-map-ops.h @@ -11,7 +11,17 @@ struct cma; +/* + * Values for struct dma_map_ops.flags: + * + * DMA_F_PCI_P2PDMA_SUPPORTED: Indicates the dma_map_ops implementation can + * handle PCI P2PDMA pages in the map_sg/unmap_sg operation. + */ +#define DMA_F_PCI_P2PDMA_SUPPORTED (1 << 0) + struct dma_map_ops { + unsigned int flags; + void *(*alloc)(struct device *dev, size_t size, dma_addr_t *dma_handle, gfp_t gfp, unsigned long attrs); diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h index dca2b1355bb1..f7c61b2b4b5e 100644 --- a/include/linux/dma-mapping.h +++ b/include/linux/dma-mapping.h @@ -140,6 +140,7 @@ int dma_mmap_attrs(struct device *dev, struct vm_area_struct *vma, unsigned long attrs); bool dma_can_mmap(struct device *dev); int dma_supported(struct device *dev, u64 mask); +bool dma_pci_p2pdma_supported(struct device *dev); int dma_set_mask(struct device *dev, u64 mask); int dma_set_coherent_mask(struct device *dev, u64 mask); u64 dma_get_required_mask(struct device *dev); @@ -250,6 +251,10 @@ static inline int dma_supported(struct device *dev, u64 mask) { return 0; } +static inline bool dma_pci_p2pdma_supported(struct device *dev) +{ + return false; +} static inline int dma_set_mask(struct device *dev, u64 mask) { return -EIO; diff --git a/kernel/dma/mapping.c b/kernel/dma/mapping.c index c056a1468189..74858326ef94 100644 --- a/kernel/dma/mapping.c +++ b/kernel/dma/mapping.c @@ -724,6 +724,24 @@ int dma_supported(struct device *dev, u64 mask) } EXPORT_SYMBOL(dma_supported); +bool dma_pci_p2pdma_supported(struct device *dev) +{ + const struct dma_map_ops *ops = get_dma_ops(dev); + + /* if ops is not set, dma direct will be used which supports P2PDMA */ + if (!ops) + return true; + + /* + * Note: dma_ops_bypass is not checked here because P2PDMA should + * not be used with dma mapping ops that do not have support even + * if the specific device is bypassing them. + */ + + return ops->flags & DMA_F_PCI_P2PDMA_SUPPORTED; +} +EXPORT_SYMBOL_GPL(dma_pci_p2pdma_supported); + #ifdef CONFIG_ARCH_HAS_DMA_SET_MASK void arch_dma_set_mask(struct device *dev, u64 mask); #else From patchwork Fri Jan 28 00:26:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727610 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 76358C3526C for ; Fri, 28 Jan 2022 00:26:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344630AbiA1A0p (ORCPT ); Thu, 27 Jan 2022 19:26:45 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47154 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344600AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=zSy9mwfEIlK+jaDg9YmgnsaBF+WL751apuEme/37PWY=; b=riVg7sgV4OLRdKgU3OUfjU5CVW Ya4eBnygZ3h4W0wM9Zbf/k1feYOFn+nti56gTb+3wwsqjpdSyUhk724HpXyRsASDV0BYlzWliXhqQ yoRTk8jRD0JEUvBfsik7TUT/SM14N+tc1FLN6cXmk7Fg51NHv7rvJZl7l1/zwiNVq9sILCYEXI6CD gKu6piHINhnS9+89quJhi2zWVIYTmmK8wAW1nKP9sWmECIJrmgGNtEh0SnLnc0BL8H2RrRRk6ixLH GOxULsoi/boKOZhnA4UtlFGf9fAGq0MmmmSdn3pwmgbVRYkfgeTXCBv8aYy/smqqNtEs1VU1RZUpG HRFJBk3w==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5u-005OcV-VV; Thu, 27 Jan 2022 17:26:28 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5n-0001ch-KA; Thu, 27 Jan 2022 17:26:19 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Jason Gunthorpe Date: Thu, 27 Jan 2022 17:26:00 -0700 Message-Id: <20220128002614.6136-11-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, jhubbard@nvidia.com, rcampbell@nvidia.com, jgg@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 10/24] iommu/dma: support PCI P2PDMA pages in dma-iommu map_sg X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org When a PCI P2PDMA page is seen, set the IOVA length of the segment to zero so that it is not mapped into the IOVA. Then, in finalise_sg(), apply the appropriate bus address to the segment. The IOVA is not created if the scatterlist only consists of P2PDMA pages. A P2PDMA page may have three possible outcomes when being mapped: 1) If the data path between the two devices doesn't go through the root port, then it should be mapped with a PCI bus address 2) If the data path goes through the host bridge, it should be mapped normally with an IOMMU IOVA. 3) It is not possible for the two devices to communicate and thus the mapping operation should fail (and it will return -EREMOTEIO). Similar to dma-direct, the sg_dma_mark_pci_p2pdma() flag is used to indicate bus address segments. On unmap, P2PDMA segments are skipped over when determining the start and end IOVA addresses. With this change, the flags variable in the dma_map_ops is set to DMA_F_PCI_P2PDMA_SUPPORTED to indicate support for P2PDMA pages. Signed-off-by: Logan Gunthorpe Reviewed-by: Jason Gunthorpe --- drivers/iommu/dma-iommu.c | 67 +++++++++++++++++++++++++++++++++++---- 1 file changed, 60 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index d85d54f2b549..434e70105180 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -1043,6 +1043,16 @@ static int __finalise_sg(struct device *dev, struct scatterlist *sg, int nents, sg_dma_address(s) = DMA_MAPPING_ERROR; sg_dma_len(s) = 0; + if (is_pci_p2pdma_page(sg_page(s)) && !s_iova_len) { + if (i > 0) + cur = sg_next(cur); + + pci_p2pdma_map_bus_segment(s, cur); + count++; + cur_len = 0; + continue; + } + /* * Now fill in the real DMA data. If... * - there is a valid output segment to append to @@ -1139,6 +1149,8 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, struct iova_domain *iovad = &cookie->iovad; struct scatterlist *s, *prev = NULL; int prot = dma_info_to_prot(dir, dev_is_dma_coherent(dev), attrs); + struct dev_pagemap *pgmap = NULL; + enum pci_p2pdma_map_type map_type; dma_addr_t iova; size_t iova_len = 0; unsigned long mask = dma_get_seg_boundary(dev); @@ -1174,6 +1186,35 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, s_length = iova_align(iovad, s_length + s_iova_off); s->length = s_length; + if (is_pci_p2pdma_page(sg_page(s))) { + if (sg_page(s)->pgmap != pgmap) { + pgmap = sg_page(s)->pgmap; + map_type = pci_p2pdma_map_type(pgmap, dev); + } + + switch (map_type) { + case PCI_P2PDMA_MAP_BUS_ADDR: + /* + * A zero length will be ignored by + * iommu_map_sg() and then can be detected + * in __finalise_sg() to actually map the + * bus address. + */ + s->length = 0; + continue; + case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: + /* + * Mapping through host bridge should be + * mapped with regular IOVAs, thus we + * do nothing here and continue below. + */ + break; + default: + ret = -EREMOTEIO; + goto out_restore_sg; + } + } + /* * Due to the alignment of our single IOVA allocation, we can * depend on these assumptions about the segment boundary mask: @@ -1196,6 +1237,9 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, prev = s; } + if (!iova_len) + return __finalise_sg(dev, sg, nents, 0); + iova = iommu_dma_alloc_iova(domain, iova_len, dma_get_mask(dev), dev); if (!iova) { ret = -ENOMEM; @@ -1217,7 +1261,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, out_restore_sg: __invalidate_sg(sg, nents); out: - if (ret != -ENOMEM) + if (ret != -ENOMEM && ret != -EREMOTEIO) return -EINVAL; return ret; } @@ -1225,7 +1269,7 @@ static int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg, static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, int nents, enum dma_data_direction dir, unsigned long attrs) { - dma_addr_t start, end; + dma_addr_t end, start = DMA_MAPPING_ERROR; struct scatterlist *tmp; int i; @@ -1241,14 +1285,22 @@ static void iommu_dma_unmap_sg(struct device *dev, struct scatterlist *sg, * The scatterlist segments are mapped into a single * contiguous IOVA allocation, so this is incredibly easy. */ - start = sg_dma_address(sg); - for_each_sg(sg_next(sg), tmp, nents - 1, i) { + for_each_sg(sg, tmp, nents, i) { + if (sg_is_dma_bus_address(tmp)) { + sg_dma_unmark_bus_address(tmp); + continue; + } if (sg_dma_len(tmp) == 0) break; - sg = tmp; + + if (start == DMA_MAPPING_ERROR) + start = sg_dma_address(tmp); + + end = sg_dma_address(tmp) + sg_dma_len(tmp); } - end = sg_dma_address(sg) + sg_dma_len(sg); - __iommu_dma_unmap(dev, start, end - start); + + if (start != DMA_MAPPING_ERROR) + __iommu_dma_unmap(dev, start, end - start); } static dma_addr_t iommu_dma_map_resource(struct device *dev, phys_addr_t phys, @@ -1441,6 +1493,7 @@ static unsigned long iommu_dma_get_merge_boundary(struct device *dev) } static const struct dma_map_ops iommu_dma_ops = { + .flags = DMA_F_PCI_P2PDMA_SUPPORTED, .alloc = iommu_dma_alloc, .free = iommu_dma_free, .alloc_pages = dma_common_alloc_pages, From patchwork Fri Jan 28 00:26:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727615 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28725C46467 for ; Fri, 28 Jan 2022 00:26:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344750AbiA1A0x (ORCPT ); Thu, 27 Jan 2022 19:26:53 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47140 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344598AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=f6Gceh5S5jL/nBaUJCDzdn+tmuAbFKWfwLoFX7deOH8=; b=titcmnMGmGT6rJBuUoaxsU3FRv fhU9hatq+5bu8PuY1JxWWSy0DyHNJoJfbNg5UKzxTAf1nXRCVaZyimOoF9Lgvce1Z5kjdXoNEGpx+ RAiD7KwkuPazykuSDNcM7rEyD2ZAh4kEjpFz2jgXtnYe0j+N+E4l15cY1wMHugqfUAjEho91h6xXE dI8ctOdivU2niYrip3XS8jwbsdQBKak2n/A9z/YC32RWvd+CIOKsJJIjon/alW1KThIecaZH5VHGp Xf2+ddYB0UF82Dst8XOw73KuEaG31MOSgyzEeKV9XSNlGKM3l9I3qIulrLze8U1X6oENJe0l/yr4N hTwcqvVw==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5u-005OcY-3f; Thu, 27 Jan 2022 17:26:27 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5n-0001cl-QN; Thu, 27 Jan 2022 17:26:19 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Chaitanya Kulkarni Date: Thu, 27 Jan 2022 17:26:01 -0700 Message-Id: <20220128002614.6136-12-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, jhubbard@nvidia.com, rcampbell@nvidia.com, kch@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 11/24] nvme-pci: check DMA ops when indicating support for PCI P2PDMA X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Introduce a supports_pci_p2pdma() operation in nvme_ctrl_ops to replace the fixed NVME_F_PCI_P2PDMA flag such that the dma_map_ops flags can be checked for PCI P2PDMA support. Signed-off-by: Logan Gunthorpe Reviewed-by: Chaitanya Kulkarni --- drivers/nvme/host/core.c | 3 ++- drivers/nvme/host/nvme.h | 2 +- drivers/nvme/host/pci.c | 11 +++++++++-- 3 files changed, 12 insertions(+), 4 deletions(-) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 5e0bfda04bd7..ecb01984b55d 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3850,7 +3850,8 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, unsigned nsid, blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, ns->queue); blk_queue_flag_set(QUEUE_FLAG_NONROT, ns->queue); - if (ctrl->ops->flags & NVME_F_PCI_P2PDMA) + if (ctrl->ops->supports_pci_p2pdma && + ctrl->ops->supports_pci_p2pdma(ctrl)) blk_queue_flag_set(QUEUE_FLAG_PCI_P2PDMA, ns->queue); ns->ctrl = ctrl; diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index a162f6c6da6e..97efc6d0b146 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -486,7 +486,6 @@ struct nvme_ctrl_ops { unsigned int flags; #define NVME_F_FABRICS (1 << 0) #define NVME_F_METADATA_SUPPORTED (1 << 1) -#define NVME_F_PCI_P2PDMA (1 << 2) int (*reg_read32)(struct nvme_ctrl *ctrl, u32 off, u32 *val); int (*reg_write32)(struct nvme_ctrl *ctrl, u32 off, u32 val); int (*reg_read64)(struct nvme_ctrl *ctrl, u32 off, u64 *val); @@ -494,6 +493,7 @@ struct nvme_ctrl_ops { void (*submit_async_event)(struct nvme_ctrl *ctrl); void (*delete_ctrl)(struct nvme_ctrl *ctrl); int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size); + bool (*supports_pci_p2pdma)(struct nvme_ctrl *ctrl); }; /* diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index d8585df2c2fd..35e9b1e13d9f 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2965,17 +2965,24 @@ static int nvme_pci_get_address(struct nvme_ctrl *ctrl, char *buf, int size) return snprintf(buf, size, "%s\n", dev_name(&pdev->dev)); } +static bool nvme_pci_supports_pci_p2pdma(struct nvme_ctrl *ctrl) +{ + struct nvme_dev *dev = to_nvme_dev(ctrl); + + return dma_pci_p2pdma_supported(dev->dev); +} + static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = { .name = "pcie", .module = THIS_MODULE, - .flags = NVME_F_METADATA_SUPPORTED | - NVME_F_PCI_P2PDMA, + .flags = NVME_F_METADATA_SUPPORTED, .reg_read32 = nvme_pci_reg_read32, .reg_write32 = nvme_pci_reg_write32, .reg_read64 = nvme_pci_reg_read64, .free_ctrl = nvme_pci_free_ctrl, .submit_async_event = nvme_pci_submit_async_event, .get_address = nvme_pci_get_address, + .supports_pci_p2pdma = nvme_pci_supports_pci_p2pdma, }; static int nvme_dev_map(struct nvme_dev *dev) From patchwork Fri Jan 28 00:26:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727612 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9186FC3527C for ; Fri, 28 Jan 2022 00:26:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344805AbiA1A0v (ORCPT ); Thu, 27 Jan 2022 19:26:51 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47176 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344607AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=bknNiaxTRKCojqssTUG527pOjxKMAX8LUy+W9FwO/PY=; b=oNUy9ufdnfnYcAX619KDfJ+6yp NQX97tijue4Ha3ofYG+DvbU/hSlmRHKv+Q3S7pxRsT5amm6NGiz0TIYTVnbtfSYVQ3xO/ZHlC4XIw COTkFnwNdC8P1PXk/CoXTcSQyR5YugJoFRHrva1z4QlGh201OkfWvfuha20eyE93ob6XH5O8jV3oR xwij2OSgWC8gLbTOCpPihFnxRKRZRwi/K76ZDju8s6fQsIjvLjGRk3iUJn5ibrXYjz7jTOFuS4/9t Z1tupTv6UgmKLpi6mn4b7jEYWT6bGWSmNj8Z9x0g1RfieypLAiu/1nQ5Jnh6N+Xb+pKq+jCb6Vbnq bcJzmJow==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5u-005OcX-1K; Thu, 27 Jan 2022 17:26:27 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5n-0001cp-W7; Thu, 27 Jan 2022 17:26:20 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Max Gurtovoy Date: Thu, 27 Jan 2022 17:26:02 -0700 Message-Id: <20220128002614.6136-13-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, jhubbard@nvidia.com, rcampbell@nvidia.com, mgurtovoy@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 12/24] nvme-pci: convert to using dma_map_sgtable() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org The dma_map operations now support P2PDMA pages directly. So remove the calls to pci_p2pdma_[un]map_sg_attrs() and replace them with calls to dma_map_sgtable(). dma_map_sgtable() returns more complete error codes than dma_map_sg() and allows differentiating EREMOTEIO errors in case an unsupported P2PDMA transfer is requested. When this happens, return BLK_STS_TARGET so the request isn't retried. Signed-off-by: Logan Gunthorpe Reviewed-by: Max Gurtovoy Reviewed-by: Chaitanya Kulkarni --- drivers/nvme/host/pci.c | 69 +++++++++++++++++------------------------ 1 file changed, 29 insertions(+), 40 deletions(-) diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 35e9b1e13d9f..330515886dfc 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -229,11 +229,10 @@ struct nvme_iod { bool use_sgl; int aborted; int npages; /* In the PRP list. 0 means small pool in use */ - int nents; /* Used in scatterlist */ dma_addr_t first_dma; unsigned int dma_len; /* length of single DMA segment mapping */ dma_addr_t meta_dma; - struct scatterlist *sg; + struct sg_table sgt; }; static inline unsigned int nvme_dbbuf_size(struct nvme_dev *dev) @@ -522,7 +521,7 @@ static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx) static void **nvme_pci_iod_list(struct request *req) { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); - return (void **)(iod->sg + blk_rq_nr_phys_segments(req)); + return (void **)(iod->sgt.sgl + blk_rq_nr_phys_segments(req)); } static inline bool nvme_pci_use_sgls(struct nvme_dev *dev, struct request *req) @@ -574,17 +573,6 @@ static void nvme_free_sgls(struct nvme_dev *dev, struct request *req) } } -static void nvme_unmap_sg(struct nvme_dev *dev, struct request *req) -{ - struct nvme_iod *iod = blk_mq_rq_to_pdu(req); - - if (is_pci_p2pdma_page(sg_page(iod->sg))) - pci_p2pdma_unmap_sg(dev->dev, iod->sg, iod->nents, - rq_dma_dir(req)); - else - dma_unmap_sg(dev->dev, iod->sg, iod->nents, rq_dma_dir(req)); -} - static void nvme_unmap_data(struct nvme_dev *dev, struct request *req) { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); @@ -595,9 +583,10 @@ static void nvme_unmap_data(struct nvme_dev *dev, struct request *req) return; } - WARN_ON_ONCE(!iod->nents); + WARN_ON_ONCE(!iod->sgt.nents); + + dma_unmap_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req), 0); - nvme_unmap_sg(dev, req); if (iod->npages == 0) dma_pool_free(dev->prp_small_pool, nvme_pci_iod_list(req)[0], iod->first_dma); @@ -605,7 +594,7 @@ static void nvme_unmap_data(struct nvme_dev *dev, struct request *req) nvme_free_sgls(dev, req); else nvme_free_prps(dev, req); - mempool_free(iod->sg, dev->iod_mempool); + mempool_free(iod->sgt.sgl, dev->iod_mempool); } static void nvme_print_sgl(struct scatterlist *sgl, int nents) @@ -628,7 +617,7 @@ static blk_status_t nvme_pci_setup_prps(struct nvme_dev *dev, struct nvme_iod *iod = blk_mq_rq_to_pdu(req); struct dma_pool *pool; int length = blk_rq_payload_bytes(req); - struct scatterlist *sg = iod->sg; + struct scatterlist *sg = iod->sgt.sgl; int dma_len = sg_dma_len(sg); u64 dma_addr = sg_dma_address(sg); int offset = dma_addr & (NVME_CTRL_PAGE_SIZE - 1); @@ -701,16 +690,16 @@ static blk_status_t nvme_pci_setup_prps(struct nvme_dev *dev, dma_len = sg_dma_len(sg); } done: - cmnd->dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sg)); + cmnd->dptr.prp1 = cpu_to_le64(sg_dma_address(iod->sgt.sgl)); cmnd->dptr.prp2 = cpu_to_le64(iod->first_dma); return BLK_STS_OK; free_prps: nvme_free_prps(dev, req); return BLK_STS_RESOURCE; bad_sgl: - WARN(DO_ONCE(nvme_print_sgl, iod->sg, iod->nents), + WARN(DO_ONCE(nvme_print_sgl, iod->sgt.sgl, iod->sgt.nents), "Invalid SGL for payload:%d nents:%d\n", - blk_rq_payload_bytes(req), iod->nents); + blk_rq_payload_bytes(req), iod->sgt.nents); return BLK_STS_IOERR; } @@ -736,12 +725,13 @@ static void nvme_pci_sgl_set_seg(struct nvme_sgl_desc *sge, } static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev, - struct request *req, struct nvme_rw_command *cmd, int entries) + struct request *req, struct nvme_rw_command *cmd) { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); struct dma_pool *pool; struct nvme_sgl_desc *sg_list; - struct scatterlist *sg = iod->sg; + struct scatterlist *sg = iod->sgt.sgl; + unsigned int entries = iod->sgt.nents; dma_addr_t sgl_dma; int i = 0; @@ -839,7 +829,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req, { struct nvme_iod *iod = blk_mq_rq_to_pdu(req); blk_status_t ret = BLK_STS_RESOURCE; - int nr_mapped; + int rc; if (blk_rq_nr_phys_segments(req) == 1) { struct bio_vec bv = req_bvec(req); @@ -857,26 +847,25 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req, } iod->dma_len = 0; - iod->sg = mempool_alloc(dev->iod_mempool, GFP_ATOMIC); - if (!iod->sg) + iod->sgt.sgl = mempool_alloc(dev->iod_mempool, GFP_ATOMIC); + if (!iod->sgt.sgl) return BLK_STS_RESOURCE; - sg_init_table(iod->sg, blk_rq_nr_phys_segments(req)); - iod->nents = blk_rq_map_sg(req->q, req, iod->sg); - if (!iod->nents) + sg_init_table(iod->sgt.sgl, blk_rq_nr_phys_segments(req)); + iod->sgt.orig_nents = blk_rq_map_sg(req->q, req, iod->sgt.sgl); + if (!iod->sgt.orig_nents) goto out_free_sg; - if (is_pci_p2pdma_page(sg_page(iod->sg))) - nr_mapped = pci_p2pdma_map_sg_attrs(dev->dev, iod->sg, - iod->nents, rq_dma_dir(req), DMA_ATTR_NO_WARN); - else - nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents, - rq_dma_dir(req), DMA_ATTR_NO_WARN); - if (!nr_mapped) + rc = dma_map_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req), + DMA_ATTR_NO_WARN); + if (rc) { + if (rc == -EREMOTEIO) + ret = BLK_STS_TARGET; goto out_free_sg; + } iod->use_sgl = nvme_pci_use_sgls(dev, req); if (iod->use_sgl) - ret = nvme_pci_setup_sgls(dev, req, &cmnd->rw, nr_mapped); + ret = nvme_pci_setup_sgls(dev, req, &cmnd->rw); else ret = nvme_pci_setup_prps(dev, req, &cmnd->rw); if (ret != BLK_STS_OK) @@ -884,9 +873,9 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req, return BLK_STS_OK; out_unmap_sg: - nvme_unmap_sg(dev, req); + dma_unmap_sgtable(dev->dev, &iod->sgt, rq_dma_dir(req), 0); out_free_sg: - mempool_free(iod->sg, dev->iod_mempool); + mempool_free(iod->sgt.sgl, dev->iod_mempool); return ret; } @@ -910,7 +899,7 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req) iod->aborted = 0; iod->npages = -1; - iod->nents = 0; + iod->sgt.nents = 0; ret = nvme_setup_cmd(req->q->queuedata, req); if (ret) From patchwork Fri Jan 28 00:26:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727617 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BFA27C47080 for ; Fri, 28 Jan 2022 00:26:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344851AbiA1A04 (ORCPT ); Thu, 27 Jan 2022 19:26:56 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47160 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344602AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=SthFkrlch6Gbtf7JBQRiLJWdMmfUwEHz21EJfjIo+fM=; b=lVrUYYfTM6XGioXpaDhU8TWgMB vgyWwjfvd+30vJr+Otba/J2QVHspa3Wu4+iov+2m7rQKOg5weIdMqzazQUbzDHYz/fRPbobbTjO5b UNyFiauFRkM9u2G++t+v4M1vLtkmowFWMS9373R6QZsbG3PhE5YTIr3/j4GDVPVe3OLyDlBMdmJqN mFzzocv2QiaF5iml93QU7vLdFzdslWKLSEW57Lr82qElnNWvYeCTRV0ZIo+wIg2pZ9fvKyNQPfiOM XaVP54CtEyu7JavZqVHJGtJIBrobriUCTzscqeKhItJJolN6bghr2ERmQE1rWPH+JQMp6U7Sa+gRR IfM3RwOQ==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5t-005OcZ-MY; Thu, 27 Jan 2022 17:26:26 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5o-0001ct-6H; Thu, 27 Jan 2022 17:26:20 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Jason Gunthorpe , Max Gurtovoy Date: Thu, 27 Jan 2022 17:26:03 -0700 Message-Id: <20220128002614.6136-14-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, jhubbard@nvidia.com, rcampbell@nvidia.com, jgg@nvidia.com, mgurtovoy@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 13/24] RDMA/core: introduce ib_dma_pci_p2p_dma_supported() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Introduce the helper function ib_dma_pci_p2p_dma_supported() to check if a given ib_device can be used in P2PDMA transfers. This ensures the ib_device is not using virt_dma and also that the underlying dma_device supports P2PDMA. Use the new helper in nvme-rdma to replace the existing check for ib_uses_virt_dma(). Adding the dma_pci_p2pdma_supported() check allows switching away from pci_p2pdma_[un]map_sg(). Signed-off-by: Logan Gunthorpe Reviewed-by: Jason Gunthorpe Reviewed-by: Max Gurtovoy --- drivers/nvme/target/rdma.c | 2 +- include/rdma/ib_verbs.h | 11 +++++++++++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c index 1deb4043e242..22519739a874 100644 --- a/drivers/nvme/target/rdma.c +++ b/drivers/nvme/target/rdma.c @@ -415,7 +415,7 @@ static int nvmet_rdma_alloc_rsp(struct nvmet_rdma_device *ndev, if (ib_dma_mapping_error(ndev->device, r->send_sge.addr)) goto out_free_rsp; - if (!ib_uses_virt_dma(ndev->device)) + if (ib_dma_pci_p2p_dma_supported(ndev->device)) r->req.p2p_client = &ndev->device->dev; r->send_sge.length = sizeof(*r->req.cqe); r->send_sge.lkey = ndev->pd->local_dma_lkey; diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index 69d883f7fb41..79609ab73014 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -4003,6 +4003,17 @@ static inline bool ib_uses_virt_dma(struct ib_device *dev) return IS_ENABLED(CONFIG_INFINIBAND_VIRT_DMA) && !dev->dma_device; } +/* + * Check if a IB device's underlying DMA mapping supports P2PDMA transfers. + */ +static inline bool ib_dma_pci_p2p_dma_supported(struct ib_device *dev) +{ + if (ib_uses_virt_dma(dev)) + return false; + + return dma_pci_p2pdma_supported(dev->dma_device); +} + /** * ib_dma_mapping_error - check a DMA addr for error * @dev: The device for which the dma_addr was created From patchwork Fri Jan 28 00:26:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727606 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 844DFC41535 for ; Fri, 28 Jan 2022 00:26:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344694AbiA1A0j (ORCPT ); Thu, 27 Jan 2022 19:26:39 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47158 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344599AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=l3AgzlVFISW/9ucMB3Q/AoBaOHeVal7cwU+QcbWrG+0=; b=GHKRyVjtVmbSOmEi2k8PR+Md59 Kpz13qxphM7mdsMjmKSbaRcW0nzpUU20WsV1rOF1sA1CTsKNYE/r7ke6A8E0qUApKCKl+OHU3IJ4m PzKjL1nt/5lt7BQslZNPQ1gDmvxNnuMfunw4UfU7Xg1u/Rv/I3U947R9VknuNUH6pkUJssCPBCnEu xnOOgMTPCvAvpXzoChdfkBXahu6Zbykb2wsKeTSmudeQ+Yl64E9ciK6PfXxZNEYTWIMbAQCYtmjth hqunlqfxUc4cQCx+UeY3Pf/0pF9k+NdMgSpWLJfTOsl7OtuJebfiX1DbjBB5kjFF+tG1T839oylVO pXUyCMKw==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5t-005OcV-Gh; Thu, 27 Jan 2022 17:26:26 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5o-0001cx-Bg; Thu, 27 Jan 2022 17:26:20 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Jason Gunthorpe Date: Thu, 27 Jan 2022 17:26:04 -0700 Message-Id: <20220128002614.6136-15-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, jhubbard@nvidia.com, rcampbell@nvidia.com, jgg@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 14/24] RDMA/rw: drop pci_p2pdma_[un]map_sg() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org dma_map_sg() now supports the use of P2PDMA pages so pci_p2pdma_map_sg() is no longer necessary and may be dropped. This means the rdma_rw_[un]map_sg() helpers are no longer necessary. Remove it all. Signed-off-by: Logan Gunthorpe Reviewed-by: Jason Gunthorpe --- drivers/infiniband/core/rw.c | 45 ++++++++---------------------------- 1 file changed, 9 insertions(+), 36 deletions(-) diff --git a/drivers/infiniband/core/rw.c b/drivers/infiniband/core/rw.c index 5a3bd41b331c..d4517b68d1ca 100644 --- a/drivers/infiniband/core/rw.c +++ b/drivers/infiniband/core/rw.c @@ -273,33 +273,6 @@ static int rdma_rw_init_single_wr(struct rdma_rw_ctx *ctx, struct ib_qp *qp, return 1; } -static void rdma_rw_unmap_sg(struct ib_device *dev, struct scatterlist *sg, - u32 sg_cnt, enum dma_data_direction dir) -{ - if (is_pci_p2pdma_page(sg_page(sg))) - pci_p2pdma_unmap_sg(dev->dma_device, sg, sg_cnt, dir); - else - ib_dma_unmap_sg(dev, sg, sg_cnt, dir); -} - -static int rdma_rw_map_sgtable(struct ib_device *dev, struct sg_table *sgt, - enum dma_data_direction dir) -{ - int nents; - - if (is_pci_p2pdma_page(sg_page(sgt->sgl))) { - if (WARN_ON_ONCE(ib_uses_virt_dma(dev))) - return 0; - nents = pci_p2pdma_map_sg(dev->dma_device, sgt->sgl, - sgt->orig_nents, dir); - if (!nents) - return -EIO; - sgt->nents = nents; - return 0; - } - return ib_dma_map_sgtable_attrs(dev, sgt, dir, 0); -} - /** * rdma_rw_ctx_init - initialize a RDMA READ/WRITE context * @ctx: context to initialize @@ -326,7 +299,7 @@ int rdma_rw_ctx_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num, }; int ret; - ret = rdma_rw_map_sgtable(dev, &sgt, dir); + ret = ib_dma_map_sgtable_attrs(dev, &sgt, dir, 0); if (ret) return ret; sg_cnt = sgt.nents; @@ -365,7 +338,7 @@ int rdma_rw_ctx_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, u32 port_num, return ret; out_unmap_sg: - rdma_rw_unmap_sg(dev, sgt.sgl, sgt.orig_nents, dir); + ib_dma_unmap_sgtable_attrs(dev, &sgt, dir, 0); return ret; } EXPORT_SYMBOL(rdma_rw_ctx_init); @@ -413,12 +386,12 @@ int rdma_rw_ctx_signature_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, return -EINVAL; } - ret = rdma_rw_map_sgtable(dev, &sgt, dir); + ret = ib_dma_map_sgtable_attrs(dev, &sgt, dir, 0); if (ret) return ret; if (prot_sg_cnt) { - ret = rdma_rw_map_sgtable(dev, &prot_sgt, dir); + ret = ib_dma_map_sgtable_attrs(dev, &prot_sgt, dir, 0); if (ret) goto out_unmap_sg; } @@ -485,9 +458,9 @@ int rdma_rw_ctx_signature_init(struct rdma_rw_ctx *ctx, struct ib_qp *qp, kfree(ctx->reg); out_unmap_prot_sg: if (prot_sgt.nents) - rdma_rw_unmap_sg(dev, prot_sgt.sgl, prot_sgt.orig_nents, dir); + ib_dma_unmap_sgtable_attrs(dev, &prot_sgt, dir, 0); out_unmap_sg: - rdma_rw_unmap_sg(dev, sgt.sgl, sgt.orig_nents, dir); + ib_dma_unmap_sgtable_attrs(dev, &sgt, dir, 0); return ret; } EXPORT_SYMBOL(rdma_rw_ctx_signature_init); @@ -620,7 +593,7 @@ void rdma_rw_ctx_destroy(struct rdma_rw_ctx *ctx, struct ib_qp *qp, break; } - rdma_rw_unmap_sg(qp->pd->device, sg, sg_cnt, dir); + ib_dma_unmap_sg(qp->pd->device, sg, sg_cnt, dir); } EXPORT_SYMBOL(rdma_rw_ctx_destroy); @@ -648,8 +621,8 @@ void rdma_rw_ctx_destroy_signature(struct rdma_rw_ctx *ctx, struct ib_qp *qp, kfree(ctx->reg); if (prot_sg_cnt) - rdma_rw_unmap_sg(qp->pd->device, prot_sg, prot_sg_cnt, dir); - rdma_rw_unmap_sg(qp->pd->device, sg, sg_cnt, dir); + ib_dma_unmap_sg(qp->pd->device, prot_sg, prot_sg_cnt, dir); + ib_dma_unmap_sg(qp->pd->device, sg, sg_cnt, dir); } EXPORT_SYMBOL(rdma_rw_ctx_destroy_signature); From patchwork Fri Jan 28 00:26:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727607 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 07F76C3527B for ; Fri, 28 Jan 2022 00:26:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344710AbiA1A0k (ORCPT ); Thu, 27 Jan 2022 19:26:40 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47212 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344617AbiA1A0f (ORCPT ); Thu, 27 Jan 2022 19:26:35 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=zN5ussKdK8rGe+8Rg1RjZA3LfU+HxpiV6i9+iy1gYwI=; b=JHb2sZwS3A32e7amabmVHJP6ni EM2MuyjnjcVSQi3Ge9tvmK/htrNqzw91M2/m3nQ0uHQFBdGqwMsaqy1XTX7hOwADCobDISD8T0BTE +szOJHzVtL+HbmFDfBo6ES1CO2C3zLlLAmQUEBwt9Falio9RLopO7N6kagpASGnrgOIxetUBWi2uv nEbAXrp0oh0PcFH/dXouma2mH5TjrxQNRCknYeHtTjRxuCSL9jxJxadFZn8w0rqWWiSg/WsnLamR/ kjZehAIDKxxVooHSJdbvnCVGeRpNI1NeGV4OigtGuYn5lycJ1lwpQB4wE7WF7hnWqXmZiSosRjq0n Qagmgp5g==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5t-005Oca-AT; Thu, 27 Jan 2022 17:26:26 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5o-0001d1-IU; Thu, 27 Jan 2022 17:26:20 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Bjorn Helgaas , Jason Gunthorpe , Max Gurtovoy Date: Thu, 27 Jan 2022 17:26:05 -0700 Message-Id: <20220128002614.6136-16-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, logang@deltatee.com, bhelgaas@google.com, jhubbard@nvidia.com, rcampbell@nvidia.com, jgg@nvidia.com, mgurtovoy@nvidia.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 15/24] PCI/P2PDMA: Remove pci_p2pdma_[un]map_sg() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org This interface is superseded by support in dma_map_sg() which now supports heterogeneous scatterlists. There are no longer any users, so remove it. Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas Reviewed-by: Jason Gunthorpe Reviewed-by: Max Gurtovoy --- drivers/pci/p2pdma.c | 66 -------------------------------------- include/linux/pci-p2pdma.h | 27 ---------------- 2 files changed, 93 deletions(-) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 2b2cf00664e7..e66694cc9e14 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -879,72 +879,6 @@ enum pci_p2pdma_map_type pci_p2pdma_map_type(struct dev_pagemap *pgmap, return type; } -static int __pci_p2pdma_map_sg(struct pci_p2pdma_pagemap *p2p_pgmap, - struct device *dev, struct scatterlist *sg, int nents) -{ - struct scatterlist *s; - int i; - - for_each_sg(sg, s, nents, i) { - s->dma_address = sg_phys(s) + p2p_pgmap->bus_offset; - sg_dma_len(s) = s->length; - } - - return nents; -} - -/** - * pci_p2pdma_map_sg_attrs - map a PCI peer-to-peer scatterlist for DMA - * @dev: device doing the DMA request - * @sg: scatter list to map - * @nents: elements in the scatterlist - * @dir: DMA direction - * @attrs: DMA attributes passed to dma_map_sg() (if called) - * - * Scatterlists mapped with this function should be unmapped using - * pci_p2pdma_unmap_sg_attrs(). - * - * Returns the number of SG entries mapped or 0 on error. - */ -int pci_p2pdma_map_sg_attrs(struct device *dev, struct scatterlist *sg, - int nents, enum dma_data_direction dir, unsigned long attrs) -{ - struct pci_p2pdma_pagemap *p2p_pgmap = - to_p2p_pgmap(sg_page(sg)->pgmap); - - switch (pci_p2pdma_map_type(sg_page(sg)->pgmap, dev)) { - case PCI_P2PDMA_MAP_THRU_HOST_BRIDGE: - return dma_map_sg_attrs(dev, sg, nents, dir, attrs); - case PCI_P2PDMA_MAP_BUS_ADDR: - return __pci_p2pdma_map_sg(p2p_pgmap, dev, sg, nents); - default: - /* Mapping is not Supported */ - return 0; - } -} -EXPORT_SYMBOL_GPL(pci_p2pdma_map_sg_attrs); - -/** - * pci_p2pdma_unmap_sg_attrs - unmap a PCI peer-to-peer scatterlist that was - * mapped with pci_p2pdma_map_sg() - * @dev: device doing the DMA request - * @sg: scatter list to map - * @nents: number of elements returned by pci_p2pdma_map_sg() - * @dir: DMA direction - * @attrs: DMA attributes passed to dma_unmap_sg() (if called) - */ -void pci_p2pdma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg, - int nents, enum dma_data_direction dir, unsigned long attrs) -{ - enum pci_p2pdma_map_type map_type; - - map_type = pci_p2pdma_map_type(sg_page(sg)->pgmap, dev); - - if (map_type == PCI_P2PDMA_MAP_THRU_HOST_BRIDGE) - dma_unmap_sg_attrs(dev, sg, nents, dir, attrs); -} -EXPORT_SYMBOL_GPL(pci_p2pdma_unmap_sg_attrs); - /** * pci_p2pdma_map_segment - map an sg segment determining the mapping type * @state: State structure that should be declared outside of the for_each_sg() diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h index 8318a97c9c61..2c07aa6b7665 100644 --- a/include/linux/pci-p2pdma.h +++ b/include/linux/pci-p2pdma.h @@ -30,10 +30,6 @@ struct scatterlist *pci_p2pmem_alloc_sgl(struct pci_dev *pdev, unsigned int *nents, u32 length); void pci_p2pmem_free_sgl(struct pci_dev *pdev, struct scatterlist *sgl); void pci_p2pmem_publish(struct pci_dev *pdev, bool publish); -int pci_p2pdma_map_sg_attrs(struct device *dev, struct scatterlist *sg, - int nents, enum dma_data_direction dir, unsigned long attrs); -void pci_p2pdma_unmap_sg_attrs(struct device *dev, struct scatterlist *sg, - int nents, enum dma_data_direction dir, unsigned long attrs); int pci_p2pdma_enable_store(const char *page, struct pci_dev **p2p_dev, bool *use_p2pdma); ssize_t pci_p2pdma_enable_show(char *page, struct pci_dev *p2p_dev, @@ -83,17 +79,6 @@ static inline void pci_p2pmem_free_sgl(struct pci_dev *pdev, static inline void pci_p2pmem_publish(struct pci_dev *pdev, bool publish) { } -static inline int pci_p2pdma_map_sg_attrs(struct device *dev, - struct scatterlist *sg, int nents, enum dma_data_direction dir, - unsigned long attrs) -{ - return 0; -} -static inline void pci_p2pdma_unmap_sg_attrs(struct device *dev, - struct scatterlist *sg, int nents, enum dma_data_direction dir, - unsigned long attrs) -{ -} static inline int pci_p2pdma_enable_store(const char *page, struct pci_dev **p2p_dev, bool *use_p2pdma) { @@ -119,16 +104,4 @@ static inline struct pci_dev *pci_p2pmem_find(struct device *client) return pci_p2pmem_find_many(&client, 1); } -static inline int pci_p2pdma_map_sg(struct device *dev, struct scatterlist *sg, - int nents, enum dma_data_direction dir) -{ - return pci_p2pdma_map_sg_attrs(dev, sg, nents, dir, 0); -} - -static inline void pci_p2pdma_unmap_sg(struct device *dev, - struct scatterlist *sg, int nents, enum dma_data_direction dir) -{ - pci_p2pdma_unmap_sg_attrs(dev, sg, nents, dir, 0); -} - #endif /* _LINUX_PCI_P2P_H */ From patchwork Fri Jan 28 00:26:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727614 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3F367C3527A for ; Fri, 28 Jan 2022 00:26:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344717AbiA1A0x (ORCPT ); Thu, 27 Jan 2022 19:26:53 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47152 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344601AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=uQsXlm3k3BsGKUE9AcXsDkkM/IeKE+SiwnMMrQUs4ow=; b=hrw+tXZiDf1t/jPl6BGYffAfg1 TSx6zkydAWPh2OjlZjMW/EoVppTVL005Je5Cu2rpgeTkXKCISo7BFPQ4pu1J6XnWW5qlfmqXDsKuS KeoGYTx/Q1Asl7/aWcq2oinFsubmteSVEHCswc59bbr835y6HjmmhKg7EKhFacOMfebxAnjhG/HKu 9K10iU35orwOCvMgfRbBfQm3RjZkJ7DrKHpEECxR8bjB3oYykPqhhWXweUVShn1Mm8XlLfOr3wZTF py/3HQCUszOAEpqeDW4TmGoUIaQFiEA5ueu+ovczMAWThSaldWB7f8w5eqaTpHfQcii5CF7oOlLYb tAyP/nGQ==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5t-005OcY-1M; Thu, 27 Jan 2022 17:26:25 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5o-0001d5-PB; Thu, 27 Jan 2022 17:26:20 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:26:06 -0700 Message-Id: <20220128002614.6136-17-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 16/24] mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Callers that expect PCI P2PDMA pages can now set FOLL_PCI_P2PDMA to allow obtaining P2PDMA pages. If a caller does not set this flag and tries to map P2PDMA pages it will fail. This is implemented by checking failing if PCI p2pdma pages are found when FOLL_PCI_P2PDMA is set. This is only done if pte_devmap() is set. FOLL_PCI_P2PDMA cannot be set if FOLL_LONGTERM is set. Signed-off-by: Logan Gunthorpe --- include/linux/mm.h | 1 + mm/gup.c | 22 +++++++++++++++++++++- 2 files changed, 22 insertions(+), 1 deletion(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 1ab20ed73678..24f44230dcbf 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2888,6 +2888,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address, #define FOLL_SPLIT_PMD 0x20000 /* split huge pmd before returning */ #define FOLL_PIN 0x40000 /* pages must be released via unpin_user_page */ #define FOLL_FAST_ONLY 0x80000 /* gup_fast: prevent fall-back to slow gup */ +#define FOLL_PCI_P2PDMA 0x100000 /* allow returning PCI P2PDMA pages */ /* * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each diff --git a/mm/gup.c b/mm/gup.c index f0af462ac1e2..66e8cbd168b6 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -527,6 +527,12 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, page = pte_page(pte); else goto no_page; + + if (unlikely(!(flags & FOLL_PCI_P2PDMA) && + is_pci_p2pdma_page(page))) { + page = ERR_PTR(-EREMOTEIO); + goto out; + } } else if (unlikely(!page)) { if (flags & FOLL_DUMP) { /* Avoid special (like zero) pages in core dumps */ @@ -985,6 +991,9 @@ static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags) if ((gup_flags & FOLL_LONGTERM) && vma_is_fsdax(vma)) return -EOPNOTSUPP; + if ((gup_flags & FOLL_LONGTERM) && (gup_flags & FOLL_PCI_P2PDMA)) + return -EOPNOTSUPP; + if (vma_is_secretmem(vma)) return -EFAULT; @@ -2304,6 +2313,10 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end, VM_BUG_ON(!pfn_valid(pte_pfn(pte))); page = pte_page(pte); + if (unlikely(pte_devmap(pte) && !(flags & FOLL_PCI_P2PDMA) && + is_pci_p2pdma_page(page))) + goto pte_unmap; + head = try_grab_compound_head(page, 1, flags); if (!head) goto pte_unmap; @@ -2381,6 +2394,12 @@ static int __gup_device_huge(unsigned long pfn, unsigned long addr, undo_dev_pagemap(nr, nr_start, flags, pages); break; } + + if (!(flags & FOLL_PCI_P2PDMA) && is_pci_p2pdma_page(page)) { + undo_dev_pagemap(nr, nr_start, flags, pages); + break; + } + SetPageReferenced(page); pages[*nr] = page; if (unlikely(!try_grab_page(page, flags))) { @@ -2849,7 +2868,8 @@ static int internal_get_user_pages_fast(unsigned long start, if (WARN_ON_ONCE(gup_flags & ~(FOLL_WRITE | FOLL_LONGTERM | FOLL_FORCE | FOLL_PIN | FOLL_GET | - FOLL_FAST_ONLY | FOLL_NOFAULT))) + FOLL_FAST_ONLY | FOLL_NOFAULT | + FOLL_PCI_P2PDMA))) return -EINVAL; if (gup_flags & FOLL_PIN) From patchwork Fri Jan 28 00:26:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727613 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB4DBC4167E for ; Fri, 28 Jan 2022 00:26:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344818AbiA1A0w (ORCPT ); Thu, 27 Jan 2022 19:26:52 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47188 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344610AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=YF+tFaHkg5cUqirL/bZSIqfFWw9UNFd41dtgFgGRixo=; b=GNd2ovXaYfT4BSta3Z5NMwqhh8 csryim3gluAjras3z5p36hM6G0N09HwPD2rF3dR2DRlJgtCrsK0kKUXEymF0HvgEdo2WCUQ4mHOkV KLQ6ROa/3DtauBHOBEIK9Llvudy6cQbheCk42dRYSeedq1yKQuUcYgynftDp8Zf5Jo8IRRngxqnSH bQ+/UMtoo0T9GDimp7PbaBoli3sJdx8RHQKlnQOThAMXQX4srV6EL+eZGlikyp/Kcw0csVinKlvjf GK0c6mMDsIw5YdrdhuYvUwaS7ExirDnRDA1ig9/X2wNRbGVJozznHYOICl09Zd6gdvcDbJBWAEqJV MT0c1AJw==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5s-005OcX-LP; Thu, 27 Jan 2022 17:26:25 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5p-0001d9-1W; Thu, 27 Jan 2022 17:26:21 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:26:07 -0700 Message-Id: <20220128002614.6136-18-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 17/24] iov_iter: introduce iov_iter_get_pages_[alloc_]flags() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Add iov_iter_get_pages_flags() and iov_iter_get_pages_alloc_flags() which take a flags argument that is passed to get_user_pages_fast(). This is so that FOLL_PCI_P2PDMA can be passed when appropriate. Signed-off-by: Logan Gunthorpe --- include/linux/uio.h | 6 ++++++ lib/iov_iter.c | 25 +++++++++++++++++++------ 2 files changed, 25 insertions(+), 6 deletions(-) diff --git a/include/linux/uio.h b/include/linux/uio.h index 1198a2bfc9bf..22cf1db3a6c5 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -232,8 +232,14 @@ void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count); void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray *xarray, loff_t start, size_t count); +ssize_t iov_iter_get_pages_flags(struct iov_iter *i, struct page **pages, + size_t maxsize, unsigned maxpages, size_t *start, + unsigned int gup_flags); ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages, size_t maxsize, unsigned maxpages, size_t *start); +ssize_t iov_iter_get_pages_alloc_flags(struct iov_iter *i, + struct page ***pages, size_t maxsize, size_t *start, + unsigned int gup_flags); ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages, size_t maxsize, size_t *start); int iov_iter_npages(const struct iov_iter *i, int maxpages); diff --git a/lib/iov_iter.c b/lib/iov_iter.c index b0e0acdf96c1..9eeea74f85a7 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -1513,9 +1513,9 @@ static struct page *first_bvec_segment(const struct iov_iter *i, return page; } -ssize_t iov_iter_get_pages(struct iov_iter *i, +ssize_t iov_iter_get_pages_flags(struct iov_iter *i, struct page **pages, size_t maxsize, unsigned maxpages, - size_t *start) + size_t *start, unsigned int gup_flags) { size_t len; int n, res; @@ -1526,7 +1526,6 @@ ssize_t iov_iter_get_pages(struct iov_iter *i, return 0; if (likely(iter_is_iovec(i))) { - unsigned int gup_flags = 0; unsigned long addr; if (iov_iter_rw(i) != WRITE) @@ -1556,6 +1555,13 @@ ssize_t iov_iter_get_pages(struct iov_iter *i, return iter_xarray_get_pages(i, pages, maxsize, maxpages, start); return -EFAULT; } +EXPORT_SYMBOL_GPL(iov_iter_get_pages_flags); + +ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages, + size_t maxsize, unsigned maxpages, size_t *start) +{ + return iov_iter_get_pages_flags(i, pages, maxsize, maxpages, start, 0); +} EXPORT_SYMBOL(iov_iter_get_pages); static struct page **get_pages_array(size_t n) @@ -1638,9 +1644,9 @@ static ssize_t iter_xarray_get_pages_alloc(struct iov_iter *i, return actual; } -ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, +ssize_t iov_iter_get_pages_alloc_flags(struct iov_iter *i, struct page ***pages, size_t maxsize, - size_t *start) + size_t *start, unsigned int gup_flags) { struct page **p; size_t len; @@ -1652,7 +1658,6 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, return 0; if (likely(iter_is_iovec(i))) { - unsigned int gup_flags = 0; unsigned long addr; if (iov_iter_rw(i) != WRITE) @@ -1665,6 +1670,7 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, p = get_pages_array(n); if (!p) return -ENOMEM; + res = get_user_pages_fast(addr, n, gup_flags, p); if (unlikely(res <= 0)) { kvfree(p); @@ -1692,6 +1698,13 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, return iter_xarray_get_pages_alloc(i, pages, maxsize, start); return -EFAULT; } +EXPORT_SYMBOL_GPL(iov_iter_get_pages_alloc_flags); + +ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages, + size_t maxsize, size_t *start) +{ + return iov_iter_get_pages_alloc_flags(i, pages, maxsize, start, 0); +} EXPORT_SYMBOL(iov_iter_get_pages_alloc); size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, From patchwork Fri Jan 28 00:26:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727609 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1581C4707E for ; Fri, 28 Jan 2022 00:26:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344766AbiA1A0o (ORCPT ); Thu, 27 Jan 2022 19:26:44 -0500 Received: from ale.deltatee.com ([204.191.154.188]:47182 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344612AbiA1A0e (ORCPT ); Thu, 27 Jan 2022 19:26:34 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=06xrVvtoAjmIinAQk61eUYk8K5xu+g+hinoJXPXmfIY=; b=OfzXlltJMxq3gS2+grdKIdu4qj pNrW4b+9M5LpdqTxnk1XYOcDzd5sgEQAjsyP2AMJ90RwWXPlUE0qwyvBvkFe/uwG24thUIFowHlWz AgwDoi5KSSjGXB+pFolbCqFgldtLxrIfDDXdZh+GY0lTgBK/AH6nSD5Euur1kyMGDNNQkjW6YPZ+y qtvw6xepTrb2L4huQSSyb/3moKvDDRsD4oGQalhPJZgXyviy1FReSq3qj6fVDQqRBB1JYGOGZoM+n wLZrhunhed8GYxvvt+5L1BFHkUlwcfmZKpM1FBe9/DodXasZIyyXlHjkT8id4sGEdzYjg14sWtmnj +OgqnvBQ==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5s-005OcZ-JQ; Thu, 27 Jan 2022 17:26:25 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5p-0001dD-7X; Thu, 27 Jan 2022 17:26:21 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:26:08 -0700 Message-Id: <20220128002614.6136-19-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 18/24] block: add check when merging zone device pages X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Consecutive zone device pages should not be merged into the same sgl or bvec segment with other types of pages or if they belong to different pgmaps. Otherwise getting the pgmap of a given segment is not possible without scanning the entire segment. This helper returns true either if both pages are not zone device pages or both pages are zone device pages with the same pgmap. Add a helper to determine if zone device pages are mergeable and use this helper in page_is_mergeable(). Signed-off-by: Logan Gunthorpe --- block/bio.c | 2 ++ include/linux/mm.h | 23 +++++++++++++++++++++++ 2 files changed, 25 insertions(+) diff --git a/block/bio.c b/block/bio.c index 4312a8085396..055fcf159461 100644 --- a/block/bio.c +++ b/block/bio.c @@ -806,6 +806,8 @@ static inline bool page_is_mergeable(const struct bio_vec *bv, return false; if (xen_domain() && !xen_biovec_phys_mergeable(bv, page)) return false; + if (!zone_device_pages_have_same_pgmap(bv->bv_page, page)) + return false; *same_page = ((vec_end_addr & PAGE_MASK) == page_addr); if (*same_page) diff --git a/include/linux/mm.h b/include/linux/mm.h index 24f44230dcbf..4a8e8cddd910 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -1080,6 +1080,24 @@ static inline bool is_zone_device_page(const struct page *page) { return page_zonenum(page) == ZONE_DEVICE; } + +/* + * Consecutive zone device pages should not be merged into the same sgl + * or bvec segment with other types of pages or if they belong to different + * pgmaps. Otherwise getting the pgmap of a given segment is not possible + * without scanning the entire segment. This helper returns true either if + * both pages are not zone device pages or both pages are zone device pages + * with the same pgmap. + */ +static inline bool zone_device_pages_have_same_pgmap(const struct page *a, + const struct page *b) +{ + if (is_zone_device_page(a) != is_zone_device_page(b)) + return false; + if (!is_zone_device_page(a)) + return true; + return a->pgmap == b->pgmap; +} extern void memmap_init_zone_device(struct zone *, unsigned long, unsigned long, struct dev_pagemap *); #else @@ -1087,6 +1105,11 @@ static inline bool is_zone_device_page(const struct page *page) { return false; } +static inline bool zone_device_pages_have_same_pgmap(const struct page *a, + const struct page *b) +{ + return true; +} #endif static inline bool is_zone_movable_page(const struct page *page) From patchwork Fri Jan 28 00:26:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727603 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AA525C43217 for ; Fri, 28 Jan 2022 00:26:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344568AbiA1A0b (ORCPT ); Thu, 27 Jan 2022 19:26:31 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46998 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344485AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=UO35KE8YVx2v+yNTFMhmKCQE8ptUbM1lG6ODy8MFD4U=; b=sBbcFg82SAk1otTpMskAScMiG7 RyAKAQkLTz1xmy8xTlaHSy36dHLKBWxmZ36QoMwA4kkP1Jl1psr1WnOOxIkWmE9XtLADFmwdP9Al8 ICSCs2cktw8nLKLELqHswQ/9zbp0P2ctwkft2z7KTrR3Sn8uIMRmhiPnKpbppo9L3h5AIBakUCRp6 ACGKOFIT2MqbGv8JvJsn9xNK3WT0DFPSr3AFe/PofI/wklbT7AxT8jWUom0daiNDBGZ6rcIONs0MR Hx6YbD9XhVp29jfmuXGV9Ri0UA0Ly2AAO8I6iJbJsPYwK+DBu9x+77HhVcbEqqwc0x0ilK1AB7U1m XP4zUiFw==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5r-005OcV-Sd; Thu, 27 Jan 2022 17:26:25 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5p-0001dH-Dg; Thu, 27 Jan 2022 17:26:21 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:26:09 -0700 Message-Id: <20220128002614.6136-20-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 19/24] lib/scatterlist: add check when merging zone device pages X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Consecutive zone device pages should not be merged into the same sgl or bvec segment with other types of pages or if they belong to different pgmaps. Otherwise getting the pgmap of a given segment is not possible without scanning the entire segment. This helper returns true either if both pages are not zone device pages or both pages are zone device pages with the same pgmap. Factor out the check for page mergability into a pages_are_mergable() helper and add a check with zone_device_pages_are_mergeable(). Signed-off-by: Logan Gunthorpe --- lib/scatterlist.c | 25 +++++++++++++++---------- 1 file changed, 15 insertions(+), 10 deletions(-) diff --git a/lib/scatterlist.c b/lib/scatterlist.c index d5e82e4a57ad..af53a0984f76 100644 --- a/lib/scatterlist.c +++ b/lib/scatterlist.c @@ -410,6 +410,15 @@ static struct scatterlist *get_next_sg(struct sg_append_table *table, return new_sg; } +static bool pages_are_mergeable(struct page *a, struct page *b) +{ + if (page_to_pfn(a) != page_to_pfn(b) + 1) + return false; + if (!zone_device_pages_have_same_pgmap(a, b)) + return false; + return true; +} + /** * sg_alloc_append_table_from_pages - Allocate and initialize an append sg * table from an array of pages @@ -447,6 +456,7 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append, unsigned int chunks, cur_page, seg_len, i, prv_len = 0; unsigned int added_nents = 0; struct scatterlist *s = sgt_append->prv; + struct page *last_pg; /* * The algorithm below requires max_segment to be aligned to PAGE_SIZE @@ -460,21 +470,17 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append, return -EOPNOTSUPP; if (sgt_append->prv) { - unsigned long paddr = - (page_to_pfn(sg_page(sgt_append->prv)) * PAGE_SIZE + - sgt_append->prv->offset + sgt_append->prv->length) / - PAGE_SIZE; - if (WARN_ON(offset)) return -EINVAL; /* Merge contiguous pages into the last SG */ prv_len = sgt_append->prv->length; - while (n_pages && page_to_pfn(pages[0]) == paddr) { + last_pg = sg_page(sgt_append->prv); + while (n_pages && pages_are_mergeable(last_pg, pages[0])) { if (sgt_append->prv->length + PAGE_SIZE > max_segment) break; sgt_append->prv->length += PAGE_SIZE; - paddr++; + last_pg = pages[0]; pages++; n_pages--; } @@ -488,7 +494,7 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append, for (i = 1; i < n_pages; i++) { seg_len += PAGE_SIZE; if (seg_len >= max_segment || - page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1) { + !pages_are_mergeable(pages[i], pages[i - 1])) { chunks++; seg_len = 0; } @@ -504,8 +510,7 @@ int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append, for (j = cur_page + 1; j < n_pages; j++) { seg_len += PAGE_SIZE; if (seg_len >= max_segment || - page_to_pfn(pages[j]) != - page_to_pfn(pages[j - 1]) + 1) + !pages_are_mergeable(pages[j], pages[j - 1])) break; } From patchwork Fri Jan 28 00:26:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727599 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1FC90C41535 for ; Fri, 28 Jan 2022 00:26:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344534AbiA1A03 (ORCPT ); Thu, 27 Jan 2022 19:26:29 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46994 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236017AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=aCaIC1aGEU3aE4W87JAzdHtyIfK4Tw21zQ04iz9J0qw=; b=Z69R0PP15szxfRgF3lQdDjdoCU wo/mXI4lYi8n/BvLZ9xCK6yUFuAwlswjoRaY/u6opnOYHIK/Gx7JB2KPTj2h0TXcG3h5PRViuSnc/ 5j2UmCO1qqgon+0L5+MSOS7FHf0YXhTl3m1hb68GLJ52n1SG/G7AlNuIqfINyLJbCyQsJ8n5o3Mn8 0s+EKnXr6vLLVTG3bmSE7BwIhTELu+o1NYDb7m/nmgMUPMIF3uuM6tH3PiDhzrzshKt5jlQhCA96N rbA+x2+ZPskhkkYf59tRpEHy8nr8UvTeXF5fapebJ/4AWBYv/BmBwrRARsMHxDbEozilNtK5aYCh5 pEHFhSCg==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5r-005Oca-Ri; Thu, 27 Jan 2022 17:26:25 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5p-0001dL-Jc; Thu, 27 Jan 2022 17:26:21 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:26:10 -0700 Message-Id: <20220128002614.6136-21-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 20/24] block: set FOLL_PCI_P2PDMA in __bio_iov_iter_get_pages() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org When a bio's queue supports PCI P2PDMA, set FOLL_PCI_P2PDMA for iov_iter_get_pages_flags(). This allows PCI P2PDMA pages to be passed from userspace and enables the O_DIRECT path in iomap based filesystems and direct to block devices. Signed-off-by: Logan Gunthorpe --- block/bio.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/block/bio.c b/block/bio.c index 055fcf159461..cb19689cf9fe 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1121,6 +1121,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; struct page **pages = (struct page **)bv; bool same_page = false; + unsigned int flags = 0; ssize_t size, left; unsigned len, i; size_t offset; @@ -1133,7 +1134,12 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter) BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2); pages += entries_left * (PAGE_PTRS_PER_BVEC - 1); - size = iov_iter_get_pages(iter, pages, LONG_MAX, nr_pages, &offset); + if (bio->bi_bdev && bio->bi_bdev->bd_disk && + blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue)) + flags |= FOLL_PCI_P2PDMA; + + size = iov_iter_get_pages_flags(iter, pages, LONG_MAX, nr_pages, + &offset, flags); if (unlikely(size <= 0)) return size ? size : -EFAULT; From patchwork Fri Jan 28 00:26:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727596 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5E2BC4321E for ; Fri, 28 Jan 2022 00:26:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344519AbiA1A01 (ORCPT ); Thu, 27 Jan 2022 19:26:27 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46978 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235971AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=0VjpBnfEXTqeS7LssS80lKjFqlCuU84Zlsdnl3oeMpI=; b=s+iKXLCRsc+FunOheuw5aq1lFe A3yhmwa7yvIPlR+O4DL+kig4M+u0iProbdm2Z/k9TQ6V5GE6VD41v7xCnzaCpBc6VxhWojkEFUGlK yJFShL2es4N5xsnMFnWxQCqOI+6QIkN2s796jh173f2MhCcEo0wMO5G/HBgWs7UXUrfNK4XIomtIN IZLFtYyw+qCrWSirVzZW8oqDHd6scxRM/Bogwwrv+2TB8D16dLCievCu+ti7YfMxUGCeTBwKAtFB1 orWWKfeJH5SIoMbWa/ccX8n2rO/VQI36kH5LbgN8Qj+qk7MBOb/cNvZn5PLstKbV5YphGTQ269YBo wZym8kRA==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5r-005OcZ-Pd; Thu, 27 Jan 2022 17:26:24 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5p-0001dP-P3; Thu, 27 Jan 2022 17:26:21 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:26:11 -0700 Message-Id: <20220128002614.6136-22-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 21/24] block: set FOLL_PCI_P2PDMA in bio_map_user_iov() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org When a bio's queue supports PCI P2PDMA, set FOLL_PCI_P2PDMA for iov_iter_get_pages_flags(). This allows PCI P2PDMA pages to be passed from userspace and enables the NVMe passthru requests to use P2PDMA pages. Signed-off-by: Logan Gunthorpe --- block/blk-map.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/block/blk-map.c b/block/blk-map.c index 4526adde0156..7508448e290c 100644 --- a/block/blk-map.c +++ b/block/blk-map.c @@ -234,6 +234,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, gfp_t gfp_mask) { unsigned int max_sectors = queue_max_hw_sectors(rq->q); + unsigned int flags = 0; struct bio *bio; int ret; int j; @@ -246,13 +247,17 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter, return -ENOMEM; bio->bi_opf |= req_op(rq); + if (blk_queue_pci_p2pdma(rq->q)) + flags |= FOLL_PCI_P2PDMA; + while (iov_iter_count(iter)) { struct page **pages; ssize_t bytes; size_t offs, added = 0; int npages; - bytes = iov_iter_get_pages_alloc(iter, &pages, LONG_MAX, &offs); + bytes = iov_iter_get_pages_alloc_flags(iter, &pages, LONG_MAX, + &offs, flags); if (unlikely(bytes <= 0)) { ret = bytes ? bytes : -EFAULT; goto out_unmap; From patchwork Fri Jan 28 00:26:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727595 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75E39C433F5 for ; Fri, 28 Jan 2022 00:26:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344491AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46980 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235714AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=bRKwNbLS7GTorTVqHwjsOSnom3kuz6wsSSmUEePGzuc=; b=bbCGCv2EfpiJjjFaP9abLvC1cM Sm++joV+vIhFDE3ycOTZUG498KzRrSdEDKEN9SQIIZUdbhc/YoQ0DfjQMkERarIVbrRXVK6w2CSPZ AwcM72rvP+JqOvBa+3q4iZDXBxHGKa5KJVDotidYB/zfXPJtcPv9FjbpNJc4lVVYY36VJ1bW1BEBo CRlyTYkBXN6VJFG77/h9nzmY6aSthXmwEKGNBwhM4NSTdBkAVdpPgcp8zZPZDr4HFIa4YC4KZX4Yf 6V9SzQVNljq015M1B9bdjnNhtFDKymD2AgeAianLjdGqdqYecjHiGk8E1Tz1MtYAaWItofLbzhgHj e6SlzPQA==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5r-005OcX-Pi; Thu, 27 Jan 2022 17:26:24 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5p-0001dT-Uz; Thu, 27 Jan 2022 17:26:22 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:26:12 -0700 Message-Id: <20220128002614.6136-23-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 22/24] mm: use custom page_free for P2PDMA pages X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org When P2PDMA pages are passed to userspace, they will need to be reference counted properly and returned to their genalloc after their reference count returns to 1. This is accomplished with the existing DEV_PAGEMAP_OPS and the .page_free() operation. Change CONFIG_P2PDMA to select CONFIG_DEV_PAGEMAP_OPS and add MEMORY_DEVICE_PCI_P2PDMA to page_is_devmap_managed(), devmap_managed_enable_[put|get]() and free_devmap_managed_page(). Signed-off-by: Logan Gunthorpe --- drivers/pci/p2pdma.c | 13 +++++++++++++ mm/memremap.c | 5 +++++ 2 files changed, 18 insertions(+) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index e66694cc9e14..3a24bf5099cf 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -101,6 +101,18 @@ static const struct attribute_group p2pmem_group = { .name = "p2pmem", }; +static void p2pdma_page_free(struct page *page) +{ + struct pci_p2pdma_pagemap *pgmap = to_p2p_pgmap(page->pgmap); + + gen_pool_free(pgmap->provider->p2pdma->pool, + (uintptr_t)page_to_virt(page), PAGE_SIZE); +} + +static const struct dev_pagemap_ops p2pdma_pgmap_ops = { + .page_free = p2pdma_page_free, +}; + static void pci_p2pdma_release(void *data) { struct pci_dev *pdev = data; @@ -198,6 +210,7 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size, pgmap->range.end = pgmap->range.start + size - 1; pgmap->nr_range = 1; pgmap->type = MEMORY_DEVICE_PCI_P2PDMA; + pgmap->ops = &p2pdma_pgmap_ops; p2p_pgmap->provider = pdev; p2p_pgmap->bus_offset = pci_bus_address(pdev, bar) - diff --git a/mm/memremap.c b/mm/memremap.c index 0fc8b85d792e..ced9448ddcb4 100644 --- a/mm/memremap.c +++ b/mm/memremap.c @@ -299,6 +299,10 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid) case MEMORY_DEVICE_GENERIC: break; case MEMORY_DEVICE_PCI_P2PDMA: + if (!pgmap->ops->page_free) { + WARN(1, "Missing page_free method\n"); + return ERR_PTR(-EINVAL); + } params.pgprot = pgprot_noncached(params.pgprot); break; default: @@ -460,6 +464,7 @@ void free_zone_device_page(struct page *page) { switch (page->pgmap->type) { case MEMORY_DEVICE_PRIVATE: + case MEMORY_DEVICE_PCI_P2PDMA: free_device_page(page); return; case MEMORY_DEVICE_FS_DAX: From patchwork Fri Jan 28 00:26:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727601 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16758C3525C for ; Fri, 28 Jan 2022 00:26:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344541AbiA1A0a (ORCPT ); Thu, 27 Jan 2022 19:26:30 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46988 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235832AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=FrX57yxjwJSYeD2carGS/mzSZPrJAINwqTf9rbkxCmA=; b=pkJD/Mwl6q1z7Wo1+e+4Vm+B7Y yclsTAlKRJ53J5K2njJTfYROnKo6lWCf33FSPx0GXTHmImv7RYxcjdl/9AKOHzItLzHLoF7CTzVWS GjKLVhtTOzhN56CEFxx9/tfwRo86Lbeqbx91I3TdDAohPTvU5rJvYKgrzDI2jX6Xb9eS+zf24PXh2 QVOBBUxx+m91FxqPYDeZqGqDWQmgf0AYkCxxePWCDK/XeivQerXvHAmdnHFEgxM/a+r1GdoxVawRI wE0mnTT04klrKe98naHoP+qgLHFwtT48//25Qn7zRc2AFd5RZBtnRCoSSvvEvLaeYZS756gbTAboY JO+FYzSQ==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5r-005OcY-D9; Thu, 27 Jan 2022 17:26:24 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5q-0001dX-51; Thu, 27 Jan 2022 17:26:22 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe , Bjorn Helgaas Date: Thu, 27 Jan 2022 17:26:13 -0700 Message-Id: <20220128002614.6136-24-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com, bhelgaas@google.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 23/24] PCI/P2PDMA: Introduce pci_mmap_p2pmem() X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Introduce pci_mmap_p2pmem() which is a helper to allocate and mmap a hunk of p2pmem into userspace. Pages are allocated from the genalloc in bulk and their reference count incremented. They are returned to the genalloc when the page is put. The VMA does not take a reference to the pages when they are inserted with vmf_insert_mixed() (which is necessary for zone device pages) so the backing P2P memory is stored in a structures in vm_private_data. A pseudo mount is used to allocate an inode for each PCI device. The inode's address_space is used in the file doing the mmap so that all VMAs are collected and can be unmapped if the PCI device is unbound. After unmapping, the VMAs are iterated through and their pages are put so the device can continue to be unbound. An active flag is used to signal to VMAs not to allocate any further P2P memory once the removal process starts. The flag is synchronized with concurrent access with an RCU lock. The VMAs and inode will survive after the unbind of the device, but no pages will be present in the VMA and a subsequent access will result in a SIGBUS error. Signed-off-by: Logan Gunthorpe Acked-by: Bjorn Helgaas --- drivers/pci/p2pdma.c | 301 ++++++++++++++++++++++++++++++++++++- include/linux/pci-p2pdma.h | 11 ++ include/uapi/linux/magic.h | 1 + 3 files changed, 311 insertions(+), 2 deletions(-) diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c index 3a24bf5099cf..d54068d6ce6a 100644 --- a/drivers/pci/p2pdma.c +++ b/drivers/pci/p2pdma.c @@ -17,14 +17,19 @@ #include #include #include +#include +#include #include #include #include +#include struct pci_p2pdma { struct gen_pool *pool; bool p2pmem_published; struct xarray map_types; + struct inode *inode; + bool active; }; struct pci_p2pdma_pagemap { @@ -33,6 +38,15 @@ struct pci_p2pdma_pagemap { u64 bus_offset; }; +struct pci_p2pdma_map { + struct kref ref; + struct rcu_head rcu; + struct pci_dev *pdev; + struct inode *inode; + void *kaddr; + size_t len; +}; + static struct pci_p2pdma_pagemap *to_p2p_pgmap(struct dev_pagemap *pgmap) { return container_of(pgmap, struct pci_p2pdma_pagemap, pgmap); @@ -101,6 +115,26 @@ static const struct attribute_group p2pmem_group = { .name = "p2pmem", }; +/* + * P2PDMA internal mount + * Fake an internal VFS mount-point in order to allocate struct address_space + * mappings to remove VMAs on unbind events. + */ +static int pci_p2pdma_fs_cnt; +static struct vfsmount *pci_p2pdma_fs_mnt; + +static int pci_p2pdma_fs_init_fs_context(struct fs_context *fc) +{ + return init_pseudo(fc, P2PDMA_MAGIC) ? 0 : -ENOMEM; +} + +static struct file_system_type pci_p2pdma_fs_type = { + .name = "p2dma", + .owner = THIS_MODULE, + .init_fs_context = pci_p2pdma_fs_init_fs_context, + .kill_sb = kill_anon_super, +}; + static void p2pdma_page_free(struct page *page) { struct pci_p2pdma_pagemap *pgmap = to_p2p_pgmap(page->pgmap); @@ -129,6 +163,9 @@ static void pci_p2pdma_release(void *data) gen_pool_destroy(p2pdma->pool); sysfs_remove_group(&pdev->dev.kobj, &p2pmem_group); xa_destroy(&p2pdma->map_types); + + iput(p2pdma->inode); + simple_release_fs(&pci_p2pdma_fs_mnt, &pci_p2pdma_fs_cnt); } static int pci_p2pdma_setup(struct pci_dev *pdev) @@ -146,17 +183,32 @@ static int pci_p2pdma_setup(struct pci_dev *pdev) if (!p2p->pool) goto out; - error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_release, pdev); + error = simple_pin_fs(&pci_p2pdma_fs_type, &pci_p2pdma_fs_mnt, + &pci_p2pdma_fs_cnt); if (error) goto out_pool_destroy; + p2p->inode = alloc_anon_inode(pci_p2pdma_fs_mnt->mnt_sb); + if (IS_ERR(p2p->inode)) { + error = -ENOMEM; + goto out_unpin_fs; + } + + error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_release, pdev); + if (error) + goto out_put_inode; + error = sysfs_create_group(&pdev->dev.kobj, &p2pmem_group); if (error) - goto out_pool_destroy; + goto out_put_inode; rcu_assign_pointer(pdev->p2pdma, p2p); return 0; +out_put_inode: + iput(p2p->inode); +out_unpin_fs: + simple_release_fs(&pci_p2pdma_fs_mnt, &pci_p2pdma_fs_cnt); out_pool_destroy: gen_pool_destroy(p2p->pool); out: @@ -164,6 +216,54 @@ static int pci_p2pdma_setup(struct pci_dev *pdev) return error; } +static void pci_p2pdma_map_free_pages(struct pci_p2pdma_map *pmap) +{ + int i; + + if (!pmap->kaddr) + return; + + for (i = 0; i < pmap->len; i += PAGE_SIZE) + put_page(virt_to_page(pmap->kaddr + i)); + + pmap->kaddr = NULL; +} + +static void pci_p2pdma_free_mappings(struct address_space *mapping) +{ + struct vm_area_struct *vma; + + i_mmap_lock_write(mapping); + if (RB_EMPTY_ROOT(&mapping->i_mmap.rb_root)) + goto out; + + vma_interval_tree_foreach(vma, &mapping->i_mmap, 0, -1) + pci_p2pdma_map_free_pages(vma->vm_private_data); + +out: + i_mmap_unlock_write(mapping); +} + +static void pci_p2pdma_unmap_mappings(void *data) +{ + struct pci_dev *pdev = data; + struct pci_p2pdma *p2pdma = rcu_dereference_protected(pdev->p2pdma, 1); + + /* Ensure no new pages can be allocated in mappings */ + p2pdma->active = false; + synchronize_rcu(); + + unmap_mapping_range(p2pdma->inode->i_mapping, 0, 0, 1); + + /* + * On some architectures, TLB flushes are done with call_rcu() + * so to ensure GUP fast is done with the pages, call synchronize_rcu() + * before freeing them. + */ + synchronize_rcu(); + pci_p2pdma_free_mappings(p2pdma->inode->i_mapping); +} + /** * pci_p2pdma_add_resource - add memory for use as p2p memory * @pdev: the device to add the memory to @@ -222,6 +322,11 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size, goto pgmap_free; } + error = devm_add_action_or_reset(&pdev->dev, pci_p2pdma_unmap_mappings, + pdev); + if (error) + goto pages_free; + p2pdma = rcu_dereference_protected(pdev->p2pdma, 1); error = gen_pool_add_owner(p2pdma->pool, (unsigned long)addr, pci_bus_address(pdev, bar) + offset, @@ -230,6 +335,7 @@ int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size, if (error) goto pages_free; + p2pdma->active = true; pci_info(pdev, "added peer-to-peer DMA memory %#llx-%#llx\n", pgmap->range.start, pgmap->range.end); @@ -1030,3 +1136,194 @@ ssize_t pci_p2pdma_enable_show(char *page, struct pci_dev *p2p_dev, return sprintf(page, "%s\n", pci_name(p2p_dev)); } EXPORT_SYMBOL_GPL(pci_p2pdma_enable_show); + +static struct pci_p2pdma_map *pci_p2pdma_map_alloc(struct pci_dev *pdev, + size_t len) +{ + struct pci_p2pdma_map *pmap; + + pmap = kzalloc(sizeof(*pmap), GFP_KERNEL); + if (!pmap) + return NULL; + + kref_init(&pmap->ref); + pmap->pdev = pci_dev_get(pdev); + pmap->len = len; + + return pmap; +} + +static void pci_p2pdma_map_free(struct rcu_head *rcu) +{ + struct pci_p2pdma_map *pmap = + container_of(rcu, struct pci_p2pdma_map, rcu); + + pci_p2pdma_map_free_pages(pmap); + kfree(pmap); +} + +static void pci_p2pdma_map_release(struct kref *ref) +{ + struct pci_p2pdma_map *pmap = + container_of(ref, struct pci_p2pdma_map, ref); + + iput(pmap->inode); + simple_release_fs(&pci_p2pdma_fs_mnt, &pci_p2pdma_fs_cnt); + pci_dev_put(pmap->pdev); + + if (pmap->kaddr) { + /* + * Make sure to wait for the TLB flush (which some + * architectures do using call_rcu()) before returning the + * pages to the genalloc. This ensures the pages are not reused + * before GUP-fast is finished with them. So the mapping is + * freed using call_rcu() seeing adding synchronize_rcu() to + * the munmap path can cause long delays on large systems + * during process cleanup. + */ + call_rcu(&pmap->rcu, pci_p2pdma_map_free); + return; + } + + /* + * If there are no pages, just free the object immediately. There + * are no more references to it so there is nothing that can race + * with adding the pages. + */ + pci_p2pdma_map_free(&pmap->rcu); +} + +static void pci_p2pdma_vma_open(struct vm_area_struct *vma) +{ + struct pci_p2pdma_map *pmap = vma->vm_private_data; + + kref_get(&pmap->ref); +} + +static void pci_p2pdma_vma_close(struct vm_area_struct *vma) +{ + struct pci_p2pdma_map *pmap = vma->vm_private_data; + + kref_put(&pmap->ref, pci_p2pdma_map_release); +} + +static vm_fault_t pci_p2pdma_vma_fault(struct vm_fault *vmf) +{ + struct pci_p2pdma_map *pmap = vmf->vma->vm_private_data; + struct pci_p2pdma *p2pdma; + void *vaddr; + pfn_t pfn; + int i; + + if (!pmap->kaddr) { + rcu_read_lock(); + p2pdma = rcu_dereference(pmap->pdev->p2pdma); + if (!p2pdma) + goto err_out; + + if (!p2pdma->active) + goto err_out; + + pmap->kaddr = (void *)gen_pool_alloc(p2pdma->pool, pmap->len); + if (!pmap->kaddr) + goto err_out; + + for (i = 0; i < pmap->len; i += PAGE_SIZE) + get_page(virt_to_page(pmap->kaddr + i)); + + rcu_read_unlock(); + } + + vaddr = pmap->kaddr + (vmf->pgoff << PAGE_SHIFT); + pfn = phys_to_pfn_t(virt_to_phys(vaddr), PFN_DEV | PFN_MAP); + + return vmf_insert_mixed(vmf->vma, vmf->address, pfn); + +err_out: + rcu_read_unlock(); + return VM_FAULT_SIGBUS; +} +static const struct vm_operations_struct pci_p2pdma_vmops = { + .open = pci_p2pdma_vma_open, + .close = pci_p2pdma_vma_close, + .fault = pci_p2pdma_vma_fault, +}; + +/** + * pci_p2pdma_file_open - setup file mapping to store P2PMEM VMAs + * @pdev: the device to allocate memory from + * @vma: the userspace vma to map the memory to + * + * Set f_mapping of the file to the p2pdma inode so that mappings + * are can be torn down on device unbind. + * + * Returns 0 on success, or a negative error code on failure + */ +void pci_p2pdma_file_open(struct pci_dev *pdev, struct file *file) +{ + struct pci_p2pdma *p2pdma; + + rcu_read_lock(); + p2pdma = rcu_dereference(pdev->p2pdma); + if (p2pdma) + file->f_mapping = p2pdma->inode->i_mapping; + rcu_read_unlock(); +} +EXPORT_SYMBOL_GPL(pci_p2pdma_file_open); + +/** + * pci_mmap_p2pmem - setup an mmap region to be backed with P2PDMA memory + * that was registered with pci_p2pdma_add_resource() + * @pdev: the device to allocate memory from + * @vma: the userspace vma to map the memory to + * + * The file must call pci_p2pdma_mmap_file_open() in its open() operation. + * + * Returns 0 on success, or a negative error code on failure + */ +int pci_mmap_p2pmem(struct pci_dev *pdev, struct vm_area_struct *vma) +{ + struct pci_p2pdma_map *pmap; + struct pci_p2pdma *p2pdma; + int ret; + + /* prevent private mappings from being established */ + if ((vma->vm_flags & VM_MAYSHARE) != VM_MAYSHARE) { + pci_info_ratelimited(pdev, + "%s: fail, attempted private mapping\n", + current->comm); + return -EINVAL; + } + + pmap = pci_p2pdma_map_alloc(pdev, vma->vm_end - vma->vm_start); + if (!pmap) + return -ENOMEM; + + rcu_read_lock(); + p2pdma = rcu_dereference(pdev->p2pdma); + if (!p2pdma) { + ret = -ENODEV; + goto out; + } + + ret = simple_pin_fs(&pci_p2pdma_fs_type, &pci_p2pdma_fs_mnt, + &pci_p2pdma_fs_cnt); + if (ret) + goto out; + + ihold(p2pdma->inode); + pmap->inode = p2pdma->inode; + rcu_read_unlock(); + + vma->vm_flags |= VM_MIXEDMAP; + vma->vm_private_data = pmap; + vma->vm_ops = &pci_p2pdma_vmops; + + return 0; + +out: + rcu_read_unlock(); + kfree(pmap); + return ret; +} +EXPORT_SYMBOL_GPL(pci_mmap_p2pmem); diff --git a/include/linux/pci-p2pdma.h b/include/linux/pci-p2pdma.h index 2c07aa6b7665..040d79126463 100644 --- a/include/linux/pci-p2pdma.h +++ b/include/linux/pci-p2pdma.h @@ -34,6 +34,8 @@ int pci_p2pdma_enable_store(const char *page, struct pci_dev **p2p_dev, bool *use_p2pdma); ssize_t pci_p2pdma_enable_show(char *page, struct pci_dev *p2p_dev, bool use_p2pdma); +void pci_p2pdma_file_open(struct pci_dev *pdev, struct file *file); +int pci_mmap_p2pmem(struct pci_dev *pdev, struct vm_area_struct *vma); #else /* CONFIG_PCI_P2PDMA */ static inline int pci_p2pdma_add_resource(struct pci_dev *pdev, int bar, size_t size, u64 offset) @@ -90,6 +92,15 @@ static inline ssize_t pci_p2pdma_enable_show(char *page, { return sprintf(page, "none\n"); } +static inline void pci_p2pdma_file_open(struct pci_dev *pdev, + struct file *file) +{ +} +static inline int pci_mmap_p2pmem(struct pci_dev *pdev, + struct vm_area_struct *vma) +{ + return -EOPNOTSUPP; +} #endif /* CONFIG_PCI_P2PDMA */ diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index 0425cd79af9a..bf5af400fb7d 100644 --- a/include/uapi/linux/magic.h +++ b/include/uapi/linux/magic.h @@ -94,6 +94,7 @@ #define BPF_FS_MAGIC 0xcafe4a11 #define AAFS_MAGIC 0x5a3c69f0 #define ZONEFS_MAGIC 0x5a4f4653 +#define P2PDMA_MAGIC 0x70327064 /* Since UDF 2.01 is ISO 13346 based... */ #define UDF_SUPER_MAGIC 0x15013346 From patchwork Fri Jan 28 00:26:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Logan Gunthorpe X-Patchwork-Id: 12727605 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC5E4C4321E for ; Fri, 28 Jan 2022 00:26:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344571AbiA1A0b (ORCPT ); Thu, 27 Jan 2022 19:26:31 -0500 Received: from ale.deltatee.com ([204.191.154.188]:46968 "EHLO ale.deltatee.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235959AbiA1A00 (ORCPT ); Thu, 27 Jan 2022 19:26:26 -0500 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=deltatee.com; s=20200525; h=Subject:MIME-Version:References:In-Reply-To: Message-Id:Date:Cc:To:From:content-disposition; bh=64vXVneAT9A14nJ5Wy4DU2Ru5X/AwQFCZoUVhV2cz8s=; b=ZnQOa3fNCsoceXpkfTOvkkxhQ3 YPnv8R6b2Pia3DQGSm/S40Iw4dBsQ5v3Ro2prXcn89tVI0MzOTjtMLCPtkjEysFf36McWDRAuOuxo ZIg7pNMDR3disto8RqevSoa5xVhahTIzavvy5qG++ccgQhBj4GG4J2IwOZXx3/P9xrXDeih/JvVcK uPgAqYGjZgMaZG9WePCJZWo7eMidSu1Yain3cpxonh4FHpfclfmfJsj2uv/5fCZNs0tMHmuR5Q0Db EF3FYDhp3FzdmOhXgut3tml0gqzsP6+8/6TBEGj1FP8uQakf6HyYBWeoW1hFJSf3CO5UxpfraTn2c 3fUhej/g==; Received: from cgy1-donard.priv.deltatee.com ([172.16.1.31]) by ale.deltatee.com with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1nDF5q-005Oca-PY; Thu, 27 Jan 2022 17:26:23 -0700 Received: from gunthorp by cgy1-donard.priv.deltatee.com with local (Exim 4.94.2) (envelope-from ) id 1nDF5q-0001db-As; Thu, 27 Jan 2022 17:26:22 -0700 From: Logan Gunthorpe To: linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org Cc: Stephen Bates , Christoph Hellwig , Dan Williams , Jason Gunthorpe , =?utf-8?q?Christian_K=C3=B6nig?= , John Hubbard , Don Dutile , Matthew Wilcox , Daniel Vetter , Jakowski Andrzej , Minturn Dave B , Jason Ekstrand , Dave Hansen , Xiong Jianxin , Bjorn Helgaas , Ira Weiny , Robin Murphy , Martin Oliveira , Chaitanya Kulkarni , Ralph Campbell , Logan Gunthorpe Date: Thu, 27 Jan 2022 17:26:14 -0700 Message-Id: <20220128002614.6136-25-logang@deltatee.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220128002614.6136-1-logang@deltatee.com> References: <20220128002614.6136-1-logang@deltatee.com> MIME-Version: 1.0 X-SA-Exim-Connect-IP: 172.16.1.31 X-SA-Exim-Rcpt-To: linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-pci@vger.kernel.org, linux-mm@kvack.org, iommu@lists.linux-foundation.org, sbates@raithlin.com, hch@lst.de, jgg@ziepe.ca, christian.koenig@amd.com, ddutile@redhat.com, willy@infradead.org, daniel.vetter@ffwll.ch, jason@jlekstrand.net, dave.hansen@linux.intel.com, helgaas@kernel.org, dan.j.williams@intel.com, andrzej.jakowski@intel.com, dave.b.minturn@intel.com, jianxin.xiong@intel.com, ira.weiny@intel.com, robin.murphy@arm.com, martin.oliveira@eideticom.com, ckulkarnilinux@gmail.com, jhubbard@nvidia.com, rcampbell@nvidia.com, logang@deltatee.com X-SA-Exim-Mail-From: gunthorp@deltatee.com Subject: [PATCH v5 24/24] nvme-pci: allow mmaping the CMB in userspace X-SA-Exim-Version: 4.2.1 (built Sat, 13 Feb 2021 17:57:42 +0000) X-SA-Exim-Scanned: Yes (on ale.deltatee.com) Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org Allow userspace to obtain CMB memory by mmaping the controller's char device. The mmap call allocates and returns a hunk of CMB memory, (the offset is ignored) so userspace does not have control over the address within the CMB. A VMA allocated in this way will only be usable by drivers that set FOLL_PCI_P2PDMA when calling GUP. And inter-device support will be checked the first time the pages are mapped for DMA. Currently this is only supported by O_DIRECT to an PCI NVMe device or through the NVMe passthrough IOCTL. Signed-off-by: Logan Gunthorpe --- drivers/nvme/host/core.c | 15 +++++++++++++++ drivers/nvme/host/nvme.h | 2 ++ drivers/nvme/host/pci.c | 17 +++++++++++++++++ 3 files changed, 34 insertions(+) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index ecb01984b55d..54a1c350098e 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -3142,6 +3142,10 @@ static int nvme_dev_open(struct inode *inode, struct file *file) } file->private_data = ctrl; + + if (ctrl->ops->cdev_file_open) + ctrl->ops->cdev_file_open(ctrl, file); + return 0; } @@ -3155,12 +3159,23 @@ static int nvme_dev_release(struct inode *inode, struct file *file) return 0; } +static int nvme_dev_mmap(struct file *file, struct vm_area_struct *vma) +{ + struct nvme_ctrl *ctrl = file->private_data; + + if (!ctrl->ops->mmap_cmb) + return -ENODEV; + + return ctrl->ops->mmap_cmb(ctrl, vma); +} + static const struct file_operations nvme_dev_fops = { .owner = THIS_MODULE, .open = nvme_dev_open, .release = nvme_dev_release, .unlocked_ioctl = nvme_dev_ioctl, .compat_ioctl = compat_ptr_ioctl, + .mmap = nvme_dev_mmap, }; static ssize_t nvme_sysfs_reset(struct device *dev, diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 97efc6d0b146..6b1649d50bdf 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -494,6 +494,8 @@ struct nvme_ctrl_ops { void (*delete_ctrl)(struct nvme_ctrl *ctrl); int (*get_address)(struct nvme_ctrl *ctrl, char *buf, int size); bool (*supports_pci_p2pdma)(struct nvme_ctrl *ctrl); + void (*cdev_file_open)(struct nvme_ctrl *ctrl, struct file *file); + int (*mmap_cmb)(struct nvme_ctrl *ctrl, struct vm_area_struct *vma); }; /* diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 330515886dfc..47e47ffc12bf 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2961,6 +2961,21 @@ static bool nvme_pci_supports_pci_p2pdma(struct nvme_ctrl *ctrl) return dma_pci_p2pdma_supported(dev->dev); } +static void nvme_pci_cdev_file_open(struct nvme_ctrl *ctrl, struct file *file) +{ + struct pci_dev *pdev = to_pci_dev(to_nvme_dev(ctrl)->dev); + + pci_p2pdma_file_open(pdev, file); +} + +static int nvme_pci_mmap_cmb(struct nvme_ctrl *ctrl, + struct vm_area_struct *vma) +{ + struct pci_dev *pdev = to_pci_dev(to_nvme_dev(ctrl)->dev); + + return pci_mmap_p2pmem(pdev, vma); +} + static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = { .name = "pcie", .module = THIS_MODULE, @@ -2972,6 +2987,8 @@ static const struct nvme_ctrl_ops nvme_pci_ctrl_ops = { .submit_async_event = nvme_pci_submit_async_event, .get_address = nvme_pci_get_address, .supports_pci_p2pdma = nvme_pci_supports_pci_p2pdma, + .cdev_file_open = nvme_pci_cdev_file_open, + .mmap_cmb = nvme_pci_mmap_cmb, }; static int nvme_dev_map(struct nvme_dev *dev)