[RFC,v2,20/21] nvme-pci: use new dma API

From: Leon Romanovsky <leonro@nvidia.com>

This demonstrates how the new DMA API can fit into the NVMe driver and
replace the old DMA APIs.

As this is an RFC, I expect to add more robust error handling,
optimizations, and in-depth testing for the final version once we agree
on the DMA API architecture.

Following is the performance comparison for the existing DMA API case
with sg_table and with dma_map. Once we have agreement on the new DMA
API design, I intend to get similar profiling numbers for the new DMA API.

sgl (sg_table + old DMA API) vs no_sgl (iod_dma_map + new DMA API):

block size                               IOPS (k) Average of 3

4K
--------------------------------------------------------------
sg-list-fio-perf.bs-4k-1.fio:             68.6
sg-list-fio-perf.bs-4k-2.fio:             68       68.36
sg-list-fio-perf.bs-4k-3.fio:             68.5

no-sg-list-fio-perf.bs-4k-1.fio:          68.7
no-sg-list-fio-perf.bs-4k-2.fio:          68.5     68.43
no-sg-list-fio-perf.bs-4k-3.fio:          68.1

% Change default vs new DMA API =       +0.0975%

8K
--------------------------------------------------------------
sg-list-fio-perf.bs-8k-1.fio:             67
sg-list-fio-perf.bs-8k-2.fio:             67.1     67.03
sg-list-fio-perf.bs-8k-3.fio:             67

no-sg-list-fio-perf.bs-8k-1.fio:          66.7
no-sg-list-fio-perf.bs-8k-2.fio:          66.7     66.7
no-sg-list-fio-perf.bs-8k-3.fio:          66.7

% Change default vs new DMA API =       -0.4993%

16K
--------------------------------------------------------------
sg-list-fio-perf.bs-16k-1.fio:            63.8
sg-list-fio-perf.bs-16k-2.fio:            63.4     63.5
sg-list-fio-perf.bs-16k-3.fio:            63.3

no-sg-list-fio-perf.bs-16k-1.fio:         63.5
no-sg-list-fio-perf.bs-16k-2.fio:         63.4     63.33
no-sg-list-fio-perf.bs-16k-3.fio:         63.1

% Change default vs new DMA API =       -0.2632%

32K
--------------------------------------------------------------
sg-list-fio-perf.bs-32k-1.fio:            59.3
sg-list-fio-perf.bs-32k-2.fio:            59.3     59.36
sg-list-fio-perf.bs-32k-3.fio:            59.5

no-sg-list-fio-perf.bs-32k-1.fio:         59.5
no-sg-list-fio-perf.bs-32k-2.fio:         59.6     59.43
no-sg-list-fio-perf.bs-32k-3.fio:         59.2

% Change default vs new DMA API =       +0.1122%

64K
--------------------------------------------------------------
sg-list-fio-perf.bs-64k-1.fio:            53.7
sg-list-fio-perf.bs-64k-2.fio:            53.4     53.56
sg-list-fio-perf.bs-64k-3.fio:            53.6

no-sg-list-fio-perf.bs-64k-1.fio:         53.5
no-sg-list-fio-perf.bs-64k-2.fio:         53.8     53.63
no-sg-list-fio-perf.bs-64k-3.fio:         53.6

% Change default vs new DMA API =       +0.1246%

128K
--------------------------------------------------------------
sg-list-fio-perf/bs-128k-1.fio:           48
sg-list-fio-perf/bs-128k-2.fio:           46.4     47.13
sg-list-fio-perf/bs-128k-3.fio:           47

no-sg-list-fio-perf/bs-128k-1.fio:        46.6
no-sg-list-fio-perf/bs-128k-2.fio:        47       46.9
no-sg-list-fio-perf/bs-128k-3.fio:        47.1

% Change default vs new DMA API =       -0.495%

256K
--------------------------------------------------------------
sg-list-fio-perf/bs-256k-1.fio:           37
sg-list-fio-perf/bs-256k-2.fio:           41       39.93
sg-list-fio-perf/bs-256k-3.fio:           41.8

no-sg-list-fio-perf/bs-256k-1.fio:        37.5
no-sg-list-fio-perf/bs-256k-2.fio:        41.4     40.5
no-sg-list-fio-perf/bs-256k-3.fio:        42.6

% Change default vs new DMA API =       +1.42%

512K
--------------------------------------------------------------
sg-list-fio-perf/bs-512k-1.fio:           28.5
sg-list-fio-perf/bs-512k-2.fio:           28.2     28.4
sg-list-fio-perf/bs-512k-3.fio:           28.5

no-sg-list-fio-perf/bs-512k-1.fio:        28.7
no-sg-list-fio-perf/bs-512k-2.fio:        28.6     28.7
no-sg-list-fio-perf/bs-512k-3.fio:        28.8

% Change default vs new DMA API =       +1.06%
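
The averages and percentage changes in the tables above can be recomputed
from the raw per-run numbers. The following is only a minimal sketch: it
assumes the change is taken relative to the sgl (old DMA API) average, and
the names RESULTS and percent_change are illustrative, not part of this
patch or of fio.

```python
from statistics import mean

# IOPS (k) of the three fio runs per block size, copied from the tables above.
# Each entry is (sgl runs, no_sgl runs).
RESULTS = {
    "4K":   ([68.6, 68.0, 68.5], [68.7, 68.5, 68.1]),
    "8K":   ([67.0, 67.1, 67.0], [66.7, 66.7, 66.7]),
    "16K":  ([63.8, 63.4, 63.3], [63.5, 63.4, 63.1]),
    "32K":  ([59.3, 59.3, 59.5], [59.5, 59.6, 59.2]),
    "64K":  ([53.7, 53.4, 53.6], [53.5, 53.8, 53.6]),
    "128K": ([48.0, 46.4, 47.0], [46.6, 47.0, 47.1]),
    "256K": ([37.0, 41.0, 41.8], [37.5, 41.4, 42.6]),
    "512K": ([28.5, 28.2, 28.5], [28.7, 28.6, 28.8]),
}


def percent_change(baseline, candidate):
    """Change of the candidate mean relative to the baseline mean, in percent."""
    base = mean(baseline)
    return (mean(candidate) - base) / base * 100.0


for size, (sgl, no_sgl) in RESULTS.items():
    change = percent_change(sgl, no_sgl)
    print(f"{size:>4}: sgl {mean(sgl):6.2f}  no_sgl {mean(no_sgl):6.2f}  "
          f"change {change:+.4f}%")
```

A positive value means the no_sgl (new DMA API) runs averaged more IOPS than
the sgl runs, a negative value means fewer. With only three runs per block
size, differences of this magnitude are probably within run-to-run noise.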

Signed-off-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/nvme/host/pci.c | 354 ++++++++++++++++++++++++----------------
 1 file changed, 215 insertions(+), 139 deletions(-)

Message ID	dd0782c9fcb73487b60939a11f381411464f0572.1726138681.git.leon@kernel.org
State	RFC
Headers
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id E597C1BE238;
	Thu, 12 Sep 2024 11:17:21 +0000 (UTC)
From: Leon Romanovsky <leon@kernel.org>
To: Jens Axboe <axboe@kernel.dk>, Jason Gunthorpe <jgg@ziepe.ca>,
	Robin Murphy <robin.murphy@arm.com>, Joerg Roedel <joro@8bytes.org>,
	Will Deacon <will@kernel.org>, Keith Busch <kbusch@kernel.org>,
	Christoph Hellwig <hch@lst.de>, "Zeng, Oak" <oak.zeng@intel.com>,
	Chaitanya Kulkarni <kch@nvidia.com>
Cc: Leon Romanovsky <leonro@nvidia.com>, Sagi Grimberg <sagi@grimberg.me>,
	Bjorn Helgaas <bhelgaas@google.com>, Logan Gunthorpe <logang@deltatee.com>,
	Yishai Hadas <yishaih@nvidia.com>,
	Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	Marek Szyprowski <m.szyprowski@samsung.com>,
	Jérôme Glisse <jglisse@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-rdma@vger.kernel.org, iommu@lists.linux.dev,
	linux-nvme@lists.infradead.org, linux-pci@vger.kernel.org,
	kvm@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC v2 20/21] nvme-pci: use new dma API
Date: Thu, 12 Sep 2024 14:15:55 +0300
Message-ID: <dd0782c9fcb73487b60939a11f381411464f0572.1726138681.git.leon@kernel.org>
In-Reply-To: <cover.1726138681.git.leon@kernel.org>
References: <cover.1726138681.git.leon@kernel.org>
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Series	Provide a new two step DMA API mapping API

Patch