[v10,0/8] Userspace P2PDMA with O_DIRECT NVMe devices

Message ID 20220922163926.7077-1-logang@deltatee.com (mailing list archive)

Message

Logan Gunthorpe Sept. 22, 2022, 4:39 p.m. UTC
Hi,

This is the latest P2PDMA userspace patch set. This version includes
some cleanup from feedback of the last posting[1].

This patch set enables userspace P2PDMA by allowing userspace to mmap()
allocated chunks of the CMB. The resulting VMA can be passed only
to O_DIRECT IO on NVMe backed files or block devices. A flag is added
to GUP() in Patch 1, then Patches 2 through 6 wire this flag up based
on whether the block queue indicates P2PDMA support. Patch 7
creates the sysfs resource that can hand out the VMAs and Patch 8
adds brief documentation for the new interface.
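
For illustration, the userspace flow described above might look roughly
like the following. This is a minimal sketch, not code from the series:
the helper name p2p_read_sketch() is hypothetical, the sysfs path
(something like /sys/bus/pci/devices/<BDF>/p2pmem/allocate) is an
assumption based on the title of Patch 8, and the use of a shared
mapping at offset 0 is likewise an assumption about what the interface
expects.

```c
/* Sketch only: the paths, length and helper name are assumptions. */
#define _GNU_SOURCE		/* for O_DIRECT */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: map len bytes of P2P memory from alloc_path
 * (assumed to be the sysfs file created by Patch 7) and read len bytes
 * from offset 0 of dev_path directly into that mapping using O_DIRECT.
 * len should be a multiple of the device's logical block size.
 * Returns the number of bytes read, or -1 on error.
 */
ssize_t p2p_read_sketch(const char *alloc_path, const char *dev_path,
			size_t len)
{
	ssize_t ret = -1;
	void *buf;
	int alloc_fd, dev_fd;

	alloc_fd = open(alloc_path, O_RDWR);
	if (alloc_fd < 0) {
		perror("open allocate file");
		return -1;
	}

	/* The mmap() call is what requests a chunk of the CMB. */
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED,
		   alloc_fd, 0);
	if (buf == MAP_FAILED) {
		perror("mmap");
		goto out_close_alloc;
	}

	/* The mapping may only be used for O_DIRECT IO. */
	dev_fd = open(dev_path, O_RDONLY | O_DIRECT);
	if (dev_fd < 0) {
		perror("open block device");
		goto out_unmap;
	}

	/* Data moves from the NVMe device straight into the CMB. */
	ret = pread(dev_fd, buf, len, 0);
	if (ret < 0)
		perror("pread");

	close(dev_fd);
out_unmap:
	munmap(buf, len);
out_close_alloc:
	close(alloc_fd);
	return ret;
}
```

A caller would pass the allocate file of the device providing the CMB,
an NVMe block device (or a file on one) whose queue supports P2PDMA, and
a block-aligned length, for example
p2p_read_sketch(alloc_path, "/dev/nvme0n1", 4096). Per the description
above, handing the same buffer to buffered (non-O_DIRECT) IO is expected
to fail.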

Feedback welcome.

This series is based on v6.0-rc6. A git branch is available here:

  https://github.com/sbates130272/linux-p2pmem/  p2pdma_user_cmb_v10

Thanks,

Logan

[1] https://lkml.kernel.org/r/20220825152425.6296-1-logang@deltatee.com

--

Changes since v8:
  - Rebased onto v6.0-rc6
  - Reworked iov iter changes to reuse the code better and
    name them without the _flags() prefix (per Christoph)
  - Renamed a number of flags variables to gup_flags (per John)
  - Minor fixups to the last documentation patch (from Greg and John)

Changes since v7:
  - Rebased onto v6.0-rc2, including reworking the iov_iter patch
    due to changes there
  - Drop the char device mmap implementation in favour of a sysfs
    based interface. (per Christoph)

Changes since v6:
  - Rebase onto v5.19-rc1
  - Rework how the pages are stored in the VMA per Jason's suggestion

Changes since v5:
  - Rebased onto v5.18-rc1 which includes Christoph's cleanup to
    free_zone_device_page() (similar to Ralph's patch).
  - Fix bug with concurrent first calls to pci_p2pdma_vma_fault()
    that caused a double allocation and lost p2p memory. Noticed
    by Andrew Maier.
  - Collected a Reviewed-by tag from Chaitanya.
  - Numerous minor fixes to commit messages

--

Logan Gunthorpe (8):
  mm: introduce FOLL_PCI_P2PDMA to gate getting PCI P2PDMA pages
  iov_iter: introduce iov_iter_get_pages_[alloc_]flags()
  block: add check when merging zone device pages
  lib/scatterlist: add check when merging zone device pages
  block: set FOLL_PCI_P2PDMA in __bio_iov_iter_get_pages()
  block: set FOLL_PCI_P2PDMA in bio_map_user_iov()
  PCI/P2PDMA: Allow userspace VMA allocations through sysfs
  ABI: sysfs-bus-pci: add documentation for p2pmem allocate

 Documentation/ABI/testing/sysfs-bus-pci |  10 ++
 block/bio.c                             |  11 ++-
 block/blk-map.c                         |   7 +-
 drivers/pci/p2pdma.c                    | 124 ++++++++++++++++++++++++
 include/linux/mm.h                      |   1 +
 include/linux/mmzone.h                  |  24 +++++
 include/linux/uio.h                     |   6 ++
 lib/iov_iter.c                          |  32 ++++--
 lib/scatterlist.c                       |  25 +++--
 mm/gup.c                                |  22 ++++-
 10 files changed, 240 insertions(+), 22 deletions(-)


base-commit: 521a547ced6477c54b4b0cc206000406c221b4d6
--
2.30.2

Comments

Christoph Hellwig Sept. 23, 2022, 6:01 a.m. UTC | #1
Thanks, the entire series looks good to me now:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Given that this is spread all over, what tree do we want to take it
through?

Greg Kroah-Hartman Sept. 23, 2022, 8:16 a.m. UTC | #2
On Thu, Sep 22, 2022 at 10:39:18AM -0600, Logan Gunthorpe wrote:
> Hi,
> 
> This is the latest P2PDMA userspace patch set. This version includes
> some cleanup from feedback of the last posting[1].
> 
> This patch set enables userspace P2PDMA by allowing userspace to mmap()
> allocated chunks of the CMB. The resulting VMA can be passed only
> to O_DIRECT IO on NVMe backed files or block devices. A flag is added
> to GUP() in Patch 1, then Patches 2 through 6 wire this flag up based
> on whether the block queue indicates P2PDMA support. Patch 7
> creates the sysfs resource that can hand out the VMAs and Patch 8
> adds brief documentation for the new interface.
> 
> Feedback welcome.
> 
> This series is based on v6.0-rc6. A git branch is available here:
> 
>   https://github.com/sbates130272/linux-p2pmem/  p2pdma_user_cmb_v10

Looks good to me, thanks for sticking with it.

greg k-h

Logan Gunthorpe Sept. 23, 2022, 3:25 p.m. UTC | #3
On 2022-09-23 00:01, Christoph Hellwig wrote:
> Thanks, the entire series looks good to me now:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> Given that this is spread all over, what tree do we want to take it
> through?

Yes, while this is ostensibly a feature for NVMe it turns out we didn't
need to touch any NVMe code at all.

The most likely patch in my mind to have conflicts is the iov_iter patch
as there's been a lot of churn there in the last few cycles and there
are continued discussions.

There are 2 PCI patches, but Bjorn's aware of them and has acked them.
I'm also fairly confident this shouldn't conflict with anything in his tree.

Besides that, there is one mm/gup patch which is the next likely to
conflict; one scatterlist patch and three block layer patches which have
largely been stable when I've done rebases.

Logan