[v9,5/8] block: set FOLL_PCI_P2PDMA in __bio_iov_iter_get_pages()

Message ID 20220825152425.6296-6-logang@deltatee.com (mailing list archive)
State Superseded
Series Userspace P2PDMA with O_DIRECT NVMe devices

Commit Message

Logan Gunthorpe Aug. 25, 2022, 3:24 p.m. UTC
When a bio's queue supports PCI P2PDMA, set FOLL_PCI_P2PDMA for
iov_iter_get_pages_flags(). This allows PCI P2PDMA pages to be passed
from userspace and enables the O_DIRECT path in iomap based filesystems
and direct to block devices.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
---
 block/bio.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Christoph Hellwig Sept. 5, 2022, 2:36 p.m. UTC | #1
On Thu, Aug 25, 2022 at 09:24:22AM -0600, Logan Gunthorpe wrote:
> +	if (bio->bi_bdev && bio->bi_bdev->bd_disk &&
> +	    blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue))

bdev->bd_disk is never NULL.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
John Hubbard Sept. 6, 2022, 12:48 a.m. UTC | #2
On 8/25/22 08:24, Logan Gunthorpe wrote:
> When a bio's queue supports PCI P2PDMA, set FOLL_PCI_P2PDMA for
> iov_iter_get_pages_flags(). This allows PCI P2PDMA pages to be passed
> from userspace and enables the O_DIRECT path in iomap based filesystems
> and direct to block devices.

Oh great, more O_DIRECT code paths. :)

> 
> Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
> ---
>  block/bio.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/block/bio.c b/block/bio.c
> index 969607bc1f4d..ca09d79a0683 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1200,6 +1200,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  	unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
>  	struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
>  	struct page **pages = (struct page **)bv;
> +	unsigned int flags = 0;

It would be nice to name this gup_flags, instead.

>  	ssize_t size, left;
>  	unsigned len, i = 0;
>  	size_t offset, trim;
> @@ -1213,6 +1214,10 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  	BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
>  	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
>  
> +	if (bio->bi_bdev && bio->bi_bdev->bd_disk &&
> +	    blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue))
> +		flags |= FOLL_PCI_P2PDMA;
> +
>  	/*
>  	 * Each segment in the iov is required to be a block size multiple.
>  	 * However, we may not be able to get the entire segment if it spans
> @@ -1220,8 +1225,9 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
>  	 * result to ensure the bio's total size is correct. The remainder of
>  	 * the iov data will be picked up in the next bio iteration.
>  	 */
> -	size = iov_iter_get_pages2(iter, pages, UINT_MAX - bio->bi_iter.bi_size,
> -				  nr_pages, &offset);
> +	size = iov_iter_get_pages_flags(iter, pages,
> +					UINT_MAX - bio->bi_iter.bi_size,
> +					nr_pages, &offset, flags);
>  	if (unlikely(size <= 0))
>  		return size ? size : -EFAULT;
>  

Looks good. After applying Christoph's tweak, and optionally the gup_flags
rename as well, please feel free to add:

Reviewed-by: John Hubbard <jhubbard@nvidia.com>


thanks,
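
For illustration only, the two review suggestions above (dropping the bd_disk NULL check and renaming flags to gup_flags) can be modelled in a small userspace sketch. The struct definitions, the bio_gup_flags() helper name, and the FOLL_PCI_P2PDMA value below are stand-ins invented for this example; they are not the kernel's definitions and this is not the respun patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder value for this model only; the real flag comes from the series. */
#define FOLL_PCI_P2PDMA 0x1u

/* Minimal stand-ins for the kernel structures the hunk dereferences. */
struct request_queue { bool pci_p2pdma; };
struct gendisk { struct request_queue *queue; };
struct block_device { struct gendisk *bd_disk; };
struct bio { struct block_device *bi_bdev; };

/* Stand-in for the kernel's blk_queue_pci_p2pdma() queue-flag test. */
static bool blk_queue_pci_p2pdma(const struct request_queue *q)
{
	return q->pci_p2pdma;
}

/*
 * Hypothetical helper: pick the GUP flags for a bio the way the hunk does,
 * with both review comments applied.
 */
static unsigned int bio_gup_flags(const struct bio *bio)
{
	unsigned int gup_flags = 0;

	assert(bio != NULL);

	/* bi_bdev may be unset; bd_disk is not checked, per the review. */
	if (bio->bi_bdev &&
	    blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue))
		gup_flags |= FOLL_PCI_P2PDMA;

	return gup_flags;
}
```

In the patch itself the resulting value would then be passed as the last argument of iov_iter_get_pages_flags(); the sketch covers only the flag selection.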
Patch

diff --git a/block/bio.c b/block/bio.c
index 969607bc1f4d..ca09d79a0683 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1200,6 +1200,7 @@  static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
 	struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
 	struct page **pages = (struct page **)bv;
+	unsigned int flags = 0;
 	ssize_t size, left;
 	unsigned len, i = 0;
 	size_t offset, trim;
@@ -1213,6 +1214,10 @@  static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	BUILD_BUG_ON(PAGE_PTRS_PER_BVEC < 2);
 	pages += entries_left * (PAGE_PTRS_PER_BVEC - 1);
 
+	if (bio->bi_bdev && bio->bi_bdev->bd_disk &&
+	    blk_queue_pci_p2pdma(bio->bi_bdev->bd_disk->queue))
+		flags |= FOLL_PCI_P2PDMA;
+
 	/*
 	 * Each segment in the iov is required to be a block size multiple.
 	 * However, we may not be able to get the entire segment if it spans
@@ -1220,8 +1225,9 @@  static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	 * result to ensure the bio's total size is correct. The remainder of
 	 * the iov data will be picked up in the next bio iteration.
 	 */
-	size = iov_iter_get_pages2(iter, pages, UINT_MAX - bio->bi_iter.bi_size,
-				  nr_pages, &offset);
+	size = iov_iter_get_pages_flags(iter, pages,
+					UINT_MAX - bio->bi_iter.bi_size,
+					nr_pages, &offset, flags);
 	if (unlikely(size <= 0))
 		return size ? size : -EFAULT;