
[06/14] RDMA/umem: Split ib_umem_num_pages() into ib_umem_num_dma_blocks()

Message ID 6-v1-00f59ce24f1f+19f50-umem_1_jgg@nvidia.com (mailing list archive)
State Superseded
Series RDMA: Improve use of umem in DMA drivers

Commit Message

Jason Gunthorpe Sept. 2, 2020, 12:43 a.m. UTC
ib_umem_num_pages() should only be used by callers that work with the SGL in
CPU pages directly.

Drivers building DMA lists should use the new ib_umem_num_dma_blocks(), which
returns the number of blocks rdma_umem_for_each_dma_block() will iterate over.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/cxgb4/mem.c            |  2 +-
 drivers/infiniband/hw/mlx5/mem.c             |  5 +++--
 drivers/infiniband/hw/mthca/mthca_provider.c |  2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c |  2 +-
 include/rdma/ib_umem.h                       | 14 +++++++++++---
 5 files changed, 17 insertions(+), 8 deletions(-)
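
To make the intended conversion pattern concrete, here is a rough driver-side
sketch (not part of this patch; the helper name, the pgsz_bitmap/iova
parameters and the error handling are only illustrative). The page list is
sized with ib_umem_num_dma_blocks() and filled with
rdma_umem_for_each_dma_block():

/*
 * Illustrative only, not part of the patch. Assumes <rdma/ib_umem.h> and
 * <linux/slab.h>; pgsz_bitmap is the device's supported block sizes and
 * iova is the virtual address the MR will be mapped at.
 */
static int build_page_list(struct ib_umem *umem, unsigned long pgsz_bitmap,
			   u64 iova, u64 **pas_out, size_t *nblocks_out)
{
	struct ib_block_iter biter;
	unsigned long pgsz;
	size_t i = 0;
	u64 *pas;

	/* Largest DMA block size the device can use for this umem */
	pgsz = ib_umem_find_best_pgsz(umem, pgsz_bitmap, iova);
	if (!pgsz)
		return -EINVAL;

	/* The iterator below produces exactly this many blocks */
	*nblocks_out = ib_umem_num_dma_blocks(umem, pgsz);
	pas = kcalloc(*nblocks_out, sizeof(*pas), GFP_KERNEL);
	if (!pas)
		return -ENOMEM;

	/* One DMA address per pgsz-aligned block */
	rdma_umem_for_each_dma_block(umem, &biter, pgsz)
		pas[i++] = rdma_block_iter_dma_address(&biter);

	*pas_out = pas;
	return 0;
}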

Comments

Shiraz Saleem Sept. 3, 2020, 2:12 p.m. UTC | #1
> Subject: [PATCH 06/14] RDMA/umem: Split ib_umem_num_pages() into
> ib_umem_num_dma_blocks()
> 
> ib_umem_num_pages() should only be used by callers that work with the SGL
> in CPU pages directly.
> 
> Drivers building DMA lists should use the new ib_umem_num_dma_blocks(),
> which returns the number of blocks rdma_umem_for_each_dma_block() will
> iterate over.
> 
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/hw/cxgb4/mem.c            |  2 +-
>  drivers/infiniband/hw/mlx5/mem.c             |  5 +++--
>  drivers/infiniband/hw/mthca/mthca_provider.c |  2 +-
>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c |  2 +-
>  include/rdma/ib_umem.h                       | 14 +++++++++++---
>  5 files changed, 17 insertions(+), 8 deletions(-)
> 

Perhaps the one in the bnxt_re driver too?

https://elixir.bootlin.com/linux/v5.9-rc3/source/drivers/infiniband/hw/bnxt_re/qplib_res.c#L94
Jason Gunthorpe Sept. 3, 2020, 2:14 p.m. UTC | #2
On Thu, Sep 03, 2020 at 02:12:10PM +0000, Saleem, Shiraz wrote:
> > Subject: [PATCH 06/14] RDMA/umem: Split ib_umem_num_pages() into
> > ib_umem_num_dma_blocks()
> > 
> > ib_umem_num_pages() should only be used by callers that work with the
> > SGL in CPU pages directly.
> > 
> > Drivers building DMA lists should use the new ib_umem_num_dma_blocks(),
> > which returns the number of blocks rdma_umem_for_each_dma_block() will
> > iterate over.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >  drivers/infiniband/hw/cxgb4/mem.c            |  2 +-
> >  drivers/infiniband/hw/mlx5/mem.c             |  5 +++--
> >  drivers/infiniband/hw/mthca/mthca_provider.c |  2 +-
> >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c |  2 +-
> >  include/rdma/ib_umem.h                       | 14 +++++++++++---
> >  5 files changed, 17 insertions(+), 8 deletions(-)
> > 
> 
> Perhaps the one in the bnxt_re driver too?
> 
> https://elixir.bootlin.com/linux/v5.9-rc3/source/drivers/infiniband/hw/bnxt_re/qplib_res.c#L94

Yes, that needs fixing, but it is not so easy. This patch is just for
the easy drivers.

Planning to do it separately, as it is the 'easiest' of the remaining
'hard' conversions.

Jason
Jason Gunthorpe Sept. 4, 2020, 10:32 p.m. UTC | #3
On Tue, Sep 01, 2020 at 09:43:34PM -0300, Jason Gunthorpe wrote:
> diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
> index b880512ba95f16..ba3b9be0d8c56a 100644
> --- a/include/rdma/ib_umem.h
> +++ b/include/rdma/ib_umem.h
> @@ -33,11 +33,17 @@ static inline int ib_umem_offset(struct ib_umem *umem)
>  	return umem->address & ~PAGE_MASK;
>  }
>  
> +static inline size_t ib_umem_num_dma_blocks(struct ib_umem *umem,
> +					    unsigned long pgsz)
> +{
> +	return (ALIGN(umem->address + umem->length, pgsz) -
> +		ALIGN_DOWN(umem->address, pgsz)) /
> +	       pgsz;
> +}

Like the other places, this is wrong as well. The IOVA needs to be
used here, or the result can be +-1 block off from the number of
blocks rdma_umem_for_each_dma_block() actually returns.

However, it does work if pgsz is PAGE_SIZE, or for the very common
cases where umem->address == IOVA... 

Which, I suppose, is why nobody noticed this until now, as I found
several drivers open coding the above.

Jason
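
For reference, a sketch of the IOVA-based version being described here. The
umem->iova member is an assumption against this v1 (only umem->address exists
yet); it stands for the device virtual address the MR will be mapped at:

static inline size_t ib_umem_num_dma_blocks(struct ib_umem *umem,
					    unsigned long pgsz)
{
	/*
	 * umem->iova is assumed here; this v1 only has umem->address.
	 * Compute the block boundaries against the IOVA the device will
	 * use rather than the CPU virtual address; when the two are
	 * aligned differently relative to pgsz, the address-based count
	 * can be one block above or below the IOVA-based one.
	 */
	return (size_t)((ALIGN(umem->iova + umem->length, pgsz) -
			 ALIGN_DOWN(umem->iova, pgsz)) / pgsz);
}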

Patch

diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
index 82afdb1987eff6..22c8f5745047db 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -548,7 +548,7 @@  struct ib_mr *c4iw_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 
 	shift = PAGE_SHIFT;
 
-	n = ib_umem_num_pages(mhp->umem);
+	n = ib_umem_num_dma_blocks(mhp->umem, 1 << shift);
 	err = alloc_pbl(mhp, n);
 	if (err)
 		goto err_umem_release;
diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
index c19ec9fd8a63c3..5641dbda72ff66 100644
--- a/drivers/infiniband/hw/mlx5/mem.c
+++ b/drivers/infiniband/hw/mlx5/mem.c
@@ -169,8 +169,9 @@  void mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem,
 			  int page_shift, __be64 *pas, int access_flags)
 {
 	return __mlx5_ib_populate_pas(dev, umem, page_shift, 0,
-				      ib_umem_num_pages(umem), pas,
-				      access_flags);
+				      ib_umem_num_dma_blocks(umem,
+							     1 << page_shift),
+				      pas, access_flags);
 }
 int mlx5_ib_get_buf_offset(u64 addr, int page_shift, u32 *offset)
 {
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 317e67ad915fe8..b785fb9a2634ff 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -877,7 +877,7 @@  static struct ib_mr *mthca_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 		goto err;
 	}
 
-	n = ib_umem_num_pages(mr->umem);
+	n = ib_umem_num_dma_blocks(mr->umem, PAGE_SIZE);
 
 	mr->mtt = mthca_alloc_mtt(dev, n);
 	if (IS_ERR(mr->mtt)) {
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c
index 91f0957e611543..e80848bfb3bdbf 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c
@@ -133,7 +133,7 @@  struct ib_mr *pvrdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 		return ERR_CAST(umem);
 	}
 
-	npages = ib_umem_num_pages(umem);
+	npages = ib_umem_num_dma_blocks(umem, PAGE_SIZE);
 	if (npages < 0 || npages > PVRDMA_PAGE_DIR_MAX_PAGES) {
 		dev_warn(&dev->pdev->dev, "overflow %d pages in mem region\n",
 			 npages);
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index b880512ba95f16..ba3b9be0d8c56a 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -33,11 +33,17 @@  static inline int ib_umem_offset(struct ib_umem *umem)
 	return umem->address & ~PAGE_MASK;
 }
 
+static inline size_t ib_umem_num_dma_blocks(struct ib_umem *umem,
+					    unsigned long pgsz)
+{
+	return (ALIGN(umem->address + umem->length, pgsz) -
+		ALIGN_DOWN(umem->address, pgsz)) /
+	       pgsz;
+}
+
 static inline size_t ib_umem_num_pages(struct ib_umem *umem)
 {
-	return (ALIGN(umem->address + umem->length, PAGE_SIZE) -
-		ALIGN_DOWN(umem->address, PAGE_SIZE)) >>
-	       PAGE_SHIFT;
+	return ib_umem_num_dma_blocks(umem, PAGE_SIZE);
 }
 
 static inline void __rdma_umem_block_iter_start(struct ib_block_iter *biter,
@@ -55,6 +61,8 @@  static inline void __rdma_umem_block_iter_start(struct ib_block_iter *biter,
  * pgsz must be <= PAGE_SIZE or computed by ib_umem_find_best_pgsz(). The
  * returned DMA blocks will be aligned to pgsz and span the range:
  * ALIGN_DOWN(umem->address, pgsz) to ALIGN(umem->address + umem->length, pgsz)
+ *
+ * Performs exactly ib_umem_num_dma_blocks() iterations.
  */
 #define rdma_umem_for_each_dma_block(umem, biter, pgsz)                        \
 	for (__rdma_umem_block_iter_start(biter, umem, pgsz);                  \