Message ID | 6-v1-00f59ce24f1f+19f50-umem_1_jgg@nvidia.com (mailing list archive)
---|---
State | Superseded
Series | RDMA: Improve use of umem in DMA drivers
> Subject: [PATCH 06/14] RDMA/umem: Split ib_umem_num_pages() into
> ib_umem_num_dma_blocks()
>
> ib_umem_num_pages() should only be used by things working with the SGL
> in CPU pages directly.
>
> Drivers building DMA lists should use the new ib_umem_num_dma_blocks(),
> which returns the number of blocks rdma_umem_for_each_dma_block() will
> return.
>
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/hw/cxgb4/mem.c            | 2 +-
>  drivers/infiniband/hw/mlx5/mem.c             | 5 +++--
>  drivers/infiniband/hw/mthca/mthca_provider.c | 2 +-
>  drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c | 2 +-
>  include/rdma/ib_umem.h                       | 14 +++++++++++---
>  5 files changed, 17 insertions(+), 8 deletions(-)
>

Perhaps the one in bnxt_re too?

https://elixir.bootlin.com/linux/v5.9-rc3/source/drivers/infiniband/hw/bnxt_re/qplib_res.c#L94
On Thu, Sep 03, 2020 at 02:12:10PM +0000, Saleem, Shiraz wrote:
> > Subject: [PATCH 06/14] RDMA/umem: Split ib_umem_num_pages() into
> > ib_umem_num_dma_blocks()
> >
> > ib_umem_num_pages() should only be used by things working with the SGL
> > in CPU pages directly.
> >
> > Drivers building DMA lists should use the new ib_umem_num_dma_blocks(),
> > which returns the number of blocks rdma_umem_for_each_dma_block() will
> > return.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> >
> >  drivers/infiniband/hw/cxgb4/mem.c            | 2 +-
> >  drivers/infiniband/hw/mlx5/mem.c             | 5 +++--
> >  drivers/infiniband/hw/mthca/mthca_provider.c | 2 +-
> >  drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c | 2 +-
> >  include/rdma/ib_umem.h                       | 14 +++++++++++---
> >  5 files changed, 17 insertions(+), 8 deletions(-)
> >
>
> Perhaps the one in bnxt_re too?
>
> https://elixir.bootlin.com/linux/v5.9-rc3/source/drivers/infiniband/hw/bnxt_re/qplib_res.c#L94

Yes, that needs fixing, but it is not so easy; this patch is just for the
easy drivers. Planning to do it separately, as it is the 'easiest' of the
remaining 'hard' conversions.

Jason
On Tue, Sep 01, 2020 at 09:43:34PM -0300, Jason Gunthorpe wrote:
> diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
> index b880512ba95f16..ba3b9be0d8c56a 100644
> --- a/include/rdma/ib_umem.h
> +++ b/include/rdma/ib_umem.h
> @@ -33,11 +33,17 @@ static inline int ib_umem_offset(struct ib_umem *umem)
>  	return umem->address & ~PAGE_MASK;
>  }
>
> +static inline size_t ib_umem_num_dma_blocks(struct ib_umem *umem,
> +					    unsigned long pgsz)
> +{
> +	return (ALIGN(umem->address + umem->length, pgsz) -
> +		ALIGN_DOWN(umem->address, pgsz)) /
> +	       pgsz;
> +}

Like the other places, this is wrong as well. The IOVA needs to be used
here, or the result can be +/-1 page away from what
rdma_umem_for_each_dma_block() returns.

However, it does work if pgsz is PAGE_SIZE, or for the very common case
where umem->address == IOVA. Which, I suppose, is why nobody noticed this
until now; I found several drivers open coding the above.

Jason
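A minimal userspace sketch of the mismatch described above, with made-up
numbers and local stand-ins for the kernel ALIGN()/ALIGN_DOWN() macros: when
umem->address and the IOVA disagree in the bits between PAGE_SHIFT and the
block shift, the address-based formula counts one more block than an
IOVA-based count would.

#include <stdio.h>

#define ALIGN(x, a)      (((x) + (a) - 1) & ~((a) - 1))
#define ALIGN_DOWN(x, a) ((x) & ~((a) - 1))

int main(void)
{
	unsigned long pgsz = 0x200000;    /* 2 MiB DMA block size */
	unsigned long length = 0x200000;  /* 2 MiB of user memory */
	unsigned long address = 0x100000; /* umem->address: 1 MiB into a block */
	unsigned long iova = 0x200000;    /* device IOVA: block aligned */

	/* What the proposed ib_umem_num_dma_blocks() computes. */
	unsigned long from_address =
		(ALIGN(address + length, pgsz) - ALIGN_DOWN(address, pgsz)) / pgsz;
	/* What an IOVA-based count would compute. */
	unsigned long from_iova =
		(ALIGN(iova + length, pgsz) - ALIGN_DOWN(iova, pgsz)) / pgsz;

	printf("blocks from umem->address: %lu\n", from_address); /* prints 2 */
	printf("blocks from IOVA:          %lu\n", from_iova);    /* prints 1 */
	return 0;
}

With PAGE_SIZE blocks, or when umem->address equals the IOVA, the two
computations agree, which matches the observation above about why this went
unnoticed.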
diff --git a/drivers/infiniband/hw/cxgb4/mem.c b/drivers/infiniband/hw/cxgb4/mem.c
index 82afdb1987eff6..22c8f5745047db 100644
--- a/drivers/infiniband/hw/cxgb4/mem.c
+++ b/drivers/infiniband/hw/cxgb4/mem.c
@@ -548,7 +548,7 @@ struct ib_mr *c4iw_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 
 	shift = PAGE_SHIFT;
 
-	n = ib_umem_num_pages(mhp->umem);
+	n = ib_umem_num_dma_blocks(mhp->umem, 1 << shift);
 	err = alloc_pbl(mhp, n);
 	if (err)
 		goto err_umem_release;
diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
index c19ec9fd8a63c3..5641dbda72ff66 100644
--- a/drivers/infiniband/hw/mlx5/mem.c
+++ b/drivers/infiniband/hw/mlx5/mem.c
@@ -169,8 +169,9 @@ void mlx5_ib_populate_pas(struct mlx5_ib_dev *dev, struct ib_umem *umem,
 			  int page_shift, __be64 *pas, int access_flags)
 {
 	return __mlx5_ib_populate_pas(dev, umem, page_shift, 0,
-				      ib_umem_num_pages(umem), pas,
-				      access_flags);
+				      ib_umem_num_dma_blocks(umem,
+							     1 << page_shift),
+				      pas, access_flags);
 }
 
 int mlx5_ib_get_buf_offset(u64 addr, int page_shift, u32 *offset)
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 317e67ad915fe8..b785fb9a2634ff 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -877,7 +877,7 @@ static struct ib_mr *mthca_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 		goto err;
 	}
 
-	n = ib_umem_num_pages(mr->umem);
+	n = ib_umem_num_dma_blocks(mr->umem, PAGE_SIZE);
 
 	mr->mtt = mthca_alloc_mtt(dev, n);
 	if (IS_ERR(mr->mtt)) {
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c
index 91f0957e611543..e80848bfb3bdbf 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c
@@ -133,7 +133,7 @@ struct ib_mr *pvrdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 		return ERR_CAST(umem);
 	}
 
-	npages = ib_umem_num_pages(umem);
+	npages = ib_umem_num_dma_blocks(umem, PAGE_SIZE);
 	if (npages < 0 || npages > PVRDMA_PAGE_DIR_MAX_PAGES) {
 		dev_warn(&dev->pdev->dev, "overflow %d pages in mem region\n",
 			 npages);
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index b880512ba95f16..ba3b9be0d8c56a 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -33,11 +33,17 @@ static inline int ib_umem_offset(struct ib_umem *umem)
 	return umem->address & ~PAGE_MASK;
 }
 
+static inline size_t ib_umem_num_dma_blocks(struct ib_umem *umem,
+					    unsigned long pgsz)
+{
+	return (ALIGN(umem->address + umem->length, pgsz) -
+		ALIGN_DOWN(umem->address, pgsz)) /
+	       pgsz;
+}
+
 static inline size_t ib_umem_num_pages(struct ib_umem *umem)
 {
-	return (ALIGN(umem->address + umem->length, PAGE_SIZE) -
-		ALIGN_DOWN(umem->address, PAGE_SIZE)) >>
-	       PAGE_SHIFT;
+	return ib_umem_num_dma_blocks(umem, PAGE_SIZE);
 }
 
 static inline void __rdma_umem_block_iter_start(struct ib_block_iter *biter,
@@ -55,6 +61,8 @@ static inline void __rdma_umem_block_iter_start(struct ib_block_iter *biter,
  * pgsz must be <= PAGE_SIZE or computed by ib_umem_find_best_pgsz(). The
  * returned DMA blocks will be aligned to pgsz and span the range:
  * ALIGN_DOWN(umem->address, pgsz) to ALIGN(umem->address + umem->length, pgsz)
+ *
+ * Performs exactly ib_umem_num_dma_blocks() iterations.
  */
 #define rdma_umem_for_each_dma_block(umem, biter, pgsz) \
 	for (__rdma_umem_block_iter_start(biter, umem, pgsz); \
ib_umem_num_pages() should only be used by things working with the SGL in
CPU pages directly.

Drivers building DMA lists should use the new ib_umem_num_dma_blocks(),
which returns the number of blocks rdma_umem_for_each_dma_block() will
return.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/hw/cxgb4/mem.c            | 2 +-
 drivers/infiniband/hw/mlx5/mem.c             | 5 +++--
 drivers/infiniband/hw/mthca/mthca_provider.c | 2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c | 2 +-
 include/rdma/ib_umem.h                       | 14 +++++++++++---
 5 files changed, 17 insertions(+), 8 deletions(-)
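A rough sketch of the intended pairing, not taken from this series (the
function and variable names below are made up): a converted driver sizes its
device page list with ib_umem_num_dma_blocks() and fills it with
rdma_umem_for_each_dma_block(), passing the same pgsz to both.

#include <linux/slab.h>
#include <rdma/ib_umem.h>
#include <rdma/ib_verbs.h>

/* Hypothetical driver helper: build a DMA page/block address list. */
static int example_build_page_list(struct ib_umem *umem, unsigned long pgsz,
				   u64 **pas_out, size_t *npas_out)
{
	struct ib_block_iter biter;
	size_t npas = ib_umem_num_dma_blocks(umem, pgsz);
	size_t i = 0;
	u64 *pas;

	pas = kcalloc(npas, sizeof(*pas), GFP_KERNEL);
	if (!pas)
		return -ENOMEM;

	/* Each iteration yields one pgsz-aligned DMA block address. */
	rdma_umem_for_each_dma_block(umem, &biter, pgsz)
		pas[i++] = rdma_block_iter_dma_address(&biter);

	*pas_out = pas;
	*npas_out = npas;
	return 0;
}

Using the same pgsz for the count and the iteration is what keeps the
allocation exactly as long as the number of blocks the iterator produces,
per the kernel-doc note added to rdma_umem_for_each_dma_block() in this
patch.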