[rdma-next,v1,0/6] Add APIs to get contiguous memory blocks aligned to a HW supported page size

Message ID: 20190219145745.13476-1-shiraz.saleem@intel.com

Shiraz Saleem Feb. 19, 2019, 2:57 p.m. UTC
From: "Saleem, Shiraz" <shiraz.saleem@intel.com>

This patch set aims to allow drivers that support multiple
page sizes to leverage the core umem APIs to obtain suitable HW DMA
addresses for the MR, aligned to a supported page size. The APIs
accommodate HW that supports either a single page size or mixed page
sizes in an MR. The motivation for this work comes from the discussion in [1].

The first patch modifies the current memory registration API ib_umem_get()
to combine contiguous regions into SGEs and add them to the scatter table.
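
As a rough, hypothetical illustration of the coalescing idea (the
function below is illustrative only, not the patch's code), runs of
physically contiguous pages are folded into a single scatterlist entry:

#include <linux/mm.h>
#include <linux/scatterlist.h>

/* Illustrative sketch: merge physically contiguous pages into one SGE. */
static struct scatterlist *umem_sg_coalesce(struct scatterlist *sg,
					    struct page **pages,
					    unsigned long npages)
{
	struct page *first = pages[0];
	unsigned long i, len = PAGE_SIZE;

	for (i = 1; i < npages; i++) {
		if (page_to_pfn(pages[i]) == page_to_pfn(pages[i - 1]) + 1) {
			len += PAGE_SIZE;	/* contiguous: grow the run */
			continue;
		}
		sg_set_page(sg, first, len, 0);	/* emit the finished run */
		sg = sg_next(sg);
		first = pages[i];
		len = PAGE_SIZE;
	}
	sg_set_page(sg, first, len, 0);		/* emit the final run */
	return sg;
}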

The second patch introduces a new core API that allows drivers to find the
best supported page size to use for an MR, from a bitmap of HW-supported
page sizes.
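
As a hedged sketch of the selection logic (the helper name and signature
here are assumptions; the series' actual entry point is
ib_umem_find_single_pg_size), the common alignment of all block
addresses and lengths bounds which bits of the HW bitmap remain usable:

#include <linux/bitops.h>
#include <linux/bits.h>

/*
 * 'pgsz_bitmap' has one bit set per HW-supported page size; 'addr_mask'
 * is the OR of every block address and length in the MR, so its lowest
 * set bit gives the alignment shared by all blocks.
 */
static unsigned long best_pg_size(unsigned long pgsz_bitmap, u64 addr_mask)
{
	/* Drop page sizes larger than the common alignment. */
	pgsz_bitmap &= GENMASK(__ffs64(addr_mask), 0);
	if (!pgsz_bitmap)
		return 0;
	/* Largest remaining supported page size wins. */
	return 1UL << __fls(pgsz_bitmap);
}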

The third patch introduces new core APIs that iterate through the SG list
and return contiguous memory blocks aligned to a HW-supported page size.
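
Going by the iterator names used in this cover letter
(ib_umem_start_phys_iter(), ib_umem_next_phys_iter() and
sg_phys_iter->phyaddr), a driver-side consumer might look roughly like
the sketch below; the exact signatures are assumptions, so see the
patches for the real API:

/* Hypothetical driver loop: collect pg_sz-aligned block addresses. */
static void fill_pbl(struct ib_umem *umem, u64 *pbl, unsigned long pg_sz)
{
	struct sg_phys_iter iter;
	int i = 0;

	ib_umem_start_phys_iter(umem, &iter, pg_sz);
	while (ib_umem_next_phys_iter(umem, &iter))
		pbl[i++] = iter.phyaddr;	/* aligned block address */
}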

The fourth and fifth patches remove the i40iw and bnxt_re drivers'
dependency on the hugetlb flag. The new core APIs are called in these
drivers to get huge-page-aligned addresses if the MR is backed by huge pages.

The sixth patch removes the hugetlb flag from IB core.

Please note that the mixed page-size portion of the algorithm and the
bnxt_re update in patch #5 have not been tested on hardware.

[1] https://patchwork.kernel.org/patch/10499753/

RFC-->v0:
---------
* Add to scatter table by iterating a limited sized page list.
* Updated driver call sites to use the for_each_sg_page iterator
  variant where applicable.
* Tweaked algorithm in ib_umem_find_single_pg_size and ib_umem_next_phys_iter
  to ignore alignment of the start of first SGE and end of the last SGE.
* Simplified ib_umem_find_single_pg_size on offset alignments checks for
  user-space virtual and physical buffer.
* Updated ib_umem_start_phys_iter to do some pre-computation
  for the non-mixed page support case.
* Updated bnxt_re driver to use the new core APIs and remove its
  dependency on the hugetlb flag.
* Fixed a bug in computation of sg_phys_iter->phyaddr in ib_umem_next_phys_iter.
* Drop hugetlb flag usage from RDMA subsystem.
* Rebased on top of for-next.

v0-->v1:
--------
* Remove the patches that update drivers to use the for_each_sg_page
  variant to iterate over the SGEs. These are sent as a separate series
  using the for_each_sg_dma_page variant.
* Tweak ib_umem_add_sg_table API definition based on maintainer feedback.
* Cache number of scatterlist entries in umem.
* Update function headers for ib_umem_find_single_pg_size and ib_umem_next_phys_iter.
* Add sanity check on supported_pgsz in ib_umem_find_single_pg_size.

Shiraz Saleem (6):
  RDMA/umem: Combine contiguous PAGE_SIZE regions in SGEs
  RDMA/umem: Add API to find best driver supported page size in an MR
  RDMA/umem: Add API to return aligned memory blocks from SGL
  RDMA/i40iw: Use umem API to retrieve aligned DMA address
  RDMA/bnxt_re: Use umem APIs to retrieve aligned DMA address
  RDMA/umem: Remove hugetlb flag

 drivers/infiniband/core/umem.c            | 281 +++++++++++++++++++++++++++---
 drivers/infiniband/core/umem_odp.c        |   3 -
 drivers/infiniband/hw/bnxt_re/ib_verbs.c  |  28 ++-
 drivers/infiniband/hw/i40iw/i40iw_user.h  |   5 +
 drivers/infiniband/hw/i40iw/i40iw_verbs.c |  49 +-----
 drivers/infiniband/hw/i40iw/i40iw_verbs.h |   3 +-
 include/rdma/ib_umem.h                    |  50 +++++-
 7 files changed, 319 insertions(+), 100 deletions(-)