From patchwork Tue Feb 19 14:57:42 2019
X-Patchwork-Submitter: Shiraz Saleem
X-Patchwork-Id: 10819997
X-Patchwork-Delegate: jgg@ziepe.ca
From: Shiraz Saleem
To: dledford@redhat.com, jgg@ziepe.ca
Cc: linux-rdma@vger.kernel.org, Shiraz Saleem
Subject: [PATCH rdma-next v1 3/6] RDMA/umem: Add API to return aligned memory blocks from SGL
Date: Tue, 19 Feb 2019 08:57:42 -0600
Message-Id: <20190219145745.13476-4-shiraz.saleem@intel.com>
X-Mailer: git-send-email 2.8.3
In-Reply-To: <20190219145745.13476-1-shiraz.saleem@intel.com>
References: <20190219145745.13476-1-shiraz.saleem@intel.com>

This helper iterates over the SG list and returns contiguous memory
blocks aligned to a HW supported page size. The implementation is
intended to work for HW that supports single page sizes or mixed page
sizes in an MR. Drivers can use this helper to retrieve the DMA
addresses aligned to their best supported page size.

Suggested-by: Jason Gunthorpe
Reviewed-by: Michael J. Ruhl
Signed-off-by: Shiraz Saleem
---
 drivers/infiniband/core/umem.c | 90 ++++++++++++++++++++++++++++++++++++++++++
 include/rdma/ib_umem.h         | 23 +++++++++++
 2 files changed, 113 insertions(+)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index d3b572e..8c3ec1b 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -214,6 +214,96 @@ unsigned long ib_umem_find_single_pg_size(struct ib_umem *umem,
 }
 EXPORT_SYMBOL(ib_umem_find_single_pg_size);
 
+static unsigned int ib_umem_find_mixed_pg_bit(struct scatterlist *sgl_head,
+					      struct sg_phys_iter *sg_phys_iter)
+{
+	unsigned long dma_addr_start, dma_addr_end;
+
+	dma_addr_start = sg_dma_address(sg_phys_iter->sg);
+	dma_addr_end = sg_dma_address(sg_phys_iter->sg) +
+		       sg_dma_len(sg_phys_iter->sg);
+
+	if (sg_phys_iter->sg == sgl_head)
+		return ib_umem_find_pg_bit(dma_addr_end,
+					   sg_phys_iter->supported_pgsz);
+	else if (sg_is_last(sg_phys_iter->sg))
+		return ib_umem_find_pg_bit(dma_addr_start,
+					   sg_phys_iter->supported_pgsz);
+	else
+		return ib_umem_find_pg_bit(sg_phys_iter->phyaddr,
+					   sg_phys_iter->supported_pgsz);
+}
+void ib_umem_start_phys_iter(struct ib_umem *umem,
+			     struct sg_phys_iter *sg_phys_iter,
+			     unsigned long supported_pgsz)
+{
+	memset(sg_phys_iter, 0, sizeof(struct sg_phys_iter));
+	sg_phys_iter->sg = umem->sg_head.sgl;
+	sg_phys_iter->supported_pgsz = supported_pgsz;
+
+	/* Single page support in MR */
+	if (hweight_long(supported_pgsz) == 1)
+		sg_phys_iter->pg_bit = fls64(supported_pgsz) - 1;
+	else
+		sg_phys_iter->mixed_pg_support = true;
+}
+EXPORT_SYMBOL(ib_umem_start_phys_iter);
+
+/**
+ * ib_umem_next_phys_iter - Iterate SG entries in aligned memory blocks
+ * @umem: umem struct
+ * @sg_phys_iter: SG phy iterator
+ * @supported_pgsz: bitmask of HW supported page sizes
+ *
+ * This helper iterates over the SG list and returns memory
+ * blocks aligned to a HW supported page size.
+ *
+ * Each true result returns a contiguous aligned memory block
+ * such that:
+ * - pg_bit indicates the alignment of this block such that
+ *   phyaddr & ((1 << pg_bit) - 1) == 0
+ * - All blocks except the starting block have a zero offset.
+ *   For the starting block, offset indicates the first valid byte
+ *   in the MR; HW should not permit access to bytes earlier than offset.
+ * - For all blocks except the last, len + offset equals 1 << pg_bit.
+ *
+ * False is returned when iteration is completed and all blocks have been seen.
+ *
+ */
+bool ib_umem_next_phys_iter(struct ib_umem *umem,
+			    struct sg_phys_iter *sg_phys_iter)
+{
+	unsigned long pg_mask;
+
+	if (!sg_phys_iter->supported_pgsz || !sg_phys_iter->sg)
+		return false;
+
+	if (sg_phys_iter->remaining) {
+		sg_phys_iter->phyaddr += (sg_phys_iter->len + sg_phys_iter->offset);
+	} else {
+		sg_phys_iter->phyaddr = sg_dma_address(sg_phys_iter->sg);
+		sg_phys_iter->remaining = sg_dma_len(sg_phys_iter->sg);
+	}
+
+	/* Mixed page support in MR */
+	if (sg_phys_iter->mixed_pg_support)
+		sg_phys_iter->pg_bit = ib_umem_find_mixed_pg_bit(umem->sg_head.sgl,
+								 sg_phys_iter);
+
+	pg_mask = ~(BIT_ULL(sg_phys_iter->pg_bit) - 1);
+
+	sg_phys_iter->offset = sg_phys_iter->phyaddr & ~pg_mask;
+	sg_phys_iter->phyaddr = sg_phys_iter->phyaddr & pg_mask;
+	sg_phys_iter->len = min_t(unsigned long, sg_phys_iter->remaining,
+				  BIT_ULL(sg_phys_iter->pg_bit) - sg_phys_iter->offset);
+	sg_phys_iter->remaining -= sg_phys_iter->len;
+	if (!sg_phys_iter->remaining)
+		sg_phys_iter->sg = sg_next(sg_phys_iter->sg);
+
+	return true;
+}
+EXPORT_SYMBOL(ib_umem_next_phys_iter);
+
 /**
  * ib_umem_get - Pin and DMA map userspace memory.
  *
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 4e186a3..49bd444 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -57,6 +57,17 @@ struct ib_umem {
 	int npages;
 };
 
+struct sg_phys_iter {
+	struct scatterlist *sg;
+	unsigned long phyaddr;
+	unsigned long len;
+	unsigned long offset;
+	unsigned long supported_pgsz;
+	unsigned long remaining;
+	unsigned int pg_bit;
+	u8 mixed_pg_support;
+};
+
 /* Returns the offset of the umem start relative to the first page. */
 static inline int ib_umem_offset(struct ib_umem *umem)
 {
@@ -91,6 +102,11 @@ int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
 unsigned long ib_umem_find_single_pg_size(struct ib_umem *umem,
 					  unsigned long supported_pgsz,
 					  unsigned long uvirt_addr);
+void ib_umem_start_phys_iter(struct ib_umem *umem,
+			     struct sg_phys_iter *sg_phys_iter,
+			     unsigned long supported_pgsz);
+bool ib_umem_next_phys_iter(struct ib_umem *umem,
+			    struct sg_phys_iter *sg_phys_iter);
 
 #else /* CONFIG_INFINIBAND_USER_MEM */
 
@@ -113,6 +129,13 @@ static inline int ib_umem_find_single_pg_size(struct ib_umem *umem,
 					      unsigned long uvirt_addr) {
 	return -EINVAL;
 }
+static inline void ib_umem_start_phys_iter(struct ib_umem *umem,
+					   struct sg_phys_iter *sg_phys_iter,
+					   unsigned long supported_pgsz) { }
+static inline bool ib_umem_next_phys_iter(struct ib_umem *umem,
+					  struct sg_phys_iter *sg_phys_iter) {
+	return false;
+}
 
 #endif /* CONFIG_INFINIBAND_USER_MEM */
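
As an aside on intended usage: the sketch below shows how a driver might
consume this iterator to collect HW-aligned block DMA addresses into a page
list. The MYDRV_SUPPORTED_PGSZ mask, mydrv_fill_pbl() helper and the pbl
array are hypothetical driver constructs used only for illustration; they
are not part of this patch.

#include <linux/types.h>
#include <linux/sizes.h>
#include <linux/printk.h>
#include <rdma/ib_umem.h>

/* Page sizes a hypothetical HW supports: 4K and 2M (illustrative only). */
#define MYDRV_SUPPORTED_PGSZ	(SZ_4K | SZ_2M)

/*
 * Walk the umem in HW-aligned blocks and record each block's DMA address
 * in a caller-provided page list. Returns the number of entries written.
 */
static unsigned int mydrv_fill_pbl(struct ib_umem *umem, u64 *pbl,
				   unsigned int max_entries)
{
	struct sg_phys_iter iter;
	unsigned int n = 0;

	ib_umem_start_phys_iter(umem, &iter, MYDRV_SUPPORTED_PGSZ);
	while (ib_umem_next_phys_iter(umem, &iter)) {
		if (n >= max_entries)
			break;
		/*
		 * iter.phyaddr is aligned to (1 << iter.pg_bit); iter.offset
		 * is non-zero only for the first block and marks the first
		 * valid byte of the MR within that block. With a multi-bit
		 * page-size mask, iter.pg_bit may differ from block to block.
		 */
		pbl[n++] = iter.phyaddr;
		pr_debug("blk %u: addr=%#lx len=%lu off=%lu pg_bit=%u\n",
			 n, iter.phyaddr, iter.len, iter.offset, iter.pg_bit);
	}
	return n;
}

For HW that supports only a single page size, the mask would carry a single
bit and iter.pg_bit stays constant across all blocks.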