From patchwork Wed Nov 15 19:17:50 2023
X-Patchwork-Submitter: Shiraz Saleem
X-Patchwork-Id: 13457248
From: Shiraz Saleem
To: jgg@nvidia.com, leon@kernel.org, linux-rdma@vger.kernel.org
Cc: Mike Marciniszyn, Shiraz Saleem
Subject: [PATCH for-rc 1/3] RDMA/core: Fix umem iterator when PAGE_SIZE is greater than HCA pgsz
Date: Wed, 15 Nov 2023 13:17:50 -0600
Message-Id: <20231115191752.266-2-shiraz.saleem@intel.com>
In-Reply-To: <20231115191752.266-1-shiraz.saleem@intel.com>
References: <20231115191752.266-1-shiraz.saleem@intel.com>

From: Mike Marciniszyn

64k pages introduce the situation in this diagram when the HCA 4k page
size is being used:

+-------------------------------------------+ <--- 64k aligned VA
|                                           |
|              HCA 4k page                  |
|                                           |
+-------------------------------------------+
|                   o                       |
|                                           |
|                   o                       |
|                                           |
|                   o                       |
+-------------------------------------------+
|                                           |
|              HCA 4k page                  |
|                                           |
+-------------------------------------------+ <--- Live HCA page
|OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO| <--- offset
|                                           | <--- VA
|                  MR data                  |
+-------------------------------------------+
|                                           |
|              HCA 4k page                  |
|                                           |
+-------------------------------------------+
|                   o                       |
|                                           |
|                   o                       |
|                                           |
|                   o                       |
+-------------------------------------------+
|                                           |
|              HCA 4k page                  |
|                                           |
+-------------------------------------------+
The VA addresses coming from rdma-core in this diagram can be
arbitrary, but for 64k pages the VA may be preceded by some number of
HCA 4k pages and followed by some number of HCA 4k pages. The current
iterator doesn't account for either the preceding or the following 4k
pages.

Fix the issue by extending the ib_block_iter to contain the number of
DMA blocks, as comment [1] suggests, and by augmenting the macro limit
test to down-count that value. This prevents walking the extra pages
that follow the user MR data. Fix the preceding pages by using the
__sg_advance field to start at the first 4k page containing MR data.

This fix allows for the elimination of the small page crutch noted in
the Fixes.

Fixes: 10c75ccb54e4 ("RDMA/umem: Prevent small pages from being returned by ib_umem_find_best_pgsz()")
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/rdma/ib_umem.h#n91 [1]
Signed-off-by: Mike Marciniszyn
Signed-off-by: Shiraz Saleem
Reviewed-by: Zhu Yanjun
---
 drivers/infiniband/core/umem.c | 6 ------
 include/rdma/ib_umem.h         | 4 +++-
 include/rdma/ib_verbs.h        | 1 +
 3 files changed, 4 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index f9ab671c8eda..07c571c7b699 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -96,12 +96,6 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
 		return page_size;
 	}
 
-	/* rdma_for_each_block() has a bug if the page size is smaller than the
-	 * page size used to build the umem. For now prevent smaller page sizes
-	 * from being returned.
-	 */
-	pgsz_bitmap &= GENMASK(BITS_PER_LONG - 1, PAGE_SHIFT);
-
 	/* The best result is the smallest page size that results in the minimum
 	 * number of required pages. Compute the largest page size that could
 	 * work based on VA address bits that don't change.
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 95896472a82b..e775d1b4910c 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -77,6 +77,8 @@ static inline void __rdma_umem_block_iter_start(struct ib_block_iter *biter,
 {
 	__rdma_block_iter_start(biter, umem->sgt_append.sgt.sgl,
 				umem->sgt_append.sgt.nents, pgsz);
+	biter->__sg_advance = ib_umem_offset(umem) & ~(pgsz - 1);
+	biter->__sg_numblocks = ib_umem_num_dma_blocks(umem, pgsz);
 }
 
 /**
@@ -92,7 +94,7 @@ static inline void __rdma_umem_block_iter_start(struct ib_block_iter *biter,
  */
 #define rdma_umem_for_each_dma_block(umem, biter, pgsz)                        \
 	for (__rdma_umem_block_iter_start(biter, umem, pgsz);                  \
-	     __rdma_block_iter_next(biter);)
+	     __rdma_block_iter_next(biter) && (biter)->__sg_numblocks--;)
 
 #ifdef CONFIG_INFINIBAND_USER_MEM
 
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index fb1a2d6b1969..b7b6b58dd348 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2850,6 +2850,7 @@ struct ib_block_iter {
 	/* internal states */
 	struct scatterlist *__sg;	/* sg holding the current aligned block */
 	dma_addr_t __dma_addr;		/* unaligned DMA address of this block */
+	size_t __sg_numblocks;		/* ib_umem_num_dma_blocks() */
 	unsigned int __sg_nents;	/* number of SG entries */
 	unsigned int __sg_advance;	/* number of bytes to advance in sg in next step */
 	unsigned int __pg_bit;		/* alignment of current block */
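
For illustration only (not part of the patch), a minimal sketch of how a
driver typically consumes this iterator; the helper name and page-list
layout here are hypothetical. Before the change, on a 64k PAGE_SIZE
kernel with a 4k HCA page size, the loop could also visit the 4k blocks
before and after the MR data; with the __sg_advance/__sg_numblocks
changes it is bounded to exactly the blocks that cover the MR.

#include <linux/errno.h>
#include <rdma/ib_umem.h>
#include <rdma/ib_verbs.h>

/* Hypothetical helper: fill a page list with HCA-page-sized DMA addresses. */
static int example_fill_page_list(struct ib_umem *umem, unsigned long pgsz,
				  dma_addr_t *pages, size_t npages)
{
	struct ib_block_iter biter;
	size_t i = 0;

	/* The iterator should yield exactly this many blocks and no more. */
	if (ib_umem_num_dma_blocks(umem, pgsz) > npages)
		return -EINVAL;

	rdma_umem_for_each_dma_block(umem, &biter, pgsz)
		pages[i++] = rdma_block_iter_dma_address(&biter);

	return 0;
}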
From patchwork Wed Nov 15 19:17:51 2023
X-Patchwork-Submitter: Shiraz Saleem
X-Patchwork-Id: 13457250
From: Shiraz Saleem
To: jgg@nvidia.com, leon@kernel.org, linux-rdma@vger.kernel.org
Cc: Mike Marciniszyn, Shiraz Saleem
Subject: [PATCH for-rc 2/3] RDMA/irdma: Ensure iWarp QP queue memory is OS page aligned
Date: Wed, 15 Nov 2023 13:17:51 -0600
Message-Id: <20231115191752.266-3-shiraz.saleem@intel.com>
In-Reply-To: <20231115191752.266-1-shiraz.saleem@intel.com>
References: <20231115191752.266-1-shiraz.saleem@intel.com>

From: Mike Marciniszyn

The SQ is shared between kernel and user space by storing the kernel
page pointer and passing that to kmap_atomic(). This requires that the
queue memory is PAGE_SIZE aligned.

Fix this by adding an iWarp-specific alignment check.

Fixes: e965ef0e7b2c ("RDMA/irdma: Split QP handler into irdma_reg_user_mr_type_qp")
Signed-off-by: Mike Marciniszyn
Signed-off-by: Shiraz Saleem
---
 drivers/infiniband/hw/irdma/verbs.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index 6415ada63c5f..b072aa5179e0 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -2934,6 +2934,11 @@ static int irdma_reg_user_mr_type_qp(struct irdma_mem_reg_req req,
 	int err;
 	u8 lvl;
 
+	/* iWarp: Catch page not starting on OS page boundary */
+	if (!rdma_protocol_roce(&iwdev->ibdev, 1) &&
+	    ib_umem_offset(iwmr->region))
+		return -EINVAL;
+
 	total = req.sq_pages + req.rq_pages + 1;
 	if (total > iwmr->page_cnt)
 		return -EINVAL;
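
For illustration only (not part of the patch), a minimal user-space
sketch of the contract this check enforces: if the provider registers
its QP queue memory starting on an OS page boundary, ib_umem_offset()
in the kernel is zero and the iWarp check passes. The allocator name
and sizing below are hypothetical.

#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical provider-side allocation of SQ/RQ queue memory. */
static void *example_alloc_qp_queues(size_t sq_bytes, size_t rq_bytes)
{
	size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
	size_t total = sq_bytes + rq_bytes;
	void *buf = NULL;

	/* Round the registration up to whole OS pages and page-align it. */
	total = (total + pagesize - 1) & ~(pagesize - 1);
	if (posix_memalign(&buf, pagesize, total))
		return NULL;

	memset(buf, 0, total);
	return buf;	/* this VA would then be registered via ibv_reg_mr() */
}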
From patchwork Wed Nov 15 19:17:52 2023
X-Patchwork-Submitter: Shiraz Saleem
X-Patchwork-Id: 13457249
From: Shiraz Saleem
To: jgg@nvidia.com, leon@kernel.org, linux-rdma@vger.kernel.org
Cc: Mike Marciniszyn, Shiraz Saleem
Subject: [PATCH for-rc 3/3] RDMA/irdma: Fix support for 64k pages
Date: Wed, 15 Nov 2023 13:17:52 -0600
Message-Id: <20231115191752.266-4-shiraz.saleem@intel.com>
In-Reply-To: <20231115191752.266-1-shiraz.saleem@intel.com>
References: <20231115191752.266-1-shiraz.saleem@intel.com>

From: Mike Marciniszyn

Virtual QP and CQ buffers require a 4K HW page size, but the driver
passes PAGE_SIZE to ib_umem_find_best_pgsz() instead. Fix this by using
the appropriate 4K value in the bitmap passed to
ib_umem_find_best_pgsz().

Fixes: 693a5386eff0 ("RDMA/irdma: Split mr alloc and free into new functions")
Signed-off-by: Mike Marciniszyn
Signed-off-by: Shiraz Saleem
---
 drivers/infiniband/hw/irdma/verbs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c
index b072aa5179e0..7c31d2d606bb 100644
--- a/drivers/infiniband/hw/irdma/verbs.c
+++ b/drivers/infiniband/hw/irdma/verbs.c
@@ -2902,7 +2902,7 @@ static struct irdma_mr *irdma_alloc_iwmr(struct ib_umem *region,
 	iwmr->type = reg_type;
 
 	pgsz_bitmap = (reg_type == IRDMA_MEMREG_TYPE_MEM) ?
-		iwdev->rf->sc_dev.hw_attrs.page_size_cap : PAGE_SIZE;
+		iwdev->rf->sc_dev.hw_attrs.page_size_cap : SZ_4K;
 
 	iwmr->page_size = ib_umem_find_best_pgsz(region, pgsz_bitmap, virt);
 	if (unlikely(!iwmr->page_size)) {
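
For illustration only (not part of the patch), a minimal sketch of how
the pgsz_bitmap argument of ib_umem_find_best_pgsz() is interpreted:
each set bit allows one page size, so passing PAGE_SIZE on a 64k kernel
only permits 64k pages, while SZ_4K permits exactly the 4K HW page size
the virtually mapped QP/CQ rings need. The wrapper name is hypothetical.

#include <linux/sizes.h>
#include <rdma/ib_umem.h>

/* Hypothetical helper: pick the MR page size for virtual QP/CQ memory. */
static unsigned long example_pick_queue_pgsz(struct ib_umem *region,
					     unsigned long virt)
{
	/* Only 4k blocks are acceptable for this memory, so offer only SZ_4K. */
	unsigned long pgsz = ib_umem_find_best_pgsz(region, SZ_4K, virt);

	/* A return of 0 means no page size from the bitmap fits the umem. */
	return pgsz;
}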