diff mbox series

[rdma-next] RDMA/mlx5: Set mkeys for dmabuf at PAGE_SIZE

Message ID 1e2289b9133e89f273a4e68d459057d032cbc2ce.1718301631.git.leon@kernel.org (mailing list archive)
State Accepted
Commit a4e540119be565f47c305f295ed43f8e0bc3f5c3
Series [rdma-next] RDMA/mlx5: Set mkeys for dmabuf at PAGE_SIZE

Commit Message

Leon Romanovsky June 13, 2024, 6:01 p.m. UTC
From: Chiara Meiohas <cmeiohas@nvidia.com>

Set the mkey for dmabuf at PAGE_SIZE to support any SGL
after a move operation.

ib_umem_find_best_pgsz returns 0 on error, so it is
incorrect to check the returned page_size against PAGE_SIZE.

Fixes: 90da7dc8206a ("RDMA/mlx5: Support dma-buf based userspace memory region")
Signed-off-by: Chiara Meiohas <cmeiohas@nvidia.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 13 +++++++++++++
 drivers/infiniband/hw/mlx5/odp.c     |  6 ++----
 2 files changed, 15 insertions(+), 4 deletions(-)

Comments

Jason Gunthorpe June 17, 2024, 1:59 p.m. UTC | #1
On Thu, Jun 13, 2024 at 09:01:42PM +0300, Leon Romanovsky wrote:
> From: Chiara Meiohas <cmeiohas@nvidia.com>
> 
> Set the mkey for dmabuf at PAGE_SIZE to support any SGL
> after a move operation.
> 
> ib_umem_find_best_pgsz returns 0 on error, so it is
> incorrect to check the returned page_size against PAGE_SIZE

This commit message is not clear enough for something that needs to be
backported:

RDMA/mlx5: Support non-page size aligned DMABUF mkeys

The mkey page size for DMABUF is fixed at PAGE_SIZE because we have to
support a move operation that could change a large-sized page list
into a small page-list and the mkey must be able to represent it.

The test for this is not quite correct: instead of checking the output
of mlx5_umem_find_best_pgsz(), the call to ib_umem_find_best_pgsz()
should specify the exact HW/SW restriction - only PAGE_SIZE is
accepted.

Then the normal logic for dealing with leading/trailing sub page
alignment works correctly and sub page size DMABUF mappings can be
supported.

This is particularly painful on 64K kernels.

Jason
Leon Romanovsky June 18, 2024, 12:08 p.m. UTC | #2
On Mon, Jun 17, 2024 at 10:59:05AM -0300, Jason Gunthorpe wrote:
> On Thu, Jun 13, 2024 at 09:01:42PM +0300, Leon Romanovsky wrote:
> > From: Chiara Meiohas <cmeiohas@nvidia.com>
> > 
> > Set the mkey for dmabuf at PAGE_SIZE to support any SGL
> > after a move operation.
> > 
> > ib_umem_find_best_pgsz returns 0 on error, so it is
> > incorrect to check the returned page_size against PAGE_SIZE
> 
> This commit message is not clear enough for something that needs to be
> backported:

This patch is going to be backported without any relation to the commit
message as it has Fixes line.

Thanks

> 
> RDMA/mlx5: Support non-page size aligned DMABUF mkeys
> 
> The mkey page size for DMABUF is fixed at PAGE_SIZE because we have to
> support a move operation that could change a large-sized page list
> into a small page-list and the mkey must be able to represent it.
> 
> The test for this is not quite correct: instead of checking the output
> of mlx5_umem_find_best_pgsz(), the call to ib_umem_find_best_pgsz()
> should specify the exact HW/SW restriction - only PAGE_SIZE is
> accepted.
> 
> Then the normal logic for dealing with leading/trailing sub page
> alignment works correctly and sub page size DMABUF mappings can be
> supported.
> 
> This is particularly painful on 64K kernels.

Unfortunately, the patch was already merged, so I can't change the
commit message in the for-next branch.

Thanks

> 
> Jason
Jason Gunthorpe June 18, 2024, 1:02 p.m. UTC | #3
On Tue, Jun 18, 2024 at 03:08:14PM +0300, Leon Romanovsky wrote:
> > > Set the mkey for dmabuf at PAGE_SIZE to support any SGL
> > > after a move operation.
> > > 
> > > ib_umem_find_best_pgsz returns 0 on error, so it is
> > > incorrect to check the returned page_size against PAGE_SIZE
> > 
> > This commit message is not clear enough for something that needs to be
> > backported:
> 
> This patch is going to be backported without any relation to the commit
> message as it has Fixes line.

People doing backports complain with some regularity about poor commit
messages, especially now that so many patches get a CVE. We need to do
better.

> > RDMA/mlx5: Support non-page size aligned DMABUF mkeys
> > 
> > The mkey page size for DMABUF is fixed at PAGE_SIZE because we have to
> > support a move operation that could change a large-sized page list
> > into a small page-list and the mkey must be able to represent it.
> > 
> > The test for this is not quite correct: instead of checking the output
> > of mlx5_umem_find_best_pgsz(), the call to ib_umem_find_best_pgsz()
> > should specify the exact HW/SW restriction - only PAGE_SIZE is
> > accepted.
> > 
> > Then the normal logic for dealing with leading/trailing sub page
> > alignment works correctly and sub page size DMABUF mappings can be
> > supported.
> > 
> > This is particularly painful on 64K kernels.
> 
> Unfortunately, the patch was already merged, so I can't change the
> commit message in the for-next branch.

How is it already merged? There was no message from your script - are
you losing emails??

Jason
Leon Romanovsky June 19, 2024, 8:30 a.m. UTC | #4
On Tue, Jun 18, 2024 at 10:02:33AM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 18, 2024 at 03:08:14PM +0300, Leon Romanovsky wrote:
> > > > Set the mkey for dmabuf at PAGE_SIZE to support any SGL
> > > > after a move operation.
> > > > 
> > > > ib_umem_find_best_pgsz returns 0 on error, so it is
> > > > incorrect to check the returned page_size against PAGE_SIZE
> > > 
> > > This commit message is not clear enough for something that needs to be
> > > backported:
> > 
> > This patch is going to be backported without any relation to the commit
> > message as it has Fixes line.
> 
> People doing backports complain with some regularity about poor commit
> messages, especially now that so many patches get a CVE. We need to do
> better.

It is always true.

> 
> > > RDMA/mlx5: Support non-page size aligned DMABUF mkeys
> > > 
> > > The mkey page size for DMABUF is fixed at PAGE_SIZE because we have to
> > > support a move operation that could change a large-sized page list
> > > into a small page-list and the mkey must be able to represent it.
> > > 
> > > The test for this is not quite correct: instead of checking the output
> > > of mlx5_umem_find_best_pgsz(), the call to ib_umem_find_best_pgsz()
> > > should specify the exact HW/SW restriction - only PAGE_SIZE is
> > > accepted.
> > > 
> > > Then the normal logic for dealing with leading/trailing sub page
> > > alignment works correctly and sub page size DMABUF mappings can be
> > > supported.
> > > 
> > > This is particularly painful on 64K kernels.
> > 
> > Unfortunately, the patch was already merged, so I can't change the
> > commit message in the for-next branch.
> 
> How is it already merged? There was no message from your script - are
> you losing emails??

In this specific case, no.
I switched branches too early and my thank-you message (b4 ty ..)
simply wasn't sent.

Thanks

> 
> Jason

Patch

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index cbcb14d9a42a..bf25ddb17bce 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -115,6 +115,19 @@  unsigned long __mlx5_umem_find_best_quantized_pgoff(
 		__mlx5_bit_sz(typ, page_offset_fld), 0, scale,                 \
 		page_offset_quantized)
 
+static inline unsigned long
+mlx5_umem_dmabuf_find_best_pgsz(struct ib_umem_dmabuf *umem_dmabuf)
+{
+	/*
+	 * mkeys used for dmabuf are fixed at PAGE_SIZE because we must be able
+	 * to hold any sgl after a move operation. Ideally the mkc page size
+	 * could be changed at runtime to be optimal, but right now the driver
+	 * cannot do that.
+	 */
+	return ib_umem_find_best_pgsz(&umem_dmabuf->umem, PAGE_SIZE,
+				      umem_dmabuf->umem.iova);
+}
+
 enum {
 	MLX5_IB_MMAP_OFFSET_START = 9,
 	MLX5_IB_MMAP_OFFSET_END = 255,
diff --git a/drivers/infiniband/hw/mlx5/odp.c b/drivers/infiniband/hw/mlx5/odp.c
index 4a04cbc5b78a..a524181f34df 100644
--- a/drivers/infiniband/hw/mlx5/odp.c
+++ b/drivers/infiniband/hw/mlx5/odp.c
@@ -705,10 +705,8 @@  static int pagefault_dmabuf_mr(struct mlx5_ib_mr *mr, size_t bcnt,
 		return err;
 	}
 
-	page_size = mlx5_umem_find_best_pgsz(&umem_dmabuf->umem, mkc,
-					     log_page_size, 0,
-					     umem_dmabuf->umem.iova);
-	if (unlikely(page_size < PAGE_SIZE)) {
+	page_size = mlx5_umem_dmabuf_find_best_pgsz(umem_dmabuf);
+	if (!page_size) {
 		ib_umem_dmabuf_unmap_pages(umem_dmabuf);
 		err = -EINVAL;
 	} else {