[01/14] RDMA/umem: Fix ib_umem_find_best_pgsz() for mappings that cross a page boundary

Message ID: 1-v1-00f59ce24f1f+19f50-umem_1_jgg@nvidia.com (mailing list archive)
State: Superseded
Series: RDMA: Improve use of umem in DMA drivers

Commit Message

Jason Gunthorpe Sept. 2, 2020, 12:43 a.m. UTC
It is possible for a single SGL to span an aligned boundary, e.g. if the SGL
is

  61440 -> 90112

Then the length is 28672, which currently limits the block size to
32k. With a 32k page size, the two covering blocks will be:

  32768->65536 and 65536->98304

However, the correct answer is a 128K block size, which spans the whole
28672 bytes in a single block: the first and last addresses (61440 and
90111) first differ at bit 16, so one naturally aligned 2^17 block covers
both.

Instead of limiting based on length, figure out which high IOVA bits don't
change between the start and end addresses. That determines the highest
useful page size.

Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/core/umem.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
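
To make the arithmetic above concrete, here is a standalone userspace
sketch (stand-ins for the kernel's bits_per() and GENMASK(), assuming a
64-bit unsigned long; not kernel code) that reproduces the numbers from
the commit message:

  #include <stdio.h>

  /* Approximation of the kernel's bits_per() for nonzero n: the number
   * of bits needed to represent n. */
  static unsigned int bits_per(unsigned long n)
  {
          unsigned int bits = 0;

          while (n) {
                  bits++;
                  n >>= 1;
          }
          return bits;
  }

  /* 64-bit-only stand-in for the kernel's GENMASK(h, l). */
  #define GENMASK(h, l) (((~0UL) << (l)) & (~0UL >> (63 - (h))))

  int main(void)
  {
          unsigned long start = 61440;             /* 0xf000 */
          unsigned long length = 28672;            /* range ends at 90112 */
          unsigned long last = start + length - 1; /* 0x15fff */

          /* Bits that differ between the first and last byte address. */
          unsigned long changing = last ^ start;   /* 0x1afff, top bit 16 */

          /* Any page size covering the changing bits works; the smallest
           * is 1 << bits_per(changing) = 131072 (128K). The old code
           * capped this at roundup_pow_of_two(length) = 32768 instead. */
          printf("changing bits: %#lx\n", changing);
          printf("smallest usable page size: %lu\n",
                 1UL << bits_per(changing));
          printf("page-size mask: %#lx\n", GENMASK(63, bits_per(changing)));
          return 0;
  }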

Comments

Leon Romanovsky Sept. 2, 2020, 9:24 a.m. UTC | #1
On Tue, Sep 01, 2020 at 09:43:29PM -0300, Jason Gunthorpe wrote:
> It is possible for a single SGL to span an aligned boundary, eg if the SGL
> is
[...]

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Shiraz Saleem Sept. 3, 2020, 2:11 p.m. UTC | #2
> Subject: [PATCH 01/14] RDMA/umem: Fix ib_umem_find_best_pgsz() for
> mappings that cross a page boundary
[...]

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Jason Gunthorpe Sept. 4, 2020, 10:30 p.m. UTC | #3
On Tue, Sep 01, 2020 at 09:43:29PM -0300, Jason Gunthorpe wrote:
[...]
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 831bff8d52e547..120e98403c345d 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -156,8 +156,14 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
>  		return 0;
>  
>  	va = virt;
> -	/* max page size not to exceed MR length */
> -	mask = roundup_pow_of_two(umem->length);
> +	/* The best result is the smallest page size that results in the minimum
> +	 * number of required pages. Compute the largest page size that could
> +	 * work based on VA address bits that don't change.
> +	 */
> +	mask = pgsz_bitmap &
> +	       GENMASK(BITS_PER_LONG - 1,
> +		       bits_per((umem->length - 1 + umem->address) ^
> +				umem->address));

The use of umem->address is incorrect here as well; it should be virt.

All places on the DMA side that touch address are wrong.

Jason
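
For reference, the correction described here would presumably swap
umem->address for the caller-supplied virt in the new mask computation.
A minimal sketch of what that hunk might look like (an assumption based
on this comment, not the actual follow-up patch):

  	mask = pgsz_bitmap &
  	       GENMASK(BITS_PER_LONG - 1,
  		       bits_per((umem->length - 1 + virt) ^ virt));

The changing bits have to be computed against the address space the HW
blocks are laid out in, which is the MR's IOVA (virt), not the process
VA recorded in umem->address.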

Patch

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 831bff8d52e547..120e98403c345d 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -156,8 +156,14 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
 		return 0;
 
 	va = virt;
-	/* max page size not to exceed MR length */
-	mask = roundup_pow_of_two(umem->length);
+	/* The best result is the smallest page size that results in the minimum
+	 * number of required pages. Compute the largest page size that could
+	 * work based on VA address bits that don't change.
+	 */
+	mask = pgsz_bitmap &
+	       GENMASK(BITS_PER_LONG - 1,
+		       bits_per((umem->length - 1 + umem->address) ^
+				umem->address));
 	/* offset into first SGL */
 	pgoff = umem->address & ~PAGE_MASK;
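
For context, a driver consumes this helper by passing in the page sizes
its hardware can program; a hedged sketch of a hypothetical caller (the
bitmap and function names are illustrative, not from any real driver):

  /* Hypothetical driver-side use: pick the best HW block size for an MR.
   * DEMO_PGSZ_BITMAP is illustrative; a real driver passes exactly the
   * page sizes its device supports. */
  #define DEMO_PGSZ_BITMAP (SZ_4K | SZ_64K | SZ_2M | SZ_1G)

  static int demo_reg_mr(struct ib_umem *umem, unsigned long iova)
  {
  	unsigned long pgsz;

  	pgsz = ib_umem_find_best_pgsz(umem, DEMO_PGSZ_BITMAP, iova);
  	if (!pgsz)
  		return -EINVAL;	/* no page size satisfies this layout */

  	/* ... program the MR in blocks of pgsz bytes ... */
  	return 0;
  }

A zero return means no supported page size fits the umem's layout, in
which case the registration should fail.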