Message ID | 1-v1-00f59ce24f1f+19f50-umem_1_jgg@nvidia.com (mailing list archive) |
---|---|
State | Superseded |
Series | RDMA: Improve use of umem in DMA drivers |
On Tue, Sep 01, 2020 at 09:43:29PM -0300, Jason Gunthorpe wrote:
> It is possible for a single SGL to span an aligned boundary, eg if the SGL
> is
>
>   61440 -> 90112
>
> Then the length is 28672, which currently limits the block size to
> 32k. With a 32k page size the two covering blocks will be:
>
>   32768->65536 and 65536->98304
>
> However, the correct answer is a 128K block size which will span the whole
> 28672 bytes in a single block.
>
> Instead of limiting based on length figure out which high IOVA bits don't
> change between the start and end addresses. That is the highest useful
> page size.
>
> Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/core/umem.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> Subject: [PATCH 01/14] RDMA/umem: Fix ib_umem_find_best_pgsz() for
> mappings that cross a page boundary
>
> It is possible for a single SGL to span an aligned boundary, eg if the SGL is
>
>   61440 -> 90112
>
> Then the length is 28672, which currently limits the block size to 32k. With a 32k
> page size the two covering blocks will be:
>
>   32768->65536 and 65536->98304
>
> However, the correct answer is a 128K block size which will span the whole
> 28672 bytes in a single block.
>
> Instead of limiting based on length figure out which high IOVA bits don't change
> between the start and end addresses. That is the highest useful page size.
>
> Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page
> size in an MR")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/core/umem.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c index
> 831bff8d52e547..120e98403c345d 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -156,8 +156,14 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem
> *umem,
>  		return 0;
>
>  	va = virt;
> -	/* max page size not to exceed MR length */
> -	mask = roundup_pow_of_two(umem->length);
> +	/* The best result is the smallest page size that results in the minimum
> +	 * number of required pages. Compute the largest page size that could
> +	 * work based on VA address bits that don't change.
> +	 */
> +	mask = pgsz_bitmap &
> +	       GENMASK(BITS_PER_LONG - 1,
> +		       bits_per((umem->length - 1 + umem->address) ^
> +				umem->address));
>  	/* offset into first SGL */
>  	pgoff = umem->address & ~PAGE_MASK;
>
> --

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
On Tue, Sep 01, 2020 at 09:43:29PM -0300, Jason Gunthorpe wrote:
> It is possible for a single SGL to span an aligned boundary, eg if the SGL
> is
>
>   61440 -> 90112
>
> Then the length is 28672, which currently limits the block size to
> 32k. With a 32k page size the two covering blocks will be:
>
>   32768->65536 and 65536->98304
>
> However, the correct answer is a 128K block size which will span the whole
> 28672 bytes in a single block.
>
> Instead of limiting based on length figure out which high IOVA bits don't
> change between the start and end addresses. That is the highest useful
> page size.
>
> Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
> ---
>  drivers/infiniband/core/umem.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index 831bff8d52e547..120e98403c345d 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -156,8 +156,14 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
>  		return 0;
>
>  	va = virt;
> -	/* max page size not to exceed MR length */
> -	mask = roundup_pow_of_two(umem->length);
> +	/* The best result is the smallest page size that results in the minimum
> +	 * number of required pages. Compute the largest page size that could
> +	 * work based on VA address bits that don't change.
> +	 */
> +	mask = pgsz_bitmap &
> +	       GENMASK(BITS_PER_LONG - 1,
> +		       bits_per((umem->length - 1 + umem->address) ^
> +				umem->address));

The use of umem->address is incorrect here as well, it should be virt.

All places on the DMA side that touch address are wrong..

Jason
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index 831bff8d52e547..120e98403c345d 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -156,8 +156,14 @@ unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
 		return 0;

 	va = virt;
-	/* max page size not to exceed MR length */
-	mask = roundup_pow_of_two(umem->length);
+	/* The best result is the smallest page size that results in the minimum
+	 * number of required pages. Compute the largest page size that could
+	 * work based on VA address bits that don't change.
+	 */
+	mask = pgsz_bitmap &
+	       GENMASK(BITS_PER_LONG - 1,
+		       bits_per((umem->length - 1 + umem->address) ^
+				umem->address));
 	/* offset into first SGL */
 	pgoff = umem->address & ~PAGE_MASK;
It is possible for a single SGL to span an aligned boundary, eg if the SGL
is

  61440 -> 90112

Then the length is 28672, which currently limits the block size to
32k. With a 32k page size the two covering blocks will be:

  32768->65536 and 65536->98304

However, the correct answer is a 128K block size which will span the whole
28672 bytes in a single block.

Instead of limiting based on length figure out which high IOVA bits don't
change between the start and end addresses. That is the highest useful
page size.

Fixes: 4a35339958f1 ("RDMA/umem: Add API to find best driver supported page size in an MR")
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
---
 drivers/infiniband/core/umem.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)