Message ID | 0-v1-d8f4e1fa84c8+17-rdma_dmabuf_fix_jgg@nvidia.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | [rc] RDMA: Handle the return code from dma_resv_wait_timeout() properly | expand |
On Tue, Aug 16, 2022 at 11:03:20AM -0300, Jason Gunthorpe wrote: > ib_umem_dmabuf_map_pages() returns 0 on success and -ERRNO on failure. > > dma_resv_wait_timeout() uses a different scheme: > > * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or > * greater than zero on success. > > This results in ib_umem_dmabuf_map_pages() being non-functional as a > positive return will be understood to be an error by drivers. > > Fixes: f30bceab16d1 ("RDMA: use dma_resv_wait() instead of extracting the fence") > Cc: stable@kernel.org > Tested-by: Maor Gottlieb <maorg@nvidia.com> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> > --- > drivers/infiniband/core/umem_dmabuf.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > Thanks, applied to -rc.
On Tue, Aug 16, 2022 at 5:03 PM Jason Gunthorpe <jgg@nvidia.com> wrote: > > ib_umem_dmabuf_map_pages() returns 0 on success and -ERRNO on failure. > > dma_resv_wait_timeout() uses a different scheme: > > * Returns -ERESTARTSYS if interrupted, 0 if the wait timed out, or > * greater than zero on success. > > This results in ib_umem_dmabuf_map_pages() being non-functional as a > positive return will be understood to be an error by drivers. > > Fixes: f30bceab16d1 ("RDMA: use dma_resv_wait() instead of extracting the fence") > Cc: stable@kernel.org > Tested-by: Maor Gottlieb <maorg@nvidia.com> > Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> > --- > drivers/infiniband/core/umem_dmabuf.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > Oded, I assume the Habana driver will hit this as well - does this mean you > are not testing upstream kernels? Thanks Jason for letting me know. You are correct, we don't use upstream kernels. We use a back-ported EFA driver for 5.15 which in that version, ib_umem_dmabuf_map_pages() calls dma_resv_excl_fence(). So I guess that's why we didn't encounter this issue. Thanks, oded > > diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c > index fce80a4a5147cd..04c04e6d24c358 100644 > --- a/drivers/infiniband/core/umem_dmabuf.c > +++ b/drivers/infiniband/core/umem_dmabuf.c > @@ -18,6 +18,7 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) > struct scatterlist *sg; > unsigned long start, end, cur = 0; > unsigned int nmap = 0; > + long ret; > int i; > > dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); > @@ -67,9 +68,14 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) > * may be not up-to-date. Wait for the exporter to finish > * the migration. > */ > - return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, > + ret = dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, > DMA_RESV_USAGE_KERNEL, > false, MAX_SCHEDULE_TIMEOUT); > + if (ret < 0) > + return ret; > + if (ret == 0) > + return -ETIMEDOUT; > + return 0; > } > EXPORT_SYMBOL(ib_umem_dmabuf_map_pages); > > > base-commit: 568035b01cfb107af8d2e4bd2fb9aea22cf5b868 > -- > 2.37.2 >
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c index fce80a4a5147cd..04c04e6d24c358 100644 --- a/drivers/infiniband/core/umem_dmabuf.c +++ b/drivers/infiniband/core/umem_dmabuf.c @@ -18,6 +18,7 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) struct scatterlist *sg; unsigned long start, end, cur = 0; unsigned int nmap = 0; + long ret; int i; dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv); @@ -67,9 +68,14 @@ int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf) * may be not up-to-date. Wait for the exporter to finish * the migration. */ - return dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, + ret = dma_resv_wait_timeout(umem_dmabuf->attach->dmabuf->resv, DMA_RESV_USAGE_KERNEL, false, MAX_SCHEDULE_TIMEOUT); + if (ret < 0) + return ret; + if (ret == 0) + return -ETIMEDOUT; + return 0; } EXPORT_SYMBOL(ib_umem_dmabuf_map_pages);