Message ID | 1602799365-138199-1-git-send-email-jianxin.xiong@intel.com (mailing list archive)
---|---
State | Superseded
Delegated to: | Jason Gunthorpe
Series | RDMA: Add dma-buf support
On Thu, Oct 15, 2020 at 03:02:45PM -0700, Jianxin Xiong wrote:
> +static void ib_umem_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
> +{
> +        struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
> +
> +        dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
> +
> +        ib_umem_dmabuf_unmap_pages(&umem_dmabuf->umem, true);
> +        queue_work(ib_wq, &umem_dmabuf->work);

Do we really want to queue remapping or should it wait until there is
a page fault?

What do GPUs do?

Jason
> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Friday, October 16, 2020 12:00 PM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH v5 1/5] RDMA/umem: Support importing dma-buf as user memory region
>
> On Thu, Oct 15, 2020 at 03:02:45PM -0700, Jianxin Xiong wrote:
>
> > +static void ib_umem_dmabuf_invalidate_cb(struct dma_buf_attachment
> > +*attach) {
> > +        struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
> > +
> > +        dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
> > +
> > +        ib_umem_dmabuf_unmap_pages(&umem_dmabuf->umem, true);
> > +        queue_work(ib_wq, &umem_dmabuf->work);
>
> Do we really want to queue remapping or should it wait until there is a page fault?

Queuing remapping here has a performance advantage because it reduces the chance of getting a page fault.

>
> What do GPUs do?
>
> Jason
On Thu, Oct 15, 2020 at 03:02:45PM -0700, Jianxin Xiong wrote:
> +struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
> +                                  unsigned long addr, size_t size,
> +                                  int dmabuf_fd, int access,
> +                                  const struct ib_umem_dmabuf_ops *ops)
> +{
> +        struct dma_buf *dmabuf;
> +        struct ib_umem_dmabuf *umem_dmabuf;
> +        struct ib_umem *umem;
> +        unsigned long end;
> +        long ret;
> +
> +        if (check_add_overflow(addr, (unsigned long)size, &end))
> +                return ERR_PTR(-EINVAL);
> +
> +        if (unlikely(PAGE_ALIGN(end) < PAGE_SIZE))
> +                return ERR_PTR(-EINVAL);
> +
> +        if (unlikely(!ops || !ops->invalidate || !ops->update))
> +                return ERR_PTR(-EINVAL);
> +
> +        umem_dmabuf = kzalloc(sizeof(*umem_dmabuf), GFP_KERNEL);
> +        if (!umem_dmabuf)
> +                return ERR_PTR(-ENOMEM);
> +
> +        umem_dmabuf->ops = ops;
> +        INIT_WORK(&umem_dmabuf->work, ib_umem_dmabuf_work);
> +
> +        umem = &umem_dmabuf->umem;
> +        umem->ibdev = device;
> +        umem->length = size;
> +        umem->address = addr;

addr here is offset within the dma buf, but this code does nothing with it.

dma_buf_map_attachment gives a complete SGL for the entire DMA buf, but
offset/length select a subset.

You need to edit the sgls to make them properly span the sub-range and
follow the peculiar rules for how SGLs in ib_umem's have to be
constructed.

Who validates that the total dma length of the SGL is exactly equal to
length? That is really important too.

Also, dma_buf_map_attachment() does not do the correct dma mapping for
RDMA, eg it does not use ib_dma_map(). This is not a problem for mlx5
but it is troublesome to put in the core code.

Jason
> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, October 16, 2020 5:28 PM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH v5 1/5] RDMA/umem: Support importing dma-buf as user memory region
>
> On Thu, Oct 15, 2020 at 03:02:45PM -0700, Jianxin Xiong wrote:
> > +struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
> > +                                  unsigned long addr, size_t size,
> > +                                  int dmabuf_fd, int access,
> > +                                  const struct ib_umem_dmabuf_ops *ops) {
> > +        struct dma_buf *dmabuf;
> > +        struct ib_umem_dmabuf *umem_dmabuf;
> > +        struct ib_umem *umem;
> > +        unsigned long end;
> > +        long ret;
> > +
> > +        if (check_add_overflow(addr, (unsigned long)size, &end))
> > +                return ERR_PTR(-EINVAL);
> > +
> > +        if (unlikely(PAGE_ALIGN(end) < PAGE_SIZE))
> > +                return ERR_PTR(-EINVAL);
> > +
> > +        if (unlikely(!ops || !ops->invalidate || !ops->update))
> > +                return ERR_PTR(-EINVAL);
> > +
> > +        umem_dmabuf = kzalloc(sizeof(*umem_dmabuf), GFP_KERNEL);
> > +        if (!umem_dmabuf)
> > +                return ERR_PTR(-ENOMEM);
> > +
> > +        umem_dmabuf->ops = ops;
> > +        INIT_WORK(&umem_dmabuf->work, ib_umem_dmabuf_work);
> > +
> > +        umem = &umem_dmabuf->umem;
> > +        umem->ibdev = device;
> > +        umem->length = size;
> > +        umem->address = addr;
>
> addr here is offset within the dma buf, but this code does nothing with it.
>
The current code assumes 0 offset, and 'addr' is the nominal starting address of the
buffer. If this is to be changed to offset, then yes, some more handling is needed
as you mentioned below.

> dma_buf_map_attachment gives a complete SGL for the entire DMA buf, but offset/length select a subset.
>
> You need to edit the sgls to make them properly span the sub-range and follow the peculiar rules for how SGLs in ib_umem's have to be
> constructed.
>
> Who validates that the total dma length of the SGL is exactly equal to length? That is really important too.
>
> Also, dma_buf_map_attachment() does not do the correct dma mapping for RDMA, eg it does not use ib_dma_map(). This is not a problem
> for mlx5 but it is troublesome to put in the core code.

ib_dma_map() uses dma_map_single(), GPU drivers use dma_map_resource() for
dma_buf_map_attachment(). They belong to the same family, but take different
address type (kernel address vs MMIO physical address). Could you elaborate what
the problem could be for non-mlx5 HCAs?

>
> Jason
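To make the comparison above concrete, here is a small illustration (not code from this series) of the two mapping families being discussed: ib_dma_map_single() ends up in dma_map_single(), which takes a kernel virtual address, while a dynamic dma-buf exporter doing peer-to-peer DMA maps an MMIO physical address with dma_map_resource(). The two wrapper functions are invented for the example.

#include <linux/dma-mapping.h>

/* Host memory path: what ib_dma_map_single() boils down to. */
static dma_addr_t example_map_host_buffer(struct device *dev, void *vaddr,
                                          size_t len)
{
        return dma_map_single(dev, vaddr, len, DMA_BIDIRECTIONAL);
}

/* Device memory path: what a P2P dma-buf exporter uses for its BAR pages. */
static dma_addr_t example_map_device_mmio(struct device *dev,
                                          phys_addr_t mmio_phys, size_t len)
{
        return dma_map_resource(dev, mmio_phys, len, DMA_BIDIRECTIONAL, 0);
}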
On Sat, Oct 17, 2020 at 12:57:21AM +0000, Xiong, Jianxin wrote:
> > From: Jason Gunthorpe <jgg@nvidia.com>
> > Sent: Friday, October 16, 2020 5:28 PM
> > To: Xiong, Jianxin <jianxin.xiong@intel.com>
> > Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org;
> > Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> > <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian
> > Koenig <christian.koenig@amd.com>; Vetter, Daniel
> > <daniel.vetter@intel.com>
> > Subject: Re: [PATCH v5 1/5] RDMA/umem: Support importing dma-buf as
> > user memory region
> >
> > On Thu, Oct 15, 2020 at 03:02:45PM -0700, Jianxin Xiong wrote:
> > > +struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
> > > +                                  unsigned long addr, size_t size,
> > > +                                  int dmabuf_fd, int access,
> > > +                                  const struct ib_umem_dmabuf_ops *ops) {
> > > +        struct dma_buf *dmabuf;
> > > +        struct ib_umem_dmabuf *umem_dmabuf;
> > > +        struct ib_umem *umem;
> > > +        unsigned long end;
> > > +        long ret;
> > > +
> > > +        if (check_add_overflow(addr, (unsigned long)size, &end))
> > > +                return ERR_PTR(-EINVAL);
> > > +
> > > +        if (unlikely(PAGE_ALIGN(end) < PAGE_SIZE))
> > > +                return ERR_PTR(-EINVAL);
> > > +
> > > +        if (unlikely(!ops || !ops->invalidate || !ops->update))
> > > +                return ERR_PTR(-EINVAL);
> > > +
> > > +        umem_dmabuf = kzalloc(sizeof(*umem_dmabuf), GFP_KERNEL);
> > > +        if (!umem_dmabuf)
> > > +                return ERR_PTR(-ENOMEM);
> > > +
> > > +        umem_dmabuf->ops = ops;
> > > +        INIT_WORK(&umem_dmabuf->work, ib_umem_dmabuf_work);
> > > +
> > > +        umem = &umem_dmabuf->umem;
> > > +        umem->ibdev = device;
> > > +        umem->length = size;
> > > +        umem->address = addr;
> >
> > addr here is offset within the dma buf, but this code does nothing with it.
> >
> The current code assumes 0 offset, and 'addr' is the nominal starting address of the
> buffer. If this is to be changed to offset, then yes, some more handling is needed
> as you mentioned below.

There is no such thing as 'nominal starting address'

If the user is to provide any argument it can only be offset and length.

> > Also, dma_buf_map_attachment() does not do the correct dma mapping
> > for RDMA, eg it does not use ib_dma_map(). This is not a problem
> > for mlx5 but it is troublesome to put in the core code.
>
> ib_dma_map() uses dma_map_single(), GPU drivers use dma_map_resource() for
> dma_buf_map_attachment(). They belong to the same family, but take different
> address type (kernel address vs MMIO physical address). Could you elaborate what
> the problem could be for non-mlx5 HCAs?

They use the virtual dma ops which we intend to remove

Jason
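The sub-range editing being asked for could look roughly like the sketch below. It is illustration only, not code from this series; the helper name trim_sgl_to_window() and the page-aligned window handling are assumptions. It walks the sg_table returned by dma_buf_map_attachment(), advances the first overlapping entry to the window start, truncates the last overlapping entry at the window end, and counts how many entries the umem may use.

#include <linux/mm.h>
#include <linux/scatterlist.h>

/* Sketch: restrict the mapped sg_table to [offset, offset + length). */
static unsigned int trim_sgl_to_window(struct sg_table *sgt,
                                       unsigned long offset, size_t length)
{
        unsigned long start = ALIGN_DOWN(offset, PAGE_SIZE);
        unsigned long end = PAGE_ALIGN(offset + length);
        struct scatterlist *sg;
        unsigned long cur = 0;
        unsigned int nmap = 0;
        int i;

        for_each_sgtable_dma_sg(sgt, sg, i) {
                /* Count the entries that overlap the window at all. */
                if (start < cur + sg_dma_len(sg) && cur < end)
                        nmap++;
                /* Advance the first overlapping entry to the window start. */
                if (cur <= start && start < cur + sg_dma_len(sg)) {
                        unsigned long skip = start - cur;

                        sg_dma_address(sg) += skip;
                        sg_dma_len(sg) -= skip;
                        cur += skip;
                }
                /* Truncate the last overlapping entry at the window end. */
                if (cur < end && end <= cur + sg_dma_len(sg))
                        sg_dma_len(sg) = end - cur;
                cur += sg_dma_len(sg);
        }

        return nmap;
}

A real version would also have to remember which entry is the first overlapping one (the umem may only use 'nmap' entries starting there), restore the edited address/length before dma_buf_unmap_attachment(), and verify that the summed sg_dma_len() of those entries equals the page-aligned length being registered.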
On Sat, Oct 17, 2020 at 9:05 PM Jason Gunthorpe <jgg@ziepe.ca> wrote:
>
> On Thu, Oct 15, 2020 at 03:02:45PM -0700, Jianxin Xiong wrote:
>
> > +static void ib_umem_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
> > +{
> > +        struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
> > +
> > +        dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
> > +
> > +        ib_umem_dmabuf_unmap_pages(&umem_dmabuf->umem, true);
> > +        queue_work(ib_wq, &umem_dmabuf->work);
>
> Do we really want to queue remapping or should it wait until there is
> a page fault?
>
> What do GPUs do?

Atm there are no gpu drivers in upstream that use buffer-based memory
management and support page faults in the hw. So we have to pull the
entire thing in anyway and use the dma_fence stuff to track what's busy.

For faulting hardware I'd wait until the first page fault and then map in
the entire range again (you get the entire thing anyway). Since the
move_notify happened because the buffer is moving, you'll end up stalling
anyway. Plus if you prefault right away you need some thrashing limiter to
not do that when you get immediate move_notify again. As a first thing I'd
do the same thing you do for mmu notifier ranges, since it's kinda
similarish.
-Daniel
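Sketched against the helpers in this patch, the fault-driven variant described above might look like the following. The move_notify callback only tears the mapping down, and the remap is deferred to a driver page-fault hook that is hypothetical here and not part of this series.

/*
 * Sketch only: move_notify for a fault-capable HCA just invalidates and
 * unmaps; nothing is queued for remapping.
 */
static void ib_umem_dmabuf_invalidate_cb_faulting(struct dma_buf_attachment *attach)
{
        struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;

        dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);

        ib_umem_dmabuf_unmap_pages(&umem_dmabuf->umem, true);
}

/*
 * Hypothetical hook called from the driver's page-fault path: remap the
 * whole attachment on first access.  ib_umem_dmabuf_map_pages() (from the
 * patch) takes the dma_resv lock and waits on the exclusive fence, so the
 * pages have settled by the time it returns.
 */
static int ib_umem_dmabuf_handle_fault(struct ib_umem *umem)
{
        return ib_umem_dmabuf_map_pages(umem, false);
}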
> -----Original Message-----
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, October 16, 2020 6:05 PM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH v5 1/5] RDMA/umem: Support importing dma-buf as user memory region
>
> On Sat, Oct 17, 2020 at 12:57:21AM +0000, Xiong, Jianxin wrote:
> > > From: Jason Gunthorpe <jgg@nvidia.com>
> > > Sent: Friday, October 16, 2020 5:28 PM
> > > To: Xiong, Jianxin <jianxin.xiong@intel.com>
> > > Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org;
> > > Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> > > <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian
> > > Koenig <christian.koenig@amd.com>; Vetter, Daniel
> > > <daniel.vetter@intel.com>
> > > Subject: Re: [PATCH v5 1/5] RDMA/umem: Support importing dma-buf as
> > > user memory region
> > >
> > > On Thu, Oct 15, 2020 at 03:02:45PM -0700, Jianxin Xiong wrote:
> > > > +struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
> > > > +                                  unsigned long addr, size_t size,
> > > > +                                  int dmabuf_fd, int access,
> > > > +                                  const struct ib_umem_dmabuf_ops *ops) {
> > > > +        struct dma_buf *dmabuf;
> > > > +        struct ib_umem_dmabuf *umem_dmabuf;
> > > > +        struct ib_umem *umem;
> > > > +        unsigned long end;
> > > > +        long ret;
> > > > +
> > > > +        if (check_add_overflow(addr, (unsigned long)size, &end))
> > > > +                return ERR_PTR(-EINVAL);
> > > > +
> > > > +        if (unlikely(PAGE_ALIGN(end) < PAGE_SIZE))
> > > > +                return ERR_PTR(-EINVAL);
> > > > +
> > > > +        if (unlikely(!ops || !ops->invalidate || !ops->update))
> > > > +                return ERR_PTR(-EINVAL);
> > > > +
> > > > +        umem_dmabuf = kzalloc(sizeof(*umem_dmabuf), GFP_KERNEL);
> > > > +        if (!umem_dmabuf)
> > > > +                return ERR_PTR(-ENOMEM);
> > > > +
> > > > +        umem_dmabuf->ops = ops;
> > > > +        INIT_WORK(&umem_dmabuf->work, ib_umem_dmabuf_work);
> > > > +
> > > > +        umem = &umem_dmabuf->umem;
> > > > +        umem->ibdev = device;
> > > > +        umem->length = size;
> > > > +        umem->address = addr;
> > >
> > > addr here is offset within the dma buf, but this code does nothing with it.
> > >
> > The current code assumes 0 offset, and 'addr' is the nominal starting
> > address of the buffer. If this is to be changed to offset, then yes,
> > some more handling is needed as you mentioned below.
>
> There is no such thing as 'nominal starting address'
>
> If the user is to provide any argument it can only be offset and length.
>
> > > Also, dma_buf_map_attachment() does not do the correct dma mapping
> > > for RDMA, eg it does not use ib_dma_map(). This is not a problem for
> > > mlx5 but it is troublesome to put in the core code.
> >
> > ib_dma_map() uses dma_map_single(), GPU drivers use dma_map_resource()
> > for dma_buf_map_attachment(). They belong to the same family, but take
> > different address type (kernel address vs MMIO physical address).
> > Could you elaborate what the problem could be for non-mlx5 HCAs?
>
> They use the virtual dma ops which we intend to remove

We can have a check with the dma device before attaching the dma-buf, so that the ib_umem_dmabuf_get() call from such drivers would fail. Something like:

#ifdef CONFIG_DMA_VIRT_OPS
        if (device->dma_device->dma_ops == &dma_virt_ops)
                return ERR_PTR(-EINVAL);
#endif

>
> Jason
diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index ccf2670..8ab4eea 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -40,5 +40,5 @@ ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
 				uverbs_std_types_srq.o \
 				uverbs_std_types_wq.o \
 				uverbs_std_types_qp.o
-ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
+ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o umem_dmabuf.o
 ib_uverbs-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o
diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index e9fecbd..8c608a5 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -2,6 +2,7 @@
  * Copyright (c) 2005 Topspin Communications. All rights reserved.
  * Copyright (c) 2005 Cisco Systems. All rights reserved.
  * Copyright (c) 2005 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2020 Intel Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses. You may choose to be licensed under the terms of the GNU
@@ -43,6 +44,7 @@
 #include <rdma/ib_umem_odp.h>
 
 #include "uverbs.h"
+#include "umem_dmabuf.h"
 
 static void __ib_umem_release(struct ib_device *dev, struct ib_umem *umem, int dirty)
 {
@@ -269,6 +271,8 @@ void ib_umem_release(struct ib_umem *umem)
 {
 	if (!umem)
 		return;
+	if (umem->is_dmabuf)
+		return ib_umem_dmabuf_release(umem);
 	if (umem->is_odp)
 		return ib_umem_odp_release(to_ib_umem_odp(umem));
 
diff --git a/drivers/infiniband/core/umem_dmabuf.c b/drivers/infiniband/core/umem_dmabuf.c
new file mode 100644
index 0000000..4d6d6f3
--- /dev/null
+++ b/drivers/infiniband/core/umem_dmabuf.c
@@ -0,0 +1,206 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+/*
+ * Copyright (c) 2020 Intel Corporation. All rights reserved.
+ */
+
+#include <linux/dma-buf.h>
+#include <linux/dma-resv.h>
+#include <linux/dma-mapping.h>
+
+#include "uverbs.h"
+#include "umem_dmabuf.h"
+
+struct ib_umem_dmabuf {
+	struct ib_umem umem;
+	struct dma_buf_attachment *attach;
+	struct sg_table *sgt;
+	const struct ib_umem_dmabuf_ops *ops;
+	void *device_context;
+	struct work_struct work;
+};
+
+static inline struct ib_umem_dmabuf *to_ib_umem_dmabuf(struct ib_umem *umem)
+{
+	return container_of(umem, struct ib_umem_dmabuf, umem);
+}
+
+static int ib_umem_dmabuf_map_pages(struct ib_umem *umem, bool first)
+{
+	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem);
+	struct sg_table *sgt;
+	struct dma_fence *fence;
+	int err;
+
+	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
+
+	sgt = dma_buf_map_attachment(umem_dmabuf->attach,
+				     DMA_BIDIRECTIONAL);
+
+	if (IS_ERR(sgt)) {
+		dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+		return PTR_ERR(sgt);
+	}
+
+	umem_dmabuf->umem.sg_head = *sgt;
+	umem_dmabuf->umem.nmap = sgt->nents;
+	umem_dmabuf->sgt = sgt;
+
+	/*
+	 * Although the sg list is valid now, the content of the pages
+	 * may be not up-to-date. Wait for the exporter to finish
+	 * the migration.
+	 */
+	fence = dma_resv_get_excl(umem_dmabuf->attach->dmabuf->resv);
+	if (fence)
+		dma_fence_wait(fence, false);
+
+	if (first)
+		err = umem_dmabuf->ops->init(umem,
+					     umem_dmabuf->device_context);
+	else
+		err = umem_dmabuf->ops->update(umem,
+					       umem_dmabuf->device_context);
+
+	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+	return err;
+}
+
+int ib_umem_dmabuf_init_mapping(struct ib_umem *umem, void *device_context)
+{
+	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem);
+
+	umem_dmabuf->device_context = device_context;
+	return ib_umem_dmabuf_map_pages(umem, true);
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_init_mapping);
+
+bool ib_umem_dmabuf_mapping_ready(struct ib_umem *umem)
+{
+	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem);
+	bool ret;
+
+	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
+	ret = !!umem_dmabuf->sgt;
+	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+	return ret;
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_mapping_ready);
+
+static void ib_umem_dmabuf_unmap_pages(struct ib_umem *umem, bool do_invalidate)
+{
+	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem);
+
+	dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
+
+	if (!umem_dmabuf->sgt)
+		return;
+
+	if (do_invalidate)
+		umem_dmabuf->ops->invalidate(umem, umem_dmabuf->device_context);
+
+	dma_buf_unmap_attachment(umem_dmabuf->attach, umem_dmabuf->sgt,
+				 DMA_BIDIRECTIONAL);
+	umem_dmabuf->sgt = NULL;
+}
+
+static void ib_umem_dmabuf_work(struct work_struct *work)
+{
+	struct ib_umem_dmabuf *umem_dmabuf;
+	int ret;
+
+	umem_dmabuf = container_of(work, struct ib_umem_dmabuf, work);
+	ret = ib_umem_dmabuf_map_pages(&umem_dmabuf->umem, false);
+	if (ret)
+		pr_debug("%s: failed to update dmabuf mapping, error %d\n",
+			 __func__, ret);
+}
+
+static void ib_umem_dmabuf_invalidate_cb(struct dma_buf_attachment *attach)
+{
+	struct ib_umem_dmabuf *umem_dmabuf = attach->importer_priv;
+
+	dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);
+
+	ib_umem_dmabuf_unmap_pages(&umem_dmabuf->umem, true);
+	queue_work(ib_wq, &umem_dmabuf->work);
+}
+
+static struct dma_buf_attach_ops ib_umem_dmabuf_attach_ops = {
+	.allow_peer2peer = 1,
+	.move_notify = ib_umem_dmabuf_invalidate_cb,
+};
+
+struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
+				   unsigned long addr, size_t size,
+				   int dmabuf_fd, int access,
+				   const struct ib_umem_dmabuf_ops *ops)
+{
+	struct dma_buf *dmabuf;
+	struct ib_umem_dmabuf *umem_dmabuf;
+	struct ib_umem *umem;
+	unsigned long end;
+	long ret;
+
+	if (check_add_overflow(addr, (unsigned long)size, &end))
+		return ERR_PTR(-EINVAL);
+
+	if (unlikely(PAGE_ALIGN(end) < PAGE_SIZE))
+		return ERR_PTR(-EINVAL);
+
+	if (unlikely(!ops || !ops->invalidate || !ops->update))
+		return ERR_PTR(-EINVAL);
+
+	umem_dmabuf = kzalloc(sizeof(*umem_dmabuf), GFP_KERNEL);
+	if (!umem_dmabuf)
+		return ERR_PTR(-ENOMEM);
+
+	umem_dmabuf->ops = ops;
+	INIT_WORK(&umem_dmabuf->work, ib_umem_dmabuf_work);
+
+	umem = &umem_dmabuf->umem;
+	umem->ibdev = device;
+	umem->length = size;
+	umem->address = addr;
+	umem->writable = ib_access_writable(access);
+	umem->is_dmabuf = 1;
+
+	dmabuf = dma_buf_get(dmabuf_fd);
+	if (IS_ERR(dmabuf)) {
+		ret = PTR_ERR(dmabuf);
+		goto out_free_umem;
+	}
+
+	umem_dmabuf->attach = dma_buf_dynamic_attach(
+					dmabuf,
+					device->dma_device,
+					&ib_umem_dmabuf_attach_ops,
+					umem_dmabuf);
+	if (IS_ERR(umem_dmabuf->attach)) {
+		ret = PTR_ERR(umem_dmabuf->attach);
+		goto out_release_dmabuf;
+	}
+
+	return umem;
+
+out_release_dmabuf:
+	dma_buf_put(dmabuf);
+
+out_free_umem:
+	kfree(umem_dmabuf);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL(ib_umem_dmabuf_get);
+
+void ib_umem_dmabuf_release(struct ib_umem *umem)
+{
+	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(umem);
+	struct dma_buf *dmabuf = umem_dmabuf->attach->dmabuf;
+
+	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
+	ib_umem_dmabuf_unmap_pages(umem, false);
+	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
+
+	dma_buf_detach(dmabuf, umem_dmabuf->attach);
+	dma_buf_put(dmabuf);
+	kfree(umem_dmabuf);
+}
diff --git a/drivers/infiniband/core/umem_dmabuf.h b/drivers/infiniband/core/umem_dmabuf.h
new file mode 100644
index 0000000..485f653
--- /dev/null
+++ b/drivers/infiniband/core/umem_dmabuf.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
+/*
+ * Copyright (c) 2020 Intel Corporation. All rights reserved.
+ */
+
+#ifndef UMEM_DMABUF_H
+#define UMEM_DMABUF_H
+
+void ib_umem_dmabuf_release(struct ib_umem *umem);
+
+#endif /* UMEM_DMABUF_H */
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 7059750..fac8553 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
 /*
  * Copyright (c) 2007 Cisco Systems. All rights reserved.
+ * Copyright (c) 2020 Intel Corporation. All rights reserved.
  */
 
 #ifndef IB_UMEM_H
@@ -22,12 +23,19 @@ struct ib_umem {
 	unsigned long address;
 	u32 writable : 1;
 	u32 is_odp : 1;
+	u32 is_dmabuf : 1;
 	struct work_struct work;
 	struct sg_table sg_head;
 	int nmap;
 	unsigned int sg_nents;
 };
 
+struct ib_umem_dmabuf_ops {
+	int (*init)(struct ib_umem *umem, void *context);
+	int (*update)(struct ib_umem *umem, void *context);
+	int (*invalidate)(struct ib_umem *umem, void *context);
+};
+
 /* Returns the offset of the umem start relative to the first page. */
 static inline int ib_umem_offset(struct ib_umem *umem)
 {
@@ -79,6 +87,12 @@ int ib_umem_copy_from(void *dst, struct ib_umem *umem, size_t offset,
 unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
 				     unsigned long pgsz_bitmap,
 				     unsigned long virt);
+struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
+				   unsigned long addr, size_t size,
+				   int dmabuf_fd, int access,
+				   const struct ib_umem_dmabuf_ops *ops);
+int ib_umem_dmabuf_init_mapping(struct ib_umem *umem, void *device_context);
+bool ib_umem_dmabuf_mapping_ready(struct ib_umem *umem);
 
 #else /* CONFIG_INFINIBAND_USER_MEM */
 
@@ -101,7 +115,23 @@ static inline unsigned long ib_umem_find_best_pgsz(struct ib_umem *umem,
 {
 	return 0;
 }
+static inline struct ib_umem *ib_umem_dmabuf_get(struct ib_device *device,
+						 unsigned long addr,
+						 size_t size, int dmabuf_fd,
+						 int access,
+						 struct ib_umem_dmabuf_ops *ops)
+{
+	return ERR_PTR(-EINVAL);
+}
+static inline int ib_umem_dmabuf_init_mapping(struct ib_umem *umem,
+					      void *device_context)
+{
+	return -EINVAL;
+}
+static inline bool ib_umem_dmabuf_mapping_ready(struct ib_umem *umem)
+{
+	return false;
+}
 
 #endif /* CONFIG_INFINIBAND_USER_MEM */
-
 #endif /* IB_UMEM_H */
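For reference, a sketch of how an HCA driver's MR registration path might consume the API added above. Only the ib_umem_dmabuf_* entry points and the ops layout come from the patch; every my_* name is invented for the illustration, and the ops bodies are left as stubs.

#include <linux/err.h>
#include <rdma/ib_umem.h>

struct my_mr {
        struct ib_umem *umem;
        /* ... device-specific MR state ... */
};

/* Program the HW translation table from umem->sg_head / umem->nmap. */
static int my_mr_init(struct ib_umem *umem, void *context)
{
        return 0;
}

/* Re-program the translation table after a move_notify remap. */
static int my_mr_update(struct ib_umem *umem, void *context)
{
        return 0;
}

/* Quiesce HW access to the MR before the sg_table is unmapped. */
static int my_mr_invalidate(struct ib_umem *umem, void *context)
{
        return 0;
}

static const struct ib_umem_dmabuf_ops my_dmabuf_ops = {
        .init           = my_mr_init,
        .update         = my_mr_update,
        .invalidate     = my_mr_invalidate,
};

static int my_reg_dmabuf_mr(struct ib_device *ibdev, struct my_mr *mr,
                            unsigned long addr, size_t size,
                            int dmabuf_fd, int access)
{
        mr->umem = ib_umem_dmabuf_get(ibdev, addr, size, dmabuf_fd,
                                      access, &my_dmabuf_ops);
        if (IS_ERR(mr->umem))
                return PTR_ERR(mr->umem);

        /* First mapping: attaches the sg_table and invokes ->init(). */
        return ib_umem_dmabuf_init_mapping(mr->umem, mr);
}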