[RFC,0/3] RDMA: Add dma-buf support

Message ID 1587056973-101760-1-git-send-email-jianxin.xiong@intel.com

Message

Xiong, Jianxin April 16, 2020, 5:09 p.m. UTC
This patch set adds a dma-buf importer role to the RDMA driver and
thus provides a non-proprietary approach for supporting RDMA to/from
buffers allocated from device local memory (e.g. GPU VRAM).

Dma-buf is a standard mechanism in the Linux kernel for sharing
buffers among different device drivers. It is supported by mainstream
GPU drivers. By issuing ioctl calls on the devices under /dev/dri/,
user space applications can allocate and export GPU buffers as dma-buf
objects with associated file descriptors.
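
For example, a buffer can be exported roughly as follows. This is a
hypothetical sketch using libdrm: the device node and the dumb-buffer
allocation are placeholders chosen only to keep the example
self-contained; a real application would allocate through the GPU
driver's own ioctls, typically on a render node.

#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <xf86drm.h>

int export_gpu_buffer(void)
{
        struct drm_mode_create_dumb create = {
                .width = 1024, .height = 1024, .bpp = 32,
        };
        int dmabuf_fd;

        int drm_fd = open("/dev/dri/card0", O_RDWR);  /* assumed node */
        if (drm_fd < 0)
                return -1;

        /* allocate a GEM object; create.handle identifies the buffer */
        if (ioctl(drm_fd, DRM_IOCTL_MODE_CREATE_DUMB, &create) < 0)
                goto fail;

        /* wrap the GEM handle in a dma-buf and return its fd */
        if (drmPrimeHandleToFD(drm_fd, create.handle,
                               DRM_CLOEXEC | DRM_RDWR, &dmabuf_fd) < 0)
                goto fail;

        return dmabuf_fd;  /* to be passed to the RDMA registration */

fail:
        close(drm_fd);
        return -1;
}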

In order to use the exported GPU buffer for RDMA operations, the RDMA
driver needs to be able to import dma-buf objects. This happens at the
time of memory registration. A GPU buffer is registered as a special
type of user space memory region with the dma-buf file descriptor as
an extra parameter. The uverbs API needs to be extended to allow the
extra parameter to be passed from user space to the kernel.
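
For illustration, the importer-side flow is sketched below, roughly
what the new umem_dmabuf.c does at registration time. The function
name and error handling are simplified for this sketch and are not
the exact code from the patches.

#include <linux/dma-buf.h>
#include <linux/err.h>

static struct sg_table *rdma_umem_import_dmabuf(struct device *dev, int fd,
                                        struct dma_buf_attachment **attach_out)
{
        struct dma_buf *dmabuf;
        struct dma_buf_attachment *attach;
        struct sg_table *sgt;

        dmabuf = dma_buf_get(fd);               /* take a ref on the fd */
        if (IS_ERR(dmabuf))
                return ERR_CAST(dmabuf);

        attach = dma_buf_attach(dmabuf, dev);   /* attach the RDMA device */
        if (IS_ERR(attach)) {
                dma_buf_put(dmabuf);
                return ERR_CAST(attach);
        }

        /* map into the device's DMA address space; the resulting sg
         * list is what the umem code hands to the HCA driver */
        sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
        if (IS_ERR(sgt)) {
                dma_buf_detach(dmabuf, attach);
                dma_buf_put(dmabuf);
                return ERR_CAST(sgt);
        }

        *attach_out = attach;
        return sgt;
}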

Vendor RDMA drivers need to be modified in order to take advantage of
the new feature. A patch for the mlx5 driver is provided as an example.

Related user space RDMA library changes will be provided as a separate
patch set.
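
As a rough sketch, the user space side could look something like the
following. The function name and signature here are hypothetical and
only illustrate the idea of fd-based registration; the real interface
will be defined by that patch set.

#include <infiniband/verbs.h>

/* hypothetical prototype, for illustration only */
struct ibv_mr *ibv_reg_mr_fd(struct ibv_pd *pd, void *addr, size_t length,
                             int dmabuf_fd, int access);

static struct ibv_mr *register_gpu_buffer(struct ibv_pd *pd, size_t length,
                                          int dmabuf_fd)
{
        /* addr is NULL: the buffer lives behind the dma-buf fd rather
         * than in the process address space */
        return ibv_reg_mr_fd(pd, NULL, length, dmabuf_fd,
                             IBV_ACCESS_LOCAL_WRITE |
                             IBV_ACCESS_REMOTE_READ |
                             IBV_ACCESS_REMOTE_WRITE);
}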

Jianxin Xiong (3):
  RDMA/umem: Support importing dma-buf as user memory region
  RDMA/uverbs: Add uverbs commands for fd-based MR registration
  RDMA/mlx5: Support new uverbs commands for registering fd-based MR

 drivers/infiniband/Kconfig            |  10 ++
 drivers/infiniband/core/Makefile      |   1 +
 drivers/infiniband/core/device.c      |   2 +
 drivers/infiniband/core/umem.c        |   3 +
 drivers/infiniband/core/umem_dmabuf.c | 100 +++++++++++++++++++
 drivers/infiniband/core/uverbs_cmd.c  | 179 +++++++++++++++++++++++++++++++++-
 drivers/infiniband/hw/mlx5/main.c     |   6 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h  |   7 ++
 drivers/infiniband/hw/mlx5/mr.c       |  85 ++++++++++++++--
 include/rdma/ib_umem.h                |   5 +
 include/rdma/ib_umem_dmabuf.h         |  50 ++++++++++
 include/rdma/ib_verbs.h               |   8 ++
 include/uapi/rdma/ib_user_verbs.h     |  28 ++++++
 13 files changed, 472 insertions(+), 12 deletions(-)
 create mode 100644 drivers/infiniband/core/umem_dmabuf.c
 create mode 100644 include/rdma/ib_umem_dmabuf.h

Comments

Jason Gunthorpe April 16, 2020, 5:54 p.m. UTC | #1
On Thu, Apr 16, 2020 at 10:09:30AM -0700, Jianxin Xiong wrote:
> This patch set adds a dma-buf importer role to the RDMA driver and thus
> provides a non-proprietary approach for supporting RDMA to/from buffers
> allocated from device local memory (e.g. GPU VRAM). 

How exactly does this allow access to GPU VRAM?

dma_buf_attach() cannot return non-struct page memory in the sgt, and
I'm not sure the API has enough information to use the p2pdma stuff
to even establish a p2p mapping.

We've already been over this in another thread... There is a way to
improve things to get there, but I don't understand how this patch
series is claiming to be able to work with VRAM - if it is, that
means there is a bug in a GPU driver that should be squashed.

Other than that, this seems broadly reasonable to me as a way to
access a DMA buf pointing at system memory, though it would be nice to
have a rationale for why we should do this rather than rely on mmap'd
versions of a dma buf.

Jason
Xiong, Jianxin April 16, 2020, 7:08 p.m. UTC | #2
> > This patch set adds a dma-buf importer role to the RDMA driver and thus
> > provides a non-proprietary approach for supporting RDMA to/from
> > buffers allocated from device local memory (e.g. GPU VRAM).
> 
> How exactly does this allow access to GPU VRAM?
> 
> dma_buf_attach() cannot return non-struct page memory in the sgt, and I'm not sure the API has enough information to use the p2pdma
> stuff to even establish a p2p mapping.
> 
> We've already been over this in another thread... There is a way to improve things to get there, but I don't understand how this patch series
> is claiming to be able to work with VRAM - if it is, that means there is a bug in a GPU driver that should be squashed.
> 

Right, the GPU driver needs to cooperate to get the thing to work as expected. The "p2p" flag and related GPU driver changes proposed in other threads would ensure VRAM is really used.  Alternatively, a GPU driver can have a working mode that assumes p2p mapping capability of the client. Either way, the patches to the RDMA driver would be mostly identical except for adding the use of the "p2p" flag.  

> Other than that, this seems broadly reasonable to me as a way to access a DMA buf pointing at system memory, though it would be nice to
> have a rationale for why we should do this rather than rely on mmap'd versions of a dma buf.
> 

> Jason
Jason Gunthorpe April 16, 2020, 7:34 p.m. UTC | #3
On Thu, Apr 16, 2020 at 07:08:15PM +0000, Xiong, Jianxin wrote:
> > > This patch set adds a dma-buf importer role to the RDMA driver and thus
> > > provides a non-proprietary approach for supporting RDMA to/from
> > > buffers allocated from device local memory (e.g. GPU VRAM).
> > 
> > How exactly does this allow access to GPU VRAM?
> > 
> > dma_buf_attach() cannot return non-struct page memory in the sgt,
> > and I'm not sure the API has enough information to use the
> > p2pdma stuff to even establish a p2p mapping.
> > 
> > We've already been over this in another thread... There is a way to
> > improve things to get there, but I don't understand how this patch
> > series is claiming to be able to work with VRAM - if it is, that
> > means there is a bug in a GPU driver that should be squashed.
> > 
> 
> Right, the GPU driver needs to cooperate to get the thing to work as
> expected. The "p2p" flag and related GPU driver changes proposed in
> other threads would ensure VRAM is really used.  Alternatively, a
> GPU driver can have a working mode that assumes p2p mapping
> capability of the client. Either way, the patches to the RDMA driver
> would be mostly identical except for adding the use of the "p2p"
> flag.

I think the other thread has explained this would not be "mostly
identical" but here is significant work to rip out the scatter list
from the umem.

So, I'm back to my original ask, can you justify adding this if it
cannot do VRAM? What is the use case?

Jason
Xiong, Jianxin April 16, 2020, 9:02 p.m. UTC | #4
> >
> > Right, the GPU driver needs to cooperate to get the thing to work as
> > expected. The "p2p" flag and related GPU driver changes proposed in
> > other threads would ensure VRAM is really used.  Alternatively, a GPU
> > driver can have a working mode that assumes p2p mapping capability of
> > the client. Either way, the patches to the RDMA driver would be mostly
> > identical except for adding the use of the "p2p"
> > flag.
> 
> I think the other thread has explained this would not be "mostly identical" but there is significant work to rip out the scatter list from the
> umem.
> 

Probably we are referring to different threads here. Could you kindly refer me to the thread you mentioned? I was referring to the thread about patching dma-buf and GPU driver: https://www.spinics.net/lists/amd-gfx/msg47022.html

> So, I'm back to my original ask, can you justify adding this if it cannot do VRAM? What is the use case?

Working with VRAM is the goal. This patch has been tested with a modified GPU driver that has dma_buf_ops set up to not migrate the buffer to system memory when attached. The GPU drivers and the RDMA drivers can be improved independently and it doesn't hurt to have the RDMA driver ready before the GPU drivers.
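
(For reference, a hypothetical sketch of what "do not migrate on
attach" could look like on the exporter side; the buffer object type
and helper functions below are made up to illustrate the idea and do
not come from any actual driver:)

#include <linux/device.h>
#include <linux/dma-buf.h>

struct gpu_bo;                                   /* assumed driver type */
bool gpu_peer_supports_p2p(struct device *dev);  /* assumed helper */
int gpu_bo_migrate_to_system(struct gpu_bo *bo); /* assumed helper */

static int gpu_dmabuf_attach(struct dma_buf *dmabuf,
                             struct dma_buf_attachment *attach)
{
        struct gpu_bo *bo = dmabuf->priv;

        /* legacy path: importer cannot reach VRAM, so move the buffer
         * into system memory before it is mapped */
        if (!gpu_peer_supports_p2p(attach->dev))
                return gpu_bo_migrate_to_system(bo);

        /* otherwise leave the buffer in device memory; map_dma_buf
         * can then produce bus addresses for the peer device */
        return 0;
}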

> 
> Jason
Jason Gunthorpe April 17, 2020, 12:35 p.m. UTC | #5
On Thu, Apr 16, 2020 at 09:02:51PM +0000, Xiong, Jianxin wrote:
> > >
> > > Right, the GPU driver needs to cooperate to get the thing to work as
> > > expected. The "p2p" flag and related GPU driver changes proposed in
> > > other threads would ensure VRAM is really used.  Alternatively, a GPU
> > > driver can have a working mode that assumes p2p mapping capability of
> > > the client. Either way, the patches to the RDMA driver would be mostly
> > > identical except for adding the use of the "p2p"
> > > flag.
> > 
> > I think the other thread has explained this would not be "mostly identical" but there is significant work to rip out the scatter list from the
> > umem.
> > 
> 
> Probably we are referring to different threads here. Could you
> kindly refer me to the thread you mentioned? I was referring to the
> thread about patching dma-buf and GPU driver:
> https://www.spinics.net/lists/amd-gfx/msg47022.html

https://lore.kernel.org/linux-media/20200311152838.GA24280@infradead.org/

> > So, I'm back to my original ask, can you justify adding this if it
> > cannot do VRAM? What is the use case?
> 
> Working with VRAM is the goal. This patch has been tested with a
> modified GPU driver that has dma_buf_ops set up to not migrate the
> buffer to system memory when attached. The GPU drivers and the RDMA
> drivers can be improved independently and it doesn't hurt to have
> the RDMA driver ready before the GPU drivers.

Well, if there is no other use case then this series will have to wait
until someone does all the other work to make P2P work upstream.

Jason
Xiong, Jianxin April 17, 2020, 4:49 p.m. UTC | #6
> > > >
> > > > Right, the GPU driver needs to cooperate to get the thing to work
> > > > as expected. The "p2p" flag and related GPU driver changes
> > > > proposed in other threads would ensure VRAM is really used.
> > > > Alternatively, a GPU driver can have a working mode that assumes
> > > > p2p mapping capability of the client. Either way, the patches to
> > > > the RDMA driver would be mostly identical except for adding the use of the "p2p"
> > > > flag.
> > >
> > > I think the other thread has explained this would not be "mostly
> > > identical" but here is significant work to rip out the scatter list from the umem.
> > >
> >
> > Probably we are referring to different threads here. Could you kindly
> > refer me to the thread you mentioned? I was referring to the thread
> > about patching dma-buf and GPU driver:
> > https://www.spinics.net/lists/amd-gfx/msg47022.html
> 
> https://lore.kernel.org/linux-media/20200311152838.GA24280@infradead.org/
> 

Thanks. We are actually looking at the same series, but somehow I skipped the details of the single patch that looks simplest, which turns out to have the most complication. I agree that if the scatter list is not to be used, there is going to be significant work involved.

> > > So, I'm back to my original ask, can you justify adding this if it
> > > cannot do VRAM? What is the use case?
> >
> > Working with VRAM is the goal. This patch has been tested with a
> > modified GPU driver that has dma_buf_ops set up to not migrate the
> > buffer to system memory when attached. The GPU drivers and the RDMA
> > drivers can be improved independently and it doesn't hurt to have the
> > RDMA driver ready before the GPU drivers.
> 
> Well, if there is no other use case then this series will have to wait until someone does all the other work to make P2P work upstream.
> 
> Jason