[rdma-core,3/5] pyverbs: Add dma-buf based MR support

Message ID: 1606153984-104583-4-git-send-email-jianxin.xiong@intel.com (mailing list archive)
State: Superseded
From: Jianxin Xiong <jianxin.xiong@intel.com>
To: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org
Cc: Jianxin Xiong <jianxin.xiong@intel.com>, Doug Ledford <dledford@redhat.com>, Jason Gunthorpe <jgg@ziepe.ca>, Leon Romanovsky <leon@kernel.org>, Sumit Semwal <sumit.semwal@linaro.org>, Christian Koenig <christian.koenig@amd.com>, Daniel Vetter <daniel.vetter@intel.com>
Subject: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
Date: Mon, 23 Nov 2020 09:53:02 -0800
In-Reply-To: <1606153984-104583-1-git-send-email-jianxin.xiong@intel.com>
Series: Add user space dma-buf support
  [rdma-core,0/5] Add user space dma-buf support
  [rdma-core,1/5] verbs: Support dma-buf based memory region
  [rdma-core,2/5] mlx5: Support dma-buf based memory region
  [rdma-core,3/5] pyverbs: Add dma-buf based MR support
  [rdma-core,4/5] tests: Add tests for dma-buf based memory regions
  [rdma-core,5/5] tests: Bug fix for get_access_flags()

Xiong, Jianxin Nov. 23, 2020, 5:53 p.m. UTC

Define a new sub-class of 'MR' that uses dma-buf object for the memory
region. Define a new class 'DmaBuf' for dma-buf object allocation.

Signed-off-by: Jianxin Xiong <jianxin.xiong@intel.com>
---
 pyverbs/CMakeLists.txt |  2 ++
 pyverbs/dmabuf.pxd     | 13 +++++++++
 pyverbs/dmabuf.pyx     | 58 +++++++++++++++++++++++++++++++++++++
 pyverbs/libibverbs.pxd |  2 ++
 pyverbs/mr.pxd         |  5 ++++
 pyverbs/mr.pyx         | 77 ++++++++++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 155 insertions(+), 2 deletions(-)
 create mode 100644 pyverbs/dmabuf.pxd
 create mode 100644 pyverbs/dmabuf.pyx

Jason Gunthorpe Nov. 23, 2020, 6:05 p.m. UTC | #1

On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:

> +cdef class DmaBuf:
> +    def __init__(self, size, unit=0):
> +        """
> +        Allocate DmaBuf object from a GPU device. This is done through the
> +        DRI device interface (/dev/dri/card*). Usually this requires the
> +        effective user id being root or being a member of the 'video' group.
> +        :param size: The size (in number of bytes) of the buffer.
> +        :param unit: The unit number of the GPU to allocate the buffer from.
> +        :return: The newly created DmaBuf object on success.
> +        """
> +        self.dmabuf_mrs = weakref.WeakSet()
> +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> +
> +        args = bytearray(32)
> +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
> +
> +        args = bytearray(12)
> +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> +        a, b, self.fd = unpack('=iii', args)
> +
> +        args = bytearray(16)
> +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> +        a, b, self.map_offset = unpack('=iiq', args);

Wow, OK

Is it worth using ctypes here instead? Can you at least add a comment
before each pack specifying the 'struct XXX' this is following?

Does this work with normal Intel GPUs, like in a Laptop? AMD too?

Christian, I would be very happy to hear from you that this entire
work is good for AMD as well

Edward should look through this, but I'm glad to see something like
this

Thanks,
Jason

Xiong, Jianxin Nov. 23, 2020, 7:48 p.m. UTC | #2

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Monday, November 23, 2020 10:05 AM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Leon Romanovsky
> <leon@kernel.org>; Sumit Semwal <sumit.semwal@linaro.org>; Christian Koenig <christian.koenig@amd.com>; Vetter, Daniel
> <daniel.vetter@intel.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> 
> > +cdef class DmaBuf:
> > +    def __init__(self, size, unit=0):
> > +        """
> > +        Allocate DmaBuf object from a GPU device. This is done through the
> > +        DRI device interface (/dev/dri/card*). Usually this requires the
> > +        effective user id being root or being a member of the 'video' group.
> > +        :param size: The size (in number of bytes) of the buffer.
> > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > +        :return: The newly created DmaBuf object on success.
> > +        """
> > +        self.dmabuf_mrs = weakref.WeakSet()
> > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > +
> > +        args = bytearray(32)
> > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
> > +
> > +        args = bytearray(12)
> > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > +        a, b, self.fd = unpack('=iii', args)
> > +
> > +        args = bytearray(16)
> > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > +        a, b, self.map_offset = unpack('=iiq', args);
> 
> Wow, OK
> 
> Is it worth using ctypes here instead? Can you at least add a comment before each pack specifying the 'struct XXX' this is following?
> 

The ioctl call only accepts a bytearray, so I am not sure how to use ctypes here. I will add
comments with the actual layout of the parameter structure.
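
For reference, the three pack formats in the patch line up with 'struct drm_mode_create_dumb', 'struct drm_prime_handle' and 'struct drm_mode_map_dumb' from the kernel's DRM uapi headers. The following is a minimal sketch of how those layouts could be written with ctypes. The class names are made up for illustration and are not part of pyverbs. It assumes the patch's ioctl is Python's fcntl.ioctl, which accepts any object exposing a writable buffer; ctypes structures do expose one, so an instance should be usable as the ioctl argument, but that call is not exercised here.

```python
import ctypes
import struct


class DrmModeCreateDumb(ctypes.Structure):
    """Layout of 'struct drm_mode_create_dumb' (pack format '=iiiiiiq')."""
    _fields_ = [
        ("height", ctypes.c_uint32),
        ("width", ctypes.c_uint32),
        ("bpp", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("handle", ctypes.c_uint32),  # filled in by the kernel
        ("pitch", ctypes.c_uint32),   # filled in by the kernel
        ("size", ctypes.c_uint64),    # filled in by the kernel
    ]


class DrmPrimeHandle(ctypes.Structure):
    """Layout of 'struct drm_prime_handle' (pack format '=iii')."""
    _fields_ = [
        ("handle", ctypes.c_uint32),
        ("flags", ctypes.c_uint32),
        ("fd", ctypes.c_int32),       # filled in by the kernel
    ]


class DrmModeMapDumb(ctypes.Structure):
    """Layout of 'struct drm_mode_map_dumb' (pack format '=iiq')."""
    _fields_ = [
        ("handle", ctypes.c_uint32),
        ("pad", ctypes.c_uint32),
        ("offset", ctypes.c_uint64),  # filled in by the kernel
    ]


# Same request the patch builds: height=1, width=size, bpp=8.
req = DrmModeCreateDumb(height=1, width=4096, bpp=8)

# The structure sizes match the bytearray sizes used in the patch.
print(ctypes.sizeof(DrmModeCreateDumb),
      ctypes.sizeof(DrmPrimeHandle),
      ctypes.sizeof(DrmModeMapDumb))
```

Named fields would also remove the need for the throwaway a, b, c, d, e variables when reading the results back.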

> Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> 

Yes, the interface is generic and works with most GPUs. Works with AMD, too.

> Christian, I would be very happy to hear from you that this entire work is good for AMD as well
> 
> Edward should look through this, but I'm glad to see something like this
> 
> Thanks,
> Jason

Daniel Vetter Nov. 24, 2020, 3:16 p.m. UTC | #3

On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> 
> > +cdef class DmaBuf:
> > +    def __init__(self, size, unit=0):
> > +        """
> > +        Allocate DmaBuf object from a GPU device. This is done through the
> > +        DRI device interface (/dev/dri/card*). Usually this requires the

Please use /dev/dri/renderD* instead. That's the interface meant for
unprivileged rendering access. card* is the legacy interface with
backwards compat galore, don't use.

Specifically if you do this on a gpu which also has display (maybe some
testing on a local developer machine, no idea ...) then you mess with
compositors and stuff.

Also wherever you copied this from, please also educate those teams that
using /dev/dri/card* for rendering stuff is a Bad Idea (tm)
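
As a rough illustration of picking a render node rather than a card node, the sketch below lists the renderD* entries in a DRI directory. The function name find_render_nodes and its dri_dir parameter are hypothetical; the only fact relied on is that render nodes appear as /dev/dri/renderD<minor>. How a GPU "unit" number maps onto those entries is left open here.

```python
import glob
import os


def find_render_nodes(dri_dir="/dev/dri"):
    """Return the renderD* paths under dri_dir, sorted by minor number."""
    prefix = "renderD"
    nodes = []
    for path in glob.glob(os.path.join(dri_dir, prefix + "*")):
        suffix = os.path.basename(path)[len(prefix):]
        if suffix.isdigit():  # skip anything that is not renderD<number>
            nodes.append((int(suffix), path))
    return [path for _, path in sorted(nodes)]


if __name__ == "__main__":
    # Prints an empty list on machines without a render-capable GPU.
    print(find_render_nodes())
```

A test could then skip itself when the list is empty instead of failing on open().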

> > +        effective user id being root or being a member of the 'video' group.
> > +        :param size: The size (in number of bytes) of the buffer.
> > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > +        :return: The newly created DmaBuf object on success.
> > +        """
> > +        self.dmabuf_mrs = weakref.WeakSet()
> > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > +
> > +        args = bytearray(32)
> > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)

Yeah no, don't allocate render buffers with create_dumb. Every time this
comes up I'm wondering whether we should just completely disable dma-buf
operations on these. Dumb buffers are explicitly only for software
rendering for display purposes when the gpu userspace stack isn't fully
running yet, aka boot splash.

And yes I know there's endless amounts of abuse of that stuff floating
around, especially on arm-soc/android systems.

> > +
> > +        args = bytearray(12)
> > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > +        a, b, self.fd = unpack('=iii', args)
> > +
> > +        args = bytearray(16)
> > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > +        a, b, self.map_offset = unpack('=iiq', args);
> 
> Wow, OK
> 
> Is it worth using ctypes here instead? Can you at least add a comment
> before each pack specifying the 'struct XXX' this is following?
> 
> Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> 
> Christian, I would be very happy to hear from you that this entire
> work is good for AMD as well

I think the smallest generic interface for allocating gpu buffers which
are more useful than the stuff you get from CREATE_DUMB is gbm. That's
used by compositors to get bare metal opengl going on linux. Ofc Android
has gralloc for the same purpose, and cros has minigbm (which isn't the
same as gbm at all). So not so cool.

The other generic option is using vulkan, which works directly on bare
metal (without a compositor or anything running), and is cross vendor. So
cool, except not used for compute, which is generally the thing you want
if you have an rdma card.

Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back,
properly.

Compute is the worst, because opencl is widely considered a mistake (maybe
opencl 3 is better, but nvidia is stuck on 1.2). The actually used stuff is
cuda (nvidia-only), rocm (amd-only) and now with intel also playing we
have xe (intel-only).

It's pretty glorious :-/

Also I think we discussed this already, but for actual p2p the intel
patches aren't in upstream yet. We have some internally, but with very
broken locking (in the process of getting fixed up, but it's taking time).

Cheers, Daniel

> Edward should look through this, but I'm glad to see something like
> this
> 
> Thanks,
> Jason
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel

Jason Gunthorpe Nov. 24, 2020, 3:36 p.m. UTC | #4

On Tue, Nov 24, 2020 at 04:16:58PM +0100, Daniel Vetter wrote:

> Compute is the worst, because opencl is widely considered a mistake (maybe
> opencl 3 is better, but nvidia is stuck on 1.2). The actually used stuff is
> cuda (nvidia-only), rocm (amd-only) and now with intel also playing we
> have xe (intel-only).

> It's pretty glorious :-/

I enjoyed how the Intel version of CUDA is called "OneAPI" not "Third
API" ;)

Hopefully xe compute won't leave a lot of half finished abandoned
kernel code like Xeon Phi did :(

> Also I think we discussed this already, but for actual p2p the intel
> patches aren't in upstream yet. We have some internally, but with very
> broken locking (in the process of getting fixed up, but it's taking time).

Someone needs to say this test works on a real system with an
unpatched upstream driver.

I thought AMD had the needed parts merged?

Jason

Xiong, Jianxin Nov. 24, 2020, 4:21 p.m. UTC | #5

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Tuesday, November 24, 2020 7:36 AM
> To: Daniel Vetter <daniel@ffwll.ch>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Tue, Nov 24, 2020 at 04:16:58PM +0100, Daniel Vetter wrote:
> 
> > Compute is the worst, because opencl is widely considered a mistake
> > (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually
> > used stuff is cuda (nvidia-only), rocm (amd-only) and now with intel
> > also playing we have xe (intel-only).
> 
> > It's pretty glorious :-/
> 
> I enjoyed how the Intel version of CUDA is called "OneAPI" not "Third API" ;)
> 
> Hopefully xe compute won't leave a lot of half finished abandoned kernel code like Xeon Phi did :(
> 
> > Also I think we discussed this already, but for actual p2p the intel
> > patches aren't in upstream yet. We have some internally, but with very
> > broken locking (in the process of getting fixed up, but it's taking time).
> 
> Someone needs to say this test works on a real system with an unpatched upstream driver.
> 
> I thought AMD had the needed parts merged?

Yes, I have tested these with AMD GPU.

> 
> Jason

Xiong, Jianxin Nov. 24, 2020, 6:45 p.m. UTC | #6

> -----Original Message-----
> From: Daniel Vetter <daniel@ffwll.ch>
> Sent: Tuesday, November 24, 2020 7:17 AM
> To: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> > On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> >
> > > +cdef class DmaBuf:
> > > +    def __init__(self, size, unit=0):
> > > +        """
> > > +        Allocate DmaBuf object from a GPU device. This is done through the
> > > +        DRI device interface (/dev/dri/card*). Usually this requires the
> 
> Please use /dev/dri/renderD* instead. That's the interface meant for unprivileged rendering access. card* is the legacy interface with
> backwards compat galore, don't use.
> 
> Specifically if you do this on a gpu which also has display (maybe some testing on a local developer machine, no idea ...) then you mess with
> compositors and stuff.
> 
> Also wherever you copied this from, please also educate those teams that using /dev/dri/card* for rendering stuff is a Bad Idea (tm)

/dev/dri/renderD* is not always available (e.g. for many iGPUs) and doesn't support
mode setting commands (including dumb_buf). The original intention here is to
have something to support the new tests added, not for general compute. 

> 
> > > +        effective user id being root or being a member of the 'video' group.
> > > +        :param size: The size (in number of bytes) of the buffer.
> > > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > > +        :return: The newly created DmaBuf object on success.
> > > +        """
> > > +        self.dmabuf_mrs = weakref.WeakSet()
> > > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > > +
> > > +        args = bytearray(32)
> > > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
> 
> Yeah no, don't allocate render buffers with create_dumb. Every time this comes up I'm wondering whether we should just completely
> disable dma-buf operations on these. Dumb buffers are explicitly only for software rendering for display purposes when the gpu userspace
> stack isn't fully running yet, aka boot splash.
> 
> And yes I know there's endless amounts of abuse of that stuff floating around, especially on arm-soc/android systems.

One alternative is to use the GEM_CREATE method which can be done via the renderD*
device, but the command is vendor specific, so the logic is a little bit more complex. 

> 
> > > +
> > > +        args = bytearray(12)
> > > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > > +        a, b, self.fd = unpack('=iii', args)
> > > +
> > > +        args = bytearray(16)
> > > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > > +        a, b, self.map_offset = unpack('=iiq', args);
> >
> > Wow, OK
> >
> > Is it worth using ctypes here instead? Can you at least add a comment
> > before each pack specifying the 'struct XXX' this is following?
> >
> > Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> >
> > Christian, I would be very happy to hear from you that this entire
> > work is good for AMD as well
> 
> I think the smallest generic interface for allocating gpu buffers which are more useful than the stuff you get from CREATE_DUMB is gbm.
> That's used by compositors to get bare metal opengl going on linux. Ofc Android has gralloc for the same purpose, and cros has minigbm
> (which isn't the same as gbm at all). So not so cool.

Again, would the "renderD* + GEM_CREATE" combination be an acceptable alternative?
That would be much simpler than going with gbm and would need fewer dependencies
when setting up the testing environment.

> 
> The other generic option is using vulkan, which works directly on bare metal (without a compositor or anything running), and is cross vendor.
> So cool, except not used for compute, which is generally the thing you want if you have an rdma card.
> 
> Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back, properly.
> 
> Compute is the worst, because opencl is widely considered a mistake (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually used
> stuff is cuda (nvidia-only), rocm (amd-only) and now with intel also playing we have xe (intel-only).
> 
> It's pretty glorious :-/
> 
> Also I think we discussed this already, but for actual p2p the intel patches aren't in upstream yet. We have some internally, but with very
> broken locking (in the process of getting fixed up, but it's taking time).
> 
> Cheers, Daniel
> 
> > Edward should look through this, but I'm glad to see something like
> > this
> >
> > Thanks,
> > Jason
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

Daniel Vetter Nov. 25, 2020, 10:50 a.m. UTC | #7

On Tue, Nov 24, 2020 at 06:45:06PM +0000, Xiong, Jianxin wrote:
> > -----Original Message-----
> > From: Daniel Vetter <daniel@ffwll.ch>
> > Sent: Tuesday, November 24, 2020 7:17 AM
> > To: Jason Gunthorpe <jgg@ziepe.ca>
> > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > <christian.koenig@amd.com>
> > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> > 
> > On Mon, Nov 23, 2020 at 02:05:04PM -0400, Jason Gunthorpe wrote:
> > > On Mon, Nov 23, 2020 at 09:53:02AM -0800, Jianxin Xiong wrote:
> > >
> > > > +cdef class DmaBuf:
> > > > +    def __init__(self, size, unit=0):
> > > > +        """
> > > > +        Allocate DmaBuf object from a GPU device. This is done through the
> > > > +        DRI device interface (/dev/dri/card*). Usually this requires the
> > 
> > Please use /dev/dri/renderD* instead. That's the interface meant for unprivileged rendering access. card* is the legacy interface with
> > backwards compat galore, don't use.
> > 
> > Specifically if you do this on a gpu which also has display (maybe some testing on a local developer machine, no idea ...) then you mess with
> > compositors and stuff.
> > 
> > Also wherever you copied this from, please also educate those teams that using /dev/dri/card* for rendering stuff is a Bad Idea (tm)
> 
> /dev/dri/renderD* is not always available (e.g. for many iGPUs) and doesn't support
> mode setting commands (including dumb_buf). The original intention here is to
> have something to support the new tests added, not for general compute. 

Not having dumb_buf available is a feature. So even more reasons to use
that.

Also note that amdgpu has killed card* access pretty much, it's for
modesetting only.

> > > > +        effective user id being root or being a member of the 'video' group.
> > > > +        :param size: The size (in number of bytes) of the buffer.
> > > > +        :param unit: The unit number of the GPU to allocate the buffer from.
> > > > +        :return: The newly created DmaBuf object on success.
> > > > +        """
> > > > +        self.dmabuf_mrs = weakref.WeakSet()
> > > > +        self.dri_fd = open('/dev/dri/card'+str(unit), O_RDWR)
> > > > +
> > > > +        args = bytearray(32)
> > > > +        pack_into('=iiiiiiq', args, 0, 1, size, 8, 0, 0, 0, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_CREATE_DUMB, args)
> > > > +        a, b, c, d, self.handle, e, self.size = unpack('=iiiiiiq', args)
> > 
> > Yeah no, don't allocate render buffers with create_dumb. Every time this comes up I'm wondering whether we should just completely
> > disable dma-buf operations on these. Dumb buffers are explicitly only for software rendering for display purposes when the gpu userspace
> > stack isn't fully running yet, aka boot splash.
> > 
> > And yes I know there's endless amounts of abuse of that stuff floating around, especially on arm-soc/android systems.
> 
> One alternative is to use the GEM_CREATE method which can be done via the renderD*
> device, but the command is vendor specific, so the logic is a little bit more complex. 

Yup. I guess the most minimal thing is to have a per-vendor (you can ask
drm for the driver name to match the right one) callback here to allocate
buffers correctly. Might be less churn than trying to pull in vulkan or
something like that.

It's at least what we're doing in igt for testing drm drivers (although
most of the generic igt tests are for display, so a dumb_buffer fallback is
available).

DRM_IOCTL_VERSION is the thing you'd need here, struct drm_version.name
has the field for figuring out which driver it is.

Also drivers without render node support won't ever be in the same system
as an rdma card and actually useful (because well they're either very old,
or display-only). So not an issue I think.
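
A sketch of what the DRM_IOCTL_VERSION query could look like from Python follows. It assumes the asm-generic ioctl number encoding (as used on x86 and arm64; some architectures encode the direction bits differently) and that fcntl.ioctl accepts a ctypes structure as a writable buffer. The helper names drm_iowr and drm_driver_name are made up for illustration, and the ioctl call itself has not been run against real hardware here.

```python
import ctypes
import fcntl

# asm-generic ioctl encoding: nr at bit 0, type at bit 8, size at bit 16,
# direction at bit 30. This is an assumption that does not hold everywhere.
_IOC_WRITE, _IOC_READ = 1, 2


def drm_iowr(nr, size):
    """Rough equivalent of the kernel's DRM_IOWR(nr, type); DRM uses 'd'."""
    return ((_IOC_READ | _IOC_WRITE) << 30) | (size << 16) | (ord('d') << 8) | nr


class DrmVersion(ctypes.Structure):
    """Layout of 'struct drm_version' from the kernel's drm.h."""
    _fields_ = [
        ("version_major", ctypes.c_int),
        ("version_minor", ctypes.c_int),
        ("version_patchlevel", ctypes.c_int),
        ("name_len", ctypes.c_size_t),
        ("name", ctypes.c_void_p),
        ("date_len", ctypes.c_size_t),
        ("date", ctypes.c_void_p),
        ("desc_len", ctypes.c_size_t),
        ("desc", ctypes.c_void_p),
    ]


DRM_IOCTL_VERSION = drm_iowr(0x00, ctypes.sizeof(DrmVersion))


def drm_driver_name(fd):
    """Return the DRM driver name (e.g. 'i915' or 'amdgpu') for an open fd."""
    name_buf = ctypes.create_string_buffer(64)
    ver = DrmVersion()
    ver.name_len = len(name_buf)
    ver.name = ctypes.addressof(name_buf)
    fcntl.ioctl(fd, DRM_IOCTL_VERSION, ver)
    # The kernel reports the full name length; clamp to what was copied.
    return name_buf.raw[:min(ver.name_len, len(name_buf))].decode()
```

The fd would come from opening a renderD* node with os.open(path, os.O_RDWR).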

> > > > +
> > > > +        args = bytearray(12)
> > > > +        pack_into('=iii', args, 0, self.handle, O_RDWR, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_PRIME_HANDLE_TO_FD, args)
> > > > +        a, b, self.fd = unpack('=iii', args)
> > > > +
> > > > +        args = bytearray(16)
> > > > +        pack_into('=iiq', args, 0, self.handle, 0, 0)
> > > > +        ioctl(self.dri_fd, DRM_IOCTL_MODE_MAP_DUMB, args);
> > > > +        a, b, self.map_offset = unpack('=iiq', args);
> > >
> > > Wow, OK
> > >
> > > Is it worth using ctypes here instead? Can you at least add a comment
> > > before each pack specifying the 'struct XXX' this is following?
> > >
> > > Does this work with normal Intel GPUs, like in a Laptop? AMD too?
> > >
> > > Christian, I would be very happy to hear from you that this entire
> > > work is good for AMD as well
> > 
> > I think the smallest generic interface for allocating gpu buffers which are more useful than the stuff you get from CREATE_DUMB is gbm.
> > That's used by compositors to get bare metal opengl going on linux. Ofc Android has gralloc for the same purpose, and cros has minigbm
> > (which isn't the same as gbm at all). So not so cool.
> 
> Again, would the "renderD* + GEM_CREATE" combination be an acceptable alternative? 
> That would be much simpler than going with gbm and less dependency in setting up
> the testing environment.

Yeah imo makes sense. It's a bunch more code for you to make it work on
i915 and amd, but it's not terrible. And avoids the dependencies, and also
avoids the abuse of card* and dumb buffers. Plus not really more complex,
you just need a table or something to match from the drm driver name to
the driver-specific buffer create function. Everything else stays the
same.

Also this opens up the door to force-test stuff like p2p in the future,
since at least on i915 you'll be able to ensure that a buffer is in vram
only.

Would be good if we also have a trick for amdgpu to make sure the buffer
stays in vram. I think there's some flags you can pass to the amdgpu
buffer create function. So maybe you want 2 testcases here, one allocates
the buffer in system memory, the other in vram for testing p2p
functionality. That kind of stuff isn't possible with dumb buffers.
-Daniel
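
The "table or something" suggested above could be as small as a dictionary keyed by the DRM driver name. This is only a sketch: the allocator functions are hypothetical placeholders that raise NotImplementedError, since the real bodies would issue the vendor-specific GEM create ioctls, and the driver name is assumed to come from a DRM_IOCTL_VERSION query.

```python
def alloc_i915(fd, size):
    """Hypothetical placeholder for an i915 GEM buffer allocation."""
    raise NotImplementedError("i915 GEM create not implemented in this sketch")


def alloc_amdgpu(fd, size):
    """Hypothetical placeholder for an amdgpu GEM buffer allocation."""
    raise NotImplementedError("amdgpu GEM create not implemented in this sketch")


# Maps the DRM driver name to its driver-specific buffer create function.
ALLOCATORS = {
    "i915": alloc_i915,
    "amdgpu": alloc_amdgpu,
}


def get_allocator(driver_name):
    """Look up the buffer create function for a DRM driver name."""
    try:
        return ALLOCATORS[driver_name]
    except KeyError:
        raise RuntimeError(
            "no dma-buf allocator for DRM driver '%s'" % driver_name) from None
```

Everything after the allocation (exporting the handle with PRIME_HANDLE_TO_FD and registering the MR) would stay common to all drivers, and a per-driver placement argument could later select system memory or vram.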




> > 
> > The other generic option is using vulkan, which works directly on bare metal (without a compositor or anything running), and is cross vendor.
> > So cool, except not used for compute, which is generally the thing you want if you have an rdma card.
> > 
> > Both gbm-egl/opengl and vulkan have extensions to hand you a dma-buf back, properly.
> > 
> > Compute is the worst, because opencl is widely considered a mistake (maybe opencl 3 is better, but nvidia is stuck on 1.2). The actually used
> > stuff is cuda (nvidia-only), rocm (amd-only) and now with intel also playing we have xe (intel-only).
> > 
> > It's pretty glorious :-/
> > 
> > Also I think we discussed this already, but for actual p2p the intel patches aren't in upstream yet. We have some internally, but with very
> > broken locking (in the process of getting fixed up, but it's taking time).
> > 
> > Cheers, Daniel
> > 
> > > Edward should look through this, but I'm glad to see something like
> > > this
> > >
> > > Thanks,
> > > Jason
> > 
> > --
> > Daniel Vetter
> > Software Engineer, Intel Corporation
> > http://blog.ffwll.ch

Jason Gunthorpe Nov. 25, 2020, 12:14 p.m. UTC | #8

On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:

> Yeah imo makes sense. It's a bunch more code for you to make it work on
> i915 and amd, but it's not terrible. And avoids the dependencies, and also
> avoids the abuse of card* and dumb buffers. Plus not really more complex,
> you just need a table or something to match from the drm driver name to
> the driver-specific buffer create function. Everything else stays the
> same.

If it is going to get more complicated please write it in C then. We
haven't done it yet, but you can link a C function through cython to
the python test script

If you struggle here I can probably work out the build system bits,
but it should not be too terrible

Jason

Xiong, Jianxin Nov. 25, 2020, 7:27 p.m. UTC | #9

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Wednesday, November 25, 2020 4:15 AM
> To: Daniel Vetter <daniel@ffwll.ch>
> Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> 
> > Yeah imo makes sense. It's a bunch more code for you to make it work
> > on
> > i915 and amd, but it's not terrible. And avoids the dependencies, and
> > also avoids the abuse of card* and dumb buffers. Plus not really more
> > complex, you just need a table or something to match from the drm
> > driver name to the driver-specific buffer create function. Everything
> > else stays the same.
> 
> If it is going to get more complicated please write it in C then. We haven't done it yet, but you can link a C function through cython to the
> python test script
> 
> If you struggle here I can probably work out the build system bits, but it should not be too terrible

Thanks Daniel and Jason. I have started working in this direction. There should be no
technical obstacle here.

Jason Gunthorpe Nov. 26, 2020, midnight UTC | #10

On Wed, Nov 25, 2020 at 07:27:07PM +0000, Xiong, Jianxin wrote:
> > From: Jason Gunthorpe <jgg@ziepe.ca>
> > Sent: Wednesday, November 25, 2020 4:15 AM
> > To: Daniel Vetter <daniel@ffwll.ch>
> > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > <christian.koenig@amd.com>
> > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> > 
> > On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> > 
> > > Yeah imo makes sense. It's a bunch more code for you to make it work
> > > on
> > > i915 and amd, but it's not terrible. And avoids the dependencies, and
> > > also avoids the abuse of card* and dumb buffers. Plus not really more
> > > complex, you just need a table or something to match from the drm
> > > driver name to the driver-specific buffer create function. Everything
> > > else stays the same.
> > 
> > If it is going to get more complicated please write it in C then. We haven't done it yet, but you can link a C function through cython to the
> > python test script
> > 
> > If you struggle here I can probably work out the build system bits, but it should not be too terrible
> 
> Thanks Daniel and Jason. I have started working in this direction. There should be no
> technical obstacle here. 

Just to be clear I mean write some 'get dma buf fd' function in C, not
the whole test

Jason

Xiong, Jianxin Nov. 26, 2020, 12:43 a.m. UTC | #11

> -----Original Message-----
> From: Jason Gunthorpe <jgg@ziepe.ca>
> Sent: Wednesday, November 25, 2020 4:00 PM
> To: Xiong, Jianxin <jianxin.xiong@intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>; Leon Romanovsky <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-devel@lists.freedesktop.org;
> Doug Ledford <dledford@redhat.com>; Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig <christian.koenig@amd.com>
> Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR support
> 
> On Wed, Nov 25, 2020 at 07:27:07PM +0000, Xiong, Jianxin wrote:
> > > From: Jason Gunthorpe <jgg@ziepe.ca>
> > > Sent: Wednesday, November 25, 2020 4:15 AM
> > > To: Daniel Vetter <daniel@ffwll.ch>
> > > Cc: Xiong, Jianxin <jianxin.xiong@intel.com>; Leon Romanovsky
> > > <leon@kernel.org>; linux-rdma@vger.kernel.org; dri-
> > > devel@lists.freedesktop.org; Doug Ledford <dledford@redhat.com>;
> > > Vetter, Daniel <daniel.vetter@intel.com>; Christian Koenig
> > > <christian.koenig@amd.com>
> > > Subject: Re: [PATCH rdma-core 3/5] pyverbs: Add dma-buf based MR
> > > support
> > >
> > > On Wed, Nov 25, 2020 at 11:50:41AM +0100, Daniel Vetter wrote:
> > >
> > > > Yeah imo makes sense. It's a bunch more code for you to make it
> > > > work on
> > > > i915 and amd, but it's not terrible. And avoids the dependencies,
> > > > and also avoids the abuse of card* and dumb buffers. Plus not
> > > > really more complex, you just need a table or something to match
> > > > from the drm driver name to the driver-specific buffer create
> > > > function. Everything else stays the same.
> > >
> > > If it is going to get more complicated please write it in C then. We
> > > haven't done it yet, but you can link a C function through cython to
> > > the python test script
> > >
> > > If you struggle here I can probably work out the build system bits,
> > > but it should not be too terrible
> >
> > Thanks Daniel and Jason. I have started working in this direction.
> > There should be no technical obstacle here.
> 
> Just to be clear I mean write some 'get dma buf fd' function in C, not the whole test
> 
Yes, that's my understanding.
