Message ID | 20210302080358.3095748-1-vivek.kasireddy@intel.com
---|---
Series | Use dmabufs for display updates instead of pixman
Patchew URL: https://patchew.org/QEMU/20210302080358.3095748-1-vivek.kasireddy@intel.com/

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20210302080358.3095748-1-vivek.kasireddy@intel.com
Subject: [RFC 0/1] Use dmabufs for display updates instead of pixman

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20210302080358.3095748-1-vivek.kasireddy@intel.com -> patchew/20210302080358.3095748-1-vivek.kasireddy@intel.com
Switched to a new branch 'test'
5509447 virtio-gpu: Use dmabuf for display updates if possible instead of pixman

=== OUTPUT BEGIN ===
ERROR: space prohibited between function name and open parenthesis '('
#65: FILE: hw/display/virtio-gpu.c:541:
+                         res->iov_cnt * sizeof (struct udmabuf_create_item));

ERROR: braces {} are necessary for all arms of this statement
#66: FILE: hw/display/virtio-gpu.c:542:
+    if (!create)
[...]

ERROR: space required after that ',' (ctx:VxV)
#92: FILE: hw/display/virtio-gpu.c:568:
+    modifier_lo = fourcc_mod_code(INTEL,I915_FORMAT_MOD_X_TILED) & 0xFFFFFFFF;
                                        ^

ERROR: braces {} are necessary for all arms of this statement
#182: FILE: hw/display/virtio-gpu.c:699:
+    if (!ret)
[...]

total: 4 errors, 0 warnings, 196 lines checked

Commit 550944737e2a (virtio-gpu: Use dmabuf for display updates if possible instead of pixman) has style problems, please review.  If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

The full log is available at
http://patchew.org/logs/20210302080358.3095748-1-vivek.kasireddy@intel.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
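For reference, the corrected forms of the flagged lines would look roughly as follows. Everything checkpatch elided ("[...]") is left as a placeholder comment, so this is a fragment illustrating the style rules rather than a compilable unit:

    /* #65: no space between function name and '(' */
        res->iov_cnt * sizeof(struct udmabuf_create_item));

    /* #66 and #182: braces required on all arms of the if statement */
        if (!create) {
            /* original single-statement body, elided in the report */
        }
        if (!ret) {
            /* original single-statement body, elided in the report */
        }

    /* #92: space required after ',' */
        modifier_lo = fourcc_mod_code(INTEL, I915_FORMAT_MOD_X_TILED) & 0xFFFFFFFF;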
On Tue, Mar 02, 2021 at 12:03:57AM -0800, Vivek Kasireddy wrote:
> This is still a WIP/RFC patch that attempts to use dmabufs for display
> updates with the help of Udmabuf driver instead of pixman. This patch
> is posted to the ML to elicit feedback and start a discussion whether
> something like this would be useful or not for mainly non-Virgl
> rendered BOs and also potentially in other cases.

Yes, it surely makes sense to go into that direction.  The patch as-is
doesn't, though: it breaks the guest/host interface.  That's ok-ish for a
quick proof-of-concept, but clearly not merge-able.

> TODO:
> - Use Blob resources for getting meta-data such as modifier, format, etc.

That is pretty much mandatory.  Without blob resources there is no concept
of resources shared between host and guest in virtio-gpu; all data is
explicitly copied with transfer commands.

Which implies quite a bit of work, because we don't have blob resource
support in qemu yet.

> - Test with Virgil rendered BOs to see if this can be used in that case..

That also opens up the question of how to go forward with virtio-gpu in
general.  The object hierarchy we have right now (skipping pci + vga
variants for simplicity):

  TYPE_VIRTIO_GPU_BASE (abstract base)
   -> TYPE_VIRTIO_GPU (in-qemu implementation)
   -> TYPE_VHOST_USER_GPU (vhost-user implementation)

When compiled with opengl + virgl, TYPE_VIRTIO_GPU has a virgl=on/off
property.  Having a single device is not ideal for modular builds, because
the hw-display-virtio-gpu.so module has a dependency on ui-opengl.so, so
that is needed (due to symbol references) even for the virgl=off case.
Also the code is a bit of a #ifdef mess.

I think we should split TYPE_VIRTIO_GPU into two devices: remove
virgl+opengl support from TYPE_VIRTIO_GPU, then add a new
TYPE_VIRTIO_GPU_VIRGL, with either TYPE_VIRTIO_GPU or TYPE_VIRTIO_GPU_BASE
as parent (not sure which is easier), and have all opengl/virgl support
code there.

I think when using opengl it makes sense to also require virgl, so we can
use the virglrenderer library to manage blob resources (even when the
actual rendering isn't done with virgl).  That also reduces the complexity
and the test matrix.

Maybe it even makes sense to deprecate in-qemu virgl support and focus
exclusively on the vhost-user implementation, so we don't have to
duplicate all work for both implementations.

> Considerations/Challenges:
> - One of the main concerns with using dmabufs is how to synchronize access
> to them and this use-case is no different. If the Guest is running Weston,
> then it could use a maximum of 4 color buffers but uses only 2 by default and
> flips between them if it is not sharing the FBs with other plugins while
> running with the drm backend. In this case, how do we make sure that Weston
> and Qemu UI are not using the same buffer at any given time?

There is graphic_hw_gl_block + graphic_hw_gl_flushed for synchronization.
Right now this is only wired up in spice, and it is rather simple (it just
stalls virgl rendering instead of providing per-buffer synchronization).

> - If we have Xorg running in the Guest, then it gets even more interesting as
> Xorg in some cases does frontbuffer rendering (uses DRM_IOCTL_MODE_DIRTYFB).

Well, if the guest does frontbuffer rendering we can't do much about it
and have to live with rendering glitches, I guess.

take care,
  Gerd
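In QEMU's QOM terms, the proposed split could look roughly like the sketch below. The type string, class_init contents and instance size are assumptions (nothing has been posted yet); only TYPE_VIRTIO_GPU and TYPE_VIRTIO_GPU_BASE come from the discussion above:

    /* Minimal sketch of a separate virgl/opengl device type. */
    #include "qemu/osdep.h"
    #include "qom/object.h"
    #include "hw/virtio/virtio-gpu.h"

    #define TYPE_VIRTIO_GPU_VIRGL "virtio-gpu-virgl-device"   /* name is hypothetical */

    static void virtio_gpu_virgl_class_init(ObjectClass *klass, void *data)
    {
        /* hook up the virgl/opengl specific realize/reset/process_cmd here */
    }

    static const TypeInfo virtio_gpu_virgl_info = {
        .name          = TYPE_VIRTIO_GPU_VIRGL,
        .parent        = TYPE_VIRTIO_GPU,          /* or TYPE_VIRTIO_GPU_BASE */
        .instance_size = sizeof(VirtIOGPU),
        .class_init    = virtio_gpu_virgl_class_init,
    };

    static void virtio_gpu_virgl_register_types(void)
    {
        type_register_static(&virtio_gpu_virgl_info);
    }

    type_init(virtio_gpu_virgl_register_types)

With a split like this, the plain TYPE_VIRTIO_GPU module would no longer pull in ui-opengl.so, and the #ifdef CONFIG_VIRGL blocks would collapse into the new type.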
Hi Gerd,

> Yes, it surely makes sense to go into that direction.
> The patch as-is doesn't, it breaks the guest/host interface.
> That's ok-ish for a quick proof-of-concept, but clearly not merge-able.
>
> > TODO:
> > - Use Blob resources for getting meta-data such as modifier, format, etc.
>
> That is pretty much mandatory. Without blob resources there is no concept of resources
> shared between host and guest in virtio-gpu, all data is explicitly copied with transfer
> commands.
[Kasireddy, Vivek] My understanding of virtio-gpu and the concept of resources is still
fairly limited, but are blob resources really needed for non-Virgl use-cases -- other than
something like a dmabuf/scanout blob that shares meta-data such as the modifier? I
thought the main motivation for blob resources was to avoid the explicit copy you
mentioned for Virgl workloads.

> Which implies quite a bit of work because we don't have blob resource support in qemu
> yet.
[Kasireddy, Vivek] I was scrubbing through old mailing list messages to understand the
motivation behind blob resources and why they are needed, and came across this:
https://gitlab.freedesktop.org/virgl/qemu/-/commits/virtio-gpu-next

Does your work above not count for anything?

> > - Test with Virgil rendered BOs to see if this can be used in that case..
>
> That also opens up the question how to go forward with virtio-gpu in general. The object
> hierarchy we have right now (skipping pci + vga variants for simplicity):
>
>   TYPE_VIRTIO_GPU_BASE (abstract base)
>    -> TYPE_VIRTIO_GPU (in-qemu implementation)
>    -> TYPE_VHOST_USER_GPU (vhost-user implementation)
>
> When compiled with opengl + virgl TYPE_VIRTIO_GPU has a virgl=on/off property.
> Having a single device is not ideal for modular builds, because the
> hw-display-virtio-gpu.so module has a dependency on ui-opengl.so so that is
> needed (due to symbol references) even for the virgl=off case. Also the code is a bit of a
> #ifdef mess.
>
> I think we should split TYPE_VIRTIO_GPU into two devices. Remove
> virgl+opengl support from TYPE_VIRTIO_GPU. Add a new
> TYPE_VIRTIO_GPU_VIRGL, with either TYPE_VIRTIO_GPU or
> TYPE_VIRTIO_GPU_BASE as parent (not sure which is easier), have all opengl/virgl
> support code there.
>
> I think when using opengl it makes sense to also require virgl, so we can use the
> virglrenderer library to manage blob resources (even when the actual rendering isn't done
> with virgl). Also reduces the complexity and test matrix.
[Kasireddy, Vivek] When you say "using opengl", are you referring to the presentation of
the rendered buffer via dmabuf or pixman? If yes, I am not sure why this would need to
depend on Virgl. For our use-case(s) where we are using virtio-gpu in buffer sharing mode,
we'd still need opengl for submitting the dmabuf to the UI, IIUC.

> Maybe it even makes sense to deprecate in-qemu virgl support and focus exclusively on
> the vhost-user implementation, so we don't have to duplicate all work for both
> implementations.
[Kasireddy, Vivek] Is the vhost-user implementation better in terms of performance, generally?

> > case, how do we make sure that Weston and Qemu UI are not using the same buffer at
> > any given time?
>
> There is graphic_hw_gl_block + graphic_hw_gl_flushed for synchronization.
> Right now this is only wired up in spice, and it is rather simple (just stalls virgl rendering
> instead of providing per-buffer synchronization).
[Kasireddy, Vivek] I guess that might work for Virgl rendering but not for our use-case.
What we need is a way to tell whether the previously submitted dmabuf has been consumed
by the Host compositor before we release/close it. Weston (wl_buffer.release event and
fences) and EGL (sync objects and fences) do provide a few options, but I am not sure if
GTK lets us use any of those. Any recommendations? EGLSync objects?

On a different note, any particular reason why the Qemu UI EGL implementation is limited
to Xorg and not extended to Wayland/Weston, for which there is GTK glarea?

Thanks,
Vivek

> take care,
>   Gerd
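For context, the wl_buffer.release mechanism mentioned above looks roughly like this on the Wayland client side. This is a hedged sketch, not QEMU or GTK code; the buffer_state struct and the busy flag are purely illustrative:

    #include <stdbool.h>
    #include <wayland-client.h>

    struct buffer_state {
        bool busy;   /* true while the compositor may still read the buffer */
    };

    /* Called by the compositor when it no longer needs the buffer contents. */
    static void handle_buffer_release(void *data, struct wl_buffer *buffer)
    {
        struct buffer_state *state = data;
        state->busy = false;     /* safe to reuse or close the dmabuf now */
    }

    static const struct wl_buffer_listener buffer_listener = {
        .release = handle_buffer_release,
    };

    /* After creating the wl_buffer (e.g. from a dmabuf): */
    static void track_buffer(struct wl_buffer *buffer, struct buffer_state *state)
    {
        state->busy = true;
        wl_buffer_add_listener(buffer, &buffer_listener, state);
    }

The question in the thread is precisely that QEMU's GTK UI sits above this layer and does not expose the release event, which is why the discussion moves towards EGL sync objects instead.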
Hi,

> > That is pretty much mandatory. Without blob resources there is no concept of resources
> > shared between host and guest in virtio-gpu, all data is explicitly copied with transfer
> > commands.
> [Kasireddy, Vivek] My understanding of virtio-gpu and the concept of resources is still
> fairly limited but are blob resources really needed for non-Virgl use-cases -- other than
> something like a dmabuf/scanout blob that shares meta-data such as the modifier? I
> thought the main motivation for blob resources would be to avoid the explicit copy you
> mentioned for Virgl workloads.

Well, you want to avoid the copy as well, right?  With blob resources you
can do that in a well-defined way, i.e. the guest knows what you are doing
and behaves accordingly.  Without blob resources you can't, at least not
without violating the guest's expectation that any changes it makes only
become visible to the host after an explicit transfer (aka copy) command.

> > Which implies quite a bit of work because we don't have blob resource support in qemu
> > yet.
> [Kasireddy, Vivek] I was scrubbing through old mailing list messages to understand the
> motivation behind blob resources and why they are needed, and came across this:
> https://gitlab.freedesktop.org/virgl/qemu/-/commits/virtio-gpu-next
>
> Does your work above not count for anything?

It is quite old, and I think not up-to-date with the final revision of the
blob resource specification.  I wouldn't be able to update this in the near
future due to being busy with other projects.  Feel free to grab & update &
submit these patches though.

> > I think when using opengl it makes sense to also require virgl, so we can use the
> > virglrenderer library to manage blob resources (even when the actual rendering isn't done
> > with virgl). Also reduces the complexity and test matrix.
> [Kasireddy, Vivek] When you say "using opengl" are you referring to the presentation of
> the rendered buffer via dmabuf or pixman? If yes, I am not sure why this would need to
> depend on Virgl.

Well, you can probably do it without virgl as well.  But why?  Instead of
just using the virglrenderer library, effectively duplicate the blob
resource management bits in qemu?

Besides the code duplication this is also a maintenance issue.  This adds
one more configuration to virtio-gpu.  Right now you can build virtio-gpu
with virgl (depends on opengl), or you can build without virgl (doesn't
use opengl then).  I don't think it is a good idea to add a third mode,
without virgl support but using opengl for blob dma-bufs.

> For our use-case(s) where we are using virtio-gpu in buffer sharing mode,
> we'd still need opengl for submitting the dmabuf to UI, IIUC.

Correct.  When you want to use dma-bufs you need opengl.

> > Maybe it even makes sense to deprecate in-qemu virgl support and focus exclusively on
> > the vhost-user implementation, so we don't have to duplicate all work for both
> > implementations.
> [Kasireddy, Vivek] Is the vhost-user implementation better in terms of performance, generally?

It is better both in terms of security (it's easier to sandbox) and
performance.

The in-qemu implementation runs in the qemu iothread, which also handles a
bunch of other jobs.  Also, virglrenderer being busy -- for example with
compiling complex shaders -- can block qemu for a while, which in turn can
cause latency spikes in the guest.  With the vhost-user implementation this
is not a problem.

Drawback is the extra communication (and synchronization) needed between
vhost-user + qemu to make the guest display available via spice or gtk.
The latter can possibly be solved by exporting the guest display as a
pipewire remote desktop (random idea I didn't investigate much yet).

> On a different note, any particular reason why Qemu UI EGL
> implementation is limited to Xorg and not extended to Wayland/Weston
> for which there is GTK glarea?

Well, ideally I'd love to just use glarea.  Which happens on wayland.

The problem with Xorg is that the gtk x11 backend uses glx, not egl, to
create an opengl context for glarea.  At least that used to be the case in
the past, maybe that has changed with newer versions.  qemu needs egl
contexts though, otherwise dma-bufs don't work.  So we are stuck with our
own egl widget implementation for now.  Probably we will be able to drop
it at some point in the future.

HTH,
  Gerd
Hi Gerd,

Sorry for the delayed response. I wanted to wait until I finished my
proof-of-concept -- which included adding synchronization -- to ask follow-up
questions.

> > > Does your work above not count for anything?
> >
> > It is quite old, and I think not up-to-date with the final revision of the blob resource
> > specification. I wouldn't be able to update this in the near future due to being busy with
> > other projects. Feel free to grab & update & submit these patches though.
[Kasireddy, Vivek] Sure, we'll take a look at your work and use that as a starting
point. Roughly, how much of your work can be reused?

Also, given my limited understanding of how discrete GPUs work, I was wondering how
many copies there would need to be with blob resources/dmabufs and whether a zero-copy
goal would be feasible or not?

> Besides the code duplication this is also a maintenance issue. This adds one more
> configuration to virtio-gpu. Right now you can build virtio-gpu with virgl (depends on
> opengl), or you can build without virgl (doesn't use opengl then). I don't think it is a good
> idea to add a third mode, without virgl support but using opengl for blob dma-bufs.
[Kasireddy, Vivek] We'll have to re-visit this part, but for our use-case with virtio-gpu, we
are disabling virglrenderer in Qemu and the virgl DRI driver in the Guest. However, we still
need to use Opengl/EGL to convert the dmabuf (guest fb) to a texture and render it as part
of the UI/GTK updates.

> > On a different note, any particular reason why Qemu UI EGL
> > implementation is limited to Xorg and not extended to Wayland/Weston
> > for which there is GTK glarea?
>
> Well, ideally I'd love to just use glarea. Which happens on wayland.
>
> The problem with Xorg is that the gtk x11 backend uses glx not egl to create an opengl
> context for glarea. At least that used to be the case in the past, maybe that has changed
> with newer versions. qemu needs egl contexts though, otherwise dma-bufs don't work. So
> we are stuck with our own egl widget implementation for now. Probably we will be able
> to drop it at some point in the future.
[Kasireddy, Vivek] The GTK X11 backend still uses GLX and it seems like that is not going
to change anytime soon. Having said that, I was wondering if it makes sense to add a new,
purely Wayland backend besides GtkGlArea so that Qemu UI can more quickly adopt new
features such as explicit sync. I was thinking about the new backend being similar to this:
https://cgit.freedesktop.org/wayland/weston/tree/clients/simple-dmabuf-egl.c

The reason why I am proposing this idea is that even if we manage to add explicit sync
support to GTK and it gets merged, upgrading Qemu GTK support from 3.22 to 4.x may
prove to be daunting.

Currently, the way I am doing explicit sync is by adding these new APIs to GTK and
calling them from Qemu:

    static int create_egl_fence_fd(EGLDisplay dpy)
    {
        EGLSyncKHR sync = eglCreateSyncKHR(dpy, EGL_SYNC_NATIVE_FENCE_ANDROID, NULL);
        int fd;

        g_assert(sync != EGL_NO_SYNC_KHR);
        fd = eglDupNativeFenceFDANDROID(dpy, sync);
        g_assert(fd >= 0);

        eglDestroySyncKHR(dpy, sync);
        return fd;
    }

    static void wait_for_buffer_release_fence(EGLDisplay dpy)
    {
        EGLint attrib_list[] = {
            EGL_SYNC_NATIVE_FENCE_FD_ANDROID, release_fence_fd,
            EGL_NONE,
        };

        if (release_fence_fd < 0)
            return;

        EGLSyncKHR sync = eglCreateSyncKHR(dpy, EGL_SYNC_NATIVE_FENCE_ANDROID,
                                           attrib_list);
        g_assert(sync);

        release_fence_fd = -1;
        eglClientWaitSyncKHR(dpy, sync, 0, EGL_FOREVER_KHR);
        eglDestroySyncKHR(dpy, sync);
    }

And, of course, I am tying the wait above to a dma_fence associated with the previous
guest FB that is signalled to ensure that the Host is done using the FB, thereby providing
explicit synchronization between Guest and Host. It seems to work OK, but I was
wondering if you had any alternative ideas or suggestions for doing explicit or implicit
sync that are easier.

Lastly, on a different note, I noticed that there is a virtio-gpu Windows driver here:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/viogpu

We are going to try it out but do you know how up to date it is kept?

Thanks,
Vivek
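The two helpers above rely on the EGL_KHR_fence_sync and EGL_ANDROID_native_fence_sync extensions being present. As a hedged aside (not part of the posted code), a guard along these lines would avoid calling them on EGL stacks that lack the extensions; the function name is illustrative, and libepoxy is assumed since QEMU's EGL helpers already link against it:

    #include <stdbool.h>
    #include <epoxy/egl.h>

    /* Check the display for the sync extensions the fence helpers need. */
    static bool egl_native_fence_supported(EGLDisplay dpy)
    {
        return epoxy_has_egl_extension(dpy, "EGL_KHR_fence_sync") &&
               epoxy_has_egl_extension(dpy, "EGL_ANDROID_native_fence_sync");
    }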
On Wed, Mar 17, 2021 at 08:28:33AM +0000, Kasireddy, Vivek wrote:
> Hi Gerd,
> Sorry for the delayed response. I wanted to wait until I finished my proof-of-concept --
> which included adding synchronization -- to ask follow-up questions.
>
> > > Does your work above not count for anything?
> >
> > It is quite old, and I think not up-to-date with the final revision of the blob resource
> > specification. I wouldn't be able to update this in the near future due to being busy with
> > other projects. Feel free to grab & update & submit these patches though.
> [Kasireddy, Vivek] Sure, we'll take a look at your work and use that as a starting
> point. Roughly, how much of your work can be reused?

There are some small udmabuf support patches which can probably be reused
pretty much as-is.  Everything else needs larger changes I suspect, but
it's been a while since I looked at this ...

> Also, given my limited understanding of how discrete GPUs work, I was wondering how
> many copies there would need to be with blob resources/dmabufs and whether a zero-copy
> goal would be feasible or not?

Good question.

Right now there are two copies (gtk ui):

  (1) guest ram -> DisplaySurface -> gtk widget (gl=off), or
  (2) guest ram -> DisplaySurface -> texture (gl=on).

You should be able to reduce this to one copy for gl=on ...

  (3) guest ram -> texture

... by taking DisplaySurface out of the picture, without any changes to
the guest/host interface.  Drawback is that it requires adding an opengl
dependency to virtio-gpu even with virgl=off, because the virtio-gpu
device will then have to handle the copy to the texture, in response to
guest TRANSFER commands.

When adding blob resource support:

Easiest is probably supporting VIRTIO_GPU_BLOB_MEM_GUEST (largely
identical to non-blob resources) with VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE
(allows the host to create a shared mapping).  Then you can go create a
udmabuf for the resource on the host side.  For the non-gl code path you
can mmap() the udmabuf (which gives you a linear mapping for the scattered
guest pages) and create a DisplaySurface backed by guest ram pages
(removing the guest ram -> DisplaySurface copy).  For the gl code path you
can create a texture backed by the udmabuf and go render on the host
without copying at all.

Using VIRTIO_GPU_BLOB_MEM_GUEST + VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE for
resources needs guest changes too, either in mesa (when using virgl) or
the kernel driver's dumb buffer handling (when not using virgl).

Alternatively (listed more for completeness):

You can create a blob resource with VIRTGPU_BLOB_MEM_HOST3D (requires
virgl, see also virgl_drm_winsys_resource_create_blob in mesa).  It will
be allocated by the host, then mapped into the guest using a virtual pci
memory bar.  Guest userspace (aka the mesa driver) can mmap() these
resources and has direct, zero-copy access to the host resource.

Going to dma-buf export that, import into i915, then let the gpu render
implies we are doing p2p dma from a physical (pci-assigned) device to the
memory bar of a virtual pci device.

Doing that should be possible, but frankly I would be surprised if that
actually works out-of-the-box.  Dunno how many dragons are lurking here.
Could become an interesting challenge to make that fly.
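The host-side udmabuf step described above uses the /dev/udmabuf uapi from linux/udmabuf.h. A rough sketch follows; it assumes the guest pages backing the resource live in a single contiguous range of a memfd-backed RAM block, whereas a real virtio-gpu resource is scattered across many iovecs and would use UDMABUF_CREATE_LIST with one udmabuf_create_item per range (as the checkpatch report at the top hints). The function name and error handling are illustrative:

    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/udmabuf.h>

    static int resource_to_udmabuf(int memfd, uint64_t offset, uint64_t size)
    {
        struct udmabuf_create create = {
            .memfd  = memfd,                 /* guest RAM memfd, F_SEAL_SHRINK applied */
            .flags  = UDMABUF_FLAGS_CLOEXEC,
            .offset = offset,                /* page-aligned offset of the resource */
            .size   = size,                  /* page-aligned size */
        };
        int devfd = open("/dev/udmabuf", O_RDWR);
        int dmabuf;

        if (devfd < 0) {
            return -1;
        }
        dmabuf = ioctl(devfd, UDMABUF_CREATE, &create);  /* returns a dma-buf fd */
        close(devfd);
        return dmabuf;
    }

For the non-gl path, the returned fd can then be mmap()ed and the mapping handed to qemu_create_displaysurface_from(), as Gerd outlines above; for gl=on it can be imported as an EGL image / texture instead.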
> > Besides the code duplication this is also a maintenance issue. This adds one more
> > configuration to virtio-gpu. Right now you can build virtio-gpu with virgl (depends on
> > opengl), or you can build without virgl (doesn't use opengl then). I don't think it is a good
> > idea to add a third mode, without virgl support but using opengl for blob dma-bufs.
> [Kasireddy, Vivek] We'll have to re-visit this part, but for our use-case with virtio-gpu, we
> are disabling virglrenderer in Qemu and the virgl DRI driver in the Guest. However, we still
> need to use Opengl/EGL to convert the dmabuf (guest fb) to a texture and render it as part
> of the UI/GTK updates.

Well, VIRTGPU_BLOB_MEM_HOST3D blob resources are created using virgl
renderer commands (VIRGL_CCMD_PIPE_RESOURCE_CREATE).  So supporting that
without virglrenderer is not an option.

VIRTIO_GPU_BLOB_MEM_GUEST might be possible without too much effort.

> > > On a different note, any particular reason why Qemu UI EGL
> > > implementation is limited to Xorg and not extended to Wayland/Weston
> > > for which there is GTK glarea?
> >
> > Well, ideally I'd love to just use glarea. Which happens on wayland.
> >
> > The problem with Xorg is that the gtk x11 backend uses glx not egl to create an opengl
> > context for glarea. At least that used to be the case in the past, maybe that has changed
> > with newer versions. qemu needs egl contexts though, otherwise dma-bufs don't work. So
> > we are stuck with our own egl widget implementation for now. Probably we will be able
> > to drop it at some point in the future.
> [Kasireddy, Vivek] The GTK X11 backend still uses GLX and it seems like that is not going
> to change anytime soon.

Hmm, so the egl backend has to stay for the time being.

> Having said that, I was wondering if it makes sense to add a new,
> purely Wayland backend besides GtkGlArea so that Qemu UI can more quickly adopt new
> features such as explicit sync. I was thinking about the new backend being similar to this:
> https://cgit.freedesktop.org/wayland/weston/tree/clients/simple-dmabuf-egl.c

I'd prefer to not do that.

> The reason why I am proposing this idea is that even if we manage to add explicit
> sync support to GTK and it gets merged, upgrading Qemu GTK support from 3.22
> to 4.x may prove to be daunting. Currently, the way I am doing explicit sync is
> by adding these new APIs to GTK and calling them from Qemu:

Well, we had the same code supporting gtk2+3 with #ifdefs.  There are also
#ifdefs to avoid using functions deprecated during the 3.x lifetime.  So I
expect porting to gtk4 wouldn't be too bad.

Also I expect qemu wouldn't be the only application needing sync support,
so trying to get that integrated with upstream gtk certainly makes sense.

> Lastly, on a different note, I noticed that there is a virtio-gpu Windows driver here:
> https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/viogpu
>
> We are going to try it out but do you know how up to date it is kept?

No, not following development closely.

take care,
  Gerd
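The version-#ifdef pattern Gerd refers to looks roughly like this; the particular deprecated call shown (gtk_menu_popup vs. gtk_menu_popup_at_pointer, changed in GTK 3.22) is just an illustrative example, not a quote from ui/gtk.c:

    #include <gtk/gtk.h>

    static void show_menu(GtkWidget *menu)
    {
    #if GTK_CHECK_VERSION(3, 22, 0)
        gtk_menu_popup_at_pointer(GTK_MENU(menu), NULL);
    #else
        gtk_menu_popup(GTK_MENU(menu), NULL, NULL, NULL, NULL, 0,
                       gtk_get_current_event_time());
    #endif
    }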
> -----Original Message-----
> From: Qemu-devel <qemu-devel-bounces+tina.zhang=intel.com@nongnu.org>
> On Behalf Of Gerd Hoffmann
> Sent: Tuesday, March 2, 2021 8:04 PM
> To: Kasireddy, Vivek <vivek.kasireddy@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>; Kim, Dongwon <dongwon.kim@intel.com>;
> qemu-devel@nongnu.org; Marc-André Lureau <marcandre.lureau@redhat.com>
> Subject: Re: [RFC 0/1] Use dmabufs for display updates instead of pixman
>
> On Tue, Mar 02, 2021 at 12:03:57AM -0800, Vivek Kasireddy wrote:
> > This is still a WIP/RFC patch that attempts to use dmabufs for display
> > updates with the help of Udmabuf driver instead of pixman. This patch
> > is posted to the ML to elicit feedback and start a discussion whether
> > something like this would be useful or not for mainly non-Virgl
> > rendered BOs and also potentially in other cases.
>
> Yes, it surely makes sense to go into that direction.
> The patch as-is doesn't, it breaks the guest/host interface.
> That's ok-ish for a quick proof-of-concept, but clearly not merge-able.

Hi,

According to the proposal in
https://lore.kernel.org/dri-devel/20210212110140.gdpu7kapnr7ovdcn@sirius.home.kraxel.org/,
we made some progress on a 'virtio-gpu (display) + pass-through GPU' prototype. We
leverage the kmsro framework provided by mesa to let the virtio-gpu display work with a
passed-through GPU in headless mode. The MR is here:
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9592

Although our work is different from this on-going discussion, which is about enabling a
general way to share buffers between guest and host, we'd like to leverage this patch.
So, is there any plan to refine it? E.g. move the uuid blob support into a separate patch
(the implementation of the proposal doesn't require guest user space to share buffers
with the host side), and maybe also add dma-buf support for the cursor plane.

Thanks.

BR,
Tina
Hi Gerd,

Thank you for taking the time to explain how support for blob resources needs to be
added. We are going to get started soon and here are the tasks we are planning to do,
in order of priority:

1) Add support for VIRTIO_GPU_BLOB_MEM_GUEST + VIRTIO_GPU_BLOB_FLAG_USE_SHAREABLE
   (a guest-side sketch of what this involves follows below)
2) Upgrade Qemu GTK UI from 3.22 to 4.x
3) Add explicit sync support to GTK4 and Qemu UI
4) Add support for VIRTGPU_BLOB_MEM_HOST3D

We'll start sending patches as we go along.

Thanks,
Vivek
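On the guest side, item (1) maps onto the blob-resource uapi in drm/virtgpu_drm.h (merged in Linux 5.11). A hedged sketch of what asking the virtio-gpu kernel driver for a shareable guest blob looks like from userspace follows; the function name and surrounding code are illustrative, not from any posted patch, and the kernel-internal dumb-buffer changes Gerd mentions are a separate piece of work:

    #include <stdint.h>
    #include <string.h>
    #include <xf86drm.h>
    #include <drm/virtgpu_drm.h>

    static int create_guest_blob(int drm_fd, uint64_t size, uint32_t *bo_handle)
    {
        struct drm_virtgpu_resource_create_blob blob;

        memset(&blob, 0, sizeof(blob));
        blob.blob_mem   = VIRTGPU_BLOB_MEM_GUEST;           /* backed by guest pages */
        blob.blob_flags = VIRTGPU_BLOB_FLAG_USE_SHAREABLE;  /* host may map/udmabuf it */
        blob.size       = size;

        if (drmIoctl(drm_fd, DRM_IOCTL_VIRTGPU_RESOURCE_CREATE_BLOB, &blob) < 0) {
            return -1;
        }
        *bo_handle = blob.bo_handle;   /* GEM handle for the new blob resource */
        return 0;
    }

With such a resource attached as a scanout, the host side can build a udmabuf over the backing pages as sketched earlier in the thread.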
> Hi,
> According to the proposal in
> https://lore.kernel.org/dri-devel/20210212110140.gdpu7kapnr7ovdcn@sirius.home.kraxel.org/,
> we made some progress on a 'virtio-gpu (display) + pass-through GPU'
> prototype. We leverage the kmsro framework provided by mesa to let the
> virtio-gpu display work with a passed-through GPU in headless mode. The
> MR is here:
> https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9592

Cool.

> Although our work is different from this on-going discussion, which is
> about enabling a general way to share buffers between guest and host,
> we'd like to leverage this patch. So, is there any plan to refine it?

Item (1) on Vivek's new TODO list should provide that.

Once we have shared blob resources we can create udmabufs on the host
side, which in turn allows us to drop extra copies in the display path
and speed up this use case as well (both with and without opengl).

take care,
  Gerd