Message ID | 20250126201121.470990-1-dmitry.osipenko@collabora.com (mailing list archive) |
---|---|
Series | Support virtio-gpu DRM native context |
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>
> Contrary to Virgl and Venus contexts that mediate high level GFX APIs,
> DRM native context [1] mediates lower level kernel driver UAPI, which
> results in less CPU overhead and less/simpler code needed to support it.
> DRM context consists of host and guest parts that have to be implemented
> for each GPU driver. On the guest side, DRM context presents a virtual GPU as
> a real/native host GPU device for GL/VK applications.
>
> [1] https://www.youtube.com/watch?v=9sFP_yddLLQ
>
> Today there are four DRM native context drivers existing in the wild:
>
> - Freedreno (Qualcomm SoC GPUs), completely upstreamed
> - AMDGPU, completely upstreamed

Well, good news and bad news.

I can verify that AMD native context works when I run my Aarch64 guest
on my Aarch64 host with -accel TCG (therefore avoiding KVM altogether).
I get potato frame rates though (~150FPS), although I suspect that is
because of the PCI errata workaround.

When it comes to graphics memory allocation, is there anything I can do
to force all allocations to be very aligned? Is this in the purview of
the AMD drm drivers or TTM itself?

I'm still seeing corruption with -display gtk,gl=on on my x86 system
BTW. I would like to understand if that is a problem with QEMU, GTK or
something else in the stack before we merge.

> - Intel (i915), merge requests are opened
> - Asahi (Apple SoC GPUs), partially merged upstream
>

<snip>

On 1/27/25 19:17, Alex Bennée wrote:
> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
<snip>
> I can verify that AMD native context works when I run my Aarch64 guest
> on my Aarch64 host with -accel TCG (therefore avoiding KVM altogether).
> I get potato frame rates though (~150FPS), although I suspect that is
> because of the PCI errata workaround.
>
> When it comes to graphics memory allocation, is there anything I can do
> to force all allocations to be very aligned? Is this in the purview of
> the AMD drm drivers or TTM itself?

All GPU allocations should be aligned to the page size. The alignment is
specified by the AMD driver. I don't expect alignment to be the problem.
What's the size of your host and guest pages?

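The page-size question can be made concrete with a minimal sketch. The helpers below are hypothetical and are not taken from amdgpu or TTM; they only show what "aligned to a page size" means numerically, and why an address aligned to one page size need not be aligned to a larger one (for example a 4K page on one side and a 64K page on the other).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not amdgpu or TTM code. */

/* Round size up to the next multiple of page_size (a power of two). */
static uint64_t align_up(uint64_t size, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size + page_size - 1) & ~(page_size - 1);
}

/* True if addr is a multiple of page_size (a power of two). */
static bool is_page_aligned(uint64_t addr, uint64_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (addr & (page_size - 1)) == 0;
}
```

With these definitions, an offset of 4096 is aligned for a 4K page but not for a 64K page, which is why the host and guest page sizes are worth comparing before suspecting the driver.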
On 1/27/25 19:17, Alex Bennée wrote:
...
> I'm still seeing corruption with -display gtk,gl=on on my x86 system
> BTW. I would like to understand if that is a problem with QEMU, GTK or
> something else in the stack before we merge.

I reproduced the display mirroring/corruption issue and bisected it to
the following commit. The problem only happens when QEMU/GTK uses the
Wayland display directly, while previously I was running QEMU with
XWayland, which doesn't have the problem. Why this change breaks dmabuf
display with Wayland/GTK is unclear.

Reverting the commit fixes the bug.

+Dongwon Kim +Vivek Kasireddy

commit 77bf310084dad38b3a2badf01766c659056f1cf2
Author: Dongwon Kim <dongwon.kim@intel.com>
Date:   Fri Apr 26 15:50:59 2024 -0700

    ui/gtk: Draw guest frame at refresh cycle

    Draw routine needs to be manually invoked in the next refresh
    if there is a scanout blob from the guest. This is to prevent
    a situation where there is a scheduled draw event but it won't
    happen because the window is currently in inactive state
    (minimized or tabified). If draw is not done for a long time,
    gl_block timeout and/or fence timeout (on the guest) will happen
    eventually.

    v2: Use gd_gl_area_draw(vc) in gtk-gl-area.c

    Suggested-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
    Cc: Gerd Hoffmann <kraxel@redhat.com>
    Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
    Cc: Daniel P. Berrangé <berrange@redhat.com>
    Signed-off-by: Dongwon Kim <dongwon.kim@intel.com>
    Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
    Message-Id: <20240426225059.3871283-1-dongwon.kim@intel.com>

Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 1/27/25 19:17, Alex Bennée wrote:
<snip>
> I reproduced the display mirroring/corruption issue and bisected it to
> the following commit. The problem only happens when QEMU/GTK uses the
> Wayland display directly, while previously I was running QEMU with
> XWayland, which doesn't have the problem. Why this change breaks dmabuf
> display with Wayland/GTK is unclear.

Ahh, that makes sense - I obviously forgot to mention I'm running sway/wayland
across both machines.

> Reverting the commit fixes the bug.
>
> +Dongwon Kim +Vivek Kasireddy
>
> commit 77bf310084dad38b3a2badf01766c659056f1cf2
> Author: Dongwon Kim <dongwon.kim@intel.com>
> Date:   Fri Apr 26 15:50:59 2024 -0700
>
>     ui/gtk: Draw guest frame at refresh cycle
<snip>

Maybe a race on:

    QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf;

?

Hi,

The commit below could change the timing of drawing by making the drawing
happen at the refresh cycle instead of via a draw event. So it looks like
either the dmabuf or the client's framebuffer is being written and read at
the same time. Can you describe what the corruption looks like? Is it just
a garbage image with random noise, or the actual frame with some defects
like tearing?

> Subject: Re: [PATCH v6 00/10] Support virtio-gpu DRM native context
>
> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
<snip>
> > Reverting the commit fixes the bug.
> >
> > +Dongwon Kim +Vivek Kasireddy
> >
> > commit 77bf310084dad38b3a2badf01766c659056f1cf2
> >     ui/gtk: Draw guest frame at refresh cycle
<snip>
> Maybe a race on:
>
> QemuDmaBuf *dmabuf = vc->gfx.guest_fb.dmabuf; ?

"Kim, Dongwon" <dongwon.kim@intel.com> writes:

> The commit below could change the timing of drawing by making the drawing
> happen at the refresh cycle instead of via a draw event. So it looks like
> either the dmabuf or the client's framebuffer is being written and read at
> the same time. Can you describe what the corruption looks like? Is it just
> a garbage image with random noise, or the actual frame with some defects
> like tearing?

The terminal gets mirrored upside down and the mouse creates damage as it
moves about.

<snip>

> Subject: Re: [PATCH v6 00/10] Support virtio-gpu DRM native context
>
> "Kim, Dongwon" <dongwon.kim@intel.com> writes:
<snip>
> The terminal gets mirrored upside down and the mouse creates damage as it
> moves about.

I am wondering if this is also reproducible without virgl and DRM native
context (e.g. with a SW rasterizer on the guest).

<snip>

On 2/4/25 04:51, Kim, Dongwon wrote:
<snip>
>> The terminal gets mirrored upside down and the mouse creates damage as it
>> moves about.
>
> I am wondering if this is also reproducible without virgl and DRM native
> context (e.g. with a SW rasterizer on the guest).

It looks like a problem with redraw areas, see video sample [1].

[1] https://drive.google.com/file/d/13PN2sFoPsM2ox6_gf9GLXStbsZ27xlLy/view?usp=sharing

It's reproducible without native context, with a SW renderer, using
'-device virtio-vga,hostmem=8G,blob=on'.

Thanks for showing me the video. I will take a look and check what goes
wrong here. I kind of understand that corruption may happen in some
scenarios, but I don't know what could cause the upside-down image. Do you
have any idea? Maybe the frame was temporarily replaced with a mishandled
texture that QEMU creates from the surface, but I am not sure.

In the worst case, I think we may have to revert the change to
gtk-gl-area.c for now until the problem is root-caused. But you would then
see rendering timeouts when the GTK window is not visible (e.g. minimized).

> Subject: Re: [PATCH v6 00/10] Support virtio-gpu DRM native context
>
> On 2/4/25 04:51, Kim, Dongwon wrote:
<snip>
> It looks like a problem with redraw areas, see video sample [1].
>
> [1] https://drive.google.com/file/d/13PN2sFoPsM2ox6_gf9GLXStbsZ27xlLy/view?usp=sharing
>
> It's reproducible without native context, with a SW renderer, using
> '-device virtio-vga,hostmem=8G,blob=on'.
>
> --
> Best regards,
> Dmitry

On 2/5/25 21:53, Kim, Dongwon wrote:
> Thanks for showing me the video. I will take a look and check what goes
> wrong here. I kind of understand that corruption may happen in some
> scenarios, but I don't know what could cause the upside-down image. Do you
> have any idea? Maybe the frame was temporarily replaced with a mishandled
> texture that QEMU creates from the surface, but I am not sure.

No clue. Could be anything. Could be a GTK/Wayland bug, could be an
obscure QEMU bug. GTK expert wanted here.

> In the worst case, I think we may have to revert the change to
> gtk-gl-area.c for now until the problem is root-caused.

Sounds good to me.

> But you would then see rendering timeouts when the GTK window is not
> visible (e.g. minimized).

No timeouts observed.

On 2/5/25 23:08, Dmitry Osipenko wrote:
<snip>
> No clue. Could be anything. Could be a GTK/Wayland bug, could be an
> obscure QEMU bug. GTK expert wanted here.

Alright, it's bugged with "blob=on", but works with "blob=off", even though
I don't see QEMU using blobs. Might be a QEMU bug then.

On 2/6/25 01:13, Dmitry Osipenko wrote:
> On 2/5/25 23:08, Dmitry Osipenko wrote:
<snip>
>> No clue. Could be anything. Could be a GTK/Wayland bug, could be an
>> obscure QEMU bug. GTK expert wanted here.
>
> Alright, it's bugged with "blob=on", but works with "blob=off", even though
> I don't see QEMU using blobs. Might be a QEMU bug then.
>

Looked further into it. QEMU was in fact using blobs with "blob=on"; I was
looking in the wrong place.

Then I found that setting y0_top=true for the dmabuf makes the display
show upside down, but there is no rendering bug with it. Something redraws
the display with y0_top=true when it should be y0_top=false. I couldn't
figure out how that's related to the offending change.

I also noticed that QEMU checks for Wayland presence in
early_gtk_display_init() and doesn't use the EGL callbacks that are used
for the X11 display, but the y0_top logic looks the same. Interestingly,
Windows should be using the same display code paths as Wayland, but I'm not
able to test on Windows.

If nobody is able to fix the bug soon, reverting at least the Wayland part
would be good.

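The effect of a wrong y0_top value can be illustrated with a small model. This is a hypothetical sketch and not QEMU code: it assumes y0_top == true means buffer row 0 is the top scanline and y0_top == false means buffer row 0 is the bottom one, and the function name is made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model, not QEMU code: map a display row to the buffer row
 * it is sampled from. Assumption: y0_top == true means buffer row 0 is the
 * top scanline; y0_top == false means buffer row 0 is the bottom one.
 */
static int source_row(int display_row, int height, bool y0_top)
{
    assert(height > 0 && display_row >= 0 && display_row < height);
    return y0_top ? display_row : height - 1 - display_row;
}
```

Under this model, two draws of the same buffer that disagree on the flag sample rows in opposite order, so one of them lands vertically mirrored, which would match the upside-down terminal seen in the video.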
Hi Dmitry,

> Subject: Re: [PATCH v6 00/10] Support virtio-gpu DRM native context
>
<snip>
> Looked further into it. QEMU was in fact using blobs with "blob=on"; I was
> looking in the wrong place.
>
> Then I found that setting y0_top=true for the dmabuf makes the display
> show upside down, but there is no rendering bug with it. Something redraws
> the display with y0_top=true when it should be y0_top=false. I couldn't
> figure out how that's related to the offending change.

It is possible that the additional draw call (the offending change) is
exposing a bug elsewhere, because having both blob=on and virgl=on was not
a valid combination until recently. IIUC, virgl directly calls
gd_gl_area_scanout_texture() with y0_top=true (sometimes), but I guess
gd_gl_area_scanout_texture() might be getting called again with
y0_top=false via gd_gl_area_scanout_dmabuf() in the same repaint cycle?

> I also noticed that QEMU checks for Wayland presence in
> early_gtk_display_init() and doesn't use the EGL callbacks that are used
> for the X11 display, but the y0_top logic looks the same. Interestingly,
> Windows should be using the same display code paths as Wayland, but I'm not
> able to test on Windows.
>
> If nobody is able to fix the bug soon, reverting at least the Wayland part
> would be good.

I hope we can find a proper fix soon, but it looks like Dongwon already
sent a patch to revert the changes in gtk-gl-area.c.

Thanks,
Vivek

Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>
> Contrary to Virgl and Venus contexts that mediate high level GFX APIs,
> DRM native context [1] mediates lower level kernel driver UAPI, which
> results in less CPU overhead and less/simpler code needed to support it.
> DRM context consists of host and guest parts that have to be implemented
> for each GPU driver. On the guest side, DRM context presents a virtual GPU as
> a real/native host GPU device for GL/VK applications.
>
<snip>

So first the good news. I can now get this up and running (x86/kvm guest
with Intel graphics) and as far as I can tell the native context mode is
working. With Dongwon Kim's patch the mirroring/corruption I was seeing
is gone.

I can successfully run glmark2-wayland (although see below) but vkmark
completely fails to start, reporting:

  MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_FENCE_PASSING
  MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_GUEST_HANDLE
  MESA: error: DRM_IOCTL_VIRTGPU_GET_CAPS failed with Invalid argument
  MESA: error: DRM_IOCTL_VIRTGPU_CONTEXT_INIT failed with Invalid argument, continuing without context...
  MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
  MESA: error: Failed to create virtgpu AddressSpaceStream
  MESA: error: vulkan: Failed to get host connection
  MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
  MESA: error: Failed to create virtgpu AddressSpaceStream
  MESA: error: vulkan: Failed to get host connection
  MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
  MESA: error: Failed to create virtgpu AddressSpaceStream
  MESA: error: vulkan: Failed to get host connection
  MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:681: VK_ERROR_DEVICE_LOST
  MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
  MESA: error: Failed to create virtgpu AddressSpaceStream
  MESA: error: vulkan: Failed to get host connection
  MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:332: VK_ERROR_DEVICE_LOST
  === Physical Device 0 ===
      Vendor ID:      0x8086
      Device ID:      0xA780
      Device Name:    Intel(R) Graphics (RPL-S)
      Driver Version: 101068899
      Device UUID:    b39e1cf39b101489e3c6039406f78d6c

I was booting with 4G of shared memory.

Later versions of vkmark (2025.01) fail due to missing the
VK_KHR_display extension required as of
https://github.com/vkmark/vkmark/commit/7c3189c6482cb84c3c0e69d6dabb9d80e0c0092a

> # Note about known performance problem in Qemu:
>
> DRM contexts are mapping host blobs extensively and these mapping
> operations work slowly in Qemu. Exact reason is unknown. Mappings work
> fast on Crosvm. For DRM contexts this problem is more visible than for
> Venus/Virgl.

And how!

With drm_native I get a lot of stutter while running and barely 100FPS
(compared to ~8000 on pure venus). IMHO we need to figure out why there
is such a discrepancy before merging because currently it makes more
sense to use Venus.

<snip>

I'll do some more testing with my AMD/Aarch64 rig next week.

On 2/14/25 17:33, Alex Bennée wrote:
> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
<snip>
> I can successfully run glmark2-wayland (although see below) but vkmark
> completely fails to start, reporting:
>
<snip>
>
> I was booting with 4G of shared memory.

Thanks for the testing.

I assume all these errors are generated by the failing gfxstream. Hence,
you may ignore them since you haven't enabled gfxstream.

> Later versions of vkmark (2025.01) fail due to missing the
> VK_KHR_display extension required as of
> https://github.com/vkmark/vkmark/commit/7c3189c6482cb84c3c0e69d6dabb9d80e0c0092a

This VK_KHR_display problem is only reproducible with the rootfs that
you shared with me. It could be trouble with your build configs or a
buggy package version used by your rootfs build, more likely the
former.

>> # Note about known performance problem in Qemu:
>>
>> DRM contexts are mapping host blobs extensively and these mapping
>> operations work slowly in Qemu. Exact reason is unknown. Mappings work
>> fast on Crosvm. For DRM contexts this problem is more visible than for
>> Venus/Virgl.
>
> And how!
>
> With drm_native I get a lot of stutter while running and barely 100FPS
> (compared to ~8000 on pure venus). IMHO we need to figure out why there
> is such a discrepancy before merging because currently it makes more
> sense to use Venus.

If you run with Xorg/Wayland directly, without a DE, then it should
work okay. This should be the problem with unmapping performance that I've
been thinking about.

The unmapping problem is partially understood. The unmapping code works
correctly, but we'll need to optimize the flatview code to perform
unmapping immediately. Meanwhile, you may apply the QEMU hack below; it
should resolve most of the stutter. Please let me know if it helps.

There is also a pending Mesa intel-virtio blob mapping optimization that
currently isn't available in my gitlab code. I'll refresh that feature and
then ask you to try it.

There could be more to the unmapping perf issue in QEMU. I'm
investigating.

AMDGPU nctx is less affected by the bad unmapping performance. I expect
it will work well for you.

diff --git a/util/rcu.c b/util/rcu.c
index fa32c942e4bb..aac3522c323c 100644
--- a/util/rcu.c
+++ b/util/rcu.c
@@ -174,7 +174,7 @@ void synchronize_rcu(void)
 }
 
 
-#define RCU_CALL_MIN_SIZE        30
+#define RCU_CALL_MIN_SIZE        1
 
 /* Multi-producer, single-consumer queue based on urcu/static/wfqueue.h
  * from liburcu.  Note that head is only used by the consumer.
@@ -267,7 +267,7 @@ static void *call_rcu_thread(void *opaque)
          * added before synchronize_rcu() starts.
          */
         while (n == 0 || (n < RCU_CALL_MIN_SIZE && ++tries <= 5)) {
-            g_usleep(10000);
+            g_usleep(1000);
             if (n == 0) {
                 qemu_event_reset(&rcu_call_ready_event);
                 n = qatomic_read(&rcu_call_count);

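To see why the two constants in the diff above matter for latency, here is a simplified model of the wait loop's condition. It is a sketch and not the actual util/rcu.c code: the function name is hypothetical, and it assumes a fixed number of callbacks is already queued and none arrive while waiting.

```c
#include <assert.h>

/*
 * Simplified model of the wait loop touched by the diff above, not the
 * actual util/rcu.c code. Assumption: `pending` callbacks are already
 * queued and no new ones arrive while waiting.
 */
static unsigned batch_wait_us(int pending, int min_size,
                              unsigned sleep_us, int max_tries)
{
    unsigned waited = 0;
    int tries = 0;

    assert(pending > 0);
    while (pending < min_size && ++tries <= max_tries) {
        waited += sleep_us;   /* stands in for g_usleep(sleep_us) */
    }
    return waited;
}
```

Under this model, a single queued callback waits 5 x 10000 us = 50 ms with the stock values (minimum batch of 30, 10 ms sleep), and does not wait at all with a minimum batch of 1. If blob unmapping completes only after such a callback runs, that delay would be consistent with the stutter described above, though that link is an assumption here.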
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 1/27/25 19:17, Alex Bennée wrote:
<snip>
>> When it comes to graphics memory allocation, is there anything I can do
>> to force all allocations to be very aligned? Is this in the purview of
>> the AMD drm drivers or TTM itself?
>
> All GPU allocations should be aligned to the page size. The alignment is
> specified by the AMD driver. I don't expect alignment to be the problem.
> What's the size of your host and guest pages?

4k AFAIK.

Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 2/14/25 17:33, Alex Bennée wrote:
>> Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:
>>
>>> This patchset adds DRM native context support to VirtIO-GPU on Qemu.
>>>
>>> Contrary to Virgl and Venus contexts that mediate high level GFX APIs,
>>> DRM native context [1] mediates lower level kernel driver UAPI, which
>>> results in less CPU overhead and less/simpler code needed to support it.
>>> DRM context consists of a host and guest parts that have to be implemented
>>> for each GPU driver. On a guest side, DRM context presents a virtual GPU as
>>> a real/native host GPU device for GL/VK applications.
>>>
>> <snip>
>>
>> So first the good news. I can now get this up and running (x86/kvm guest
>> with Intel graphics) and as far as I can tell the native context mode is
>> working. With Dongwon Kim's patch the mirroring/corruption I was seeing
>> is gone.
>>
>> I can successfully run glmark2-wayland (although see below) but vkmark
>> completely fails to start, reporting:
>>
>> MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_FENCE_PASSING
>> MESA: info: virtgpu backend not enabling VIRTGPU_PARAM_CREATE_GUEST_HANDLE
>> MESA: error: DRM_IOCTL_VIRTGPU_GET_CAPS failed with Invalid argument
>> MESA: error: DRM_IOCTL_VIRTGPU_CONTEXT_INIT failed with Invalid argument, continuing without context...
>> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
>> MESA: error: Failed to create virtgpu AddressSpaceStream
>> MESA: error: vulkan: Failed to get host connection
>> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
>> MESA: error: Failed to create virtgpu AddressSpaceStream
>> MESA: error: vulkan: Failed to get host connection
>> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
>> MESA: error: Failed to create virtgpu AddressSpaceStream
>> MESA: error: vulkan: Failed to get host connection
>> MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:681: VK_ERROR_DEVICE_LOST
>> MESA: error: DRM_VIRTGPU_RESOURCE_CREATE_BLOB failed with No space left on device
>> MESA: error: Failed to create virtgpu AddressSpaceStream
>> MESA: error: vulkan: Failed to get host connection
>> MESA: warning: ../src/gfxstream/guest/vulkan/gfxstream_vk_device.cpp:332: VK_ERROR_DEVICE_LOST
>> === Physical Device 0 ===
>> Vendor ID: 0x8086
>> Device ID: 0xA780
>> Device Name: Intel(R) Graphics (RPL-S)
>> Driver Version: 101068899
>> Device UUID: b39e1cf39b101489e3c6039406f78d6c
>>
>> I was booting with 4G of shared memory.
>
> Thanks for the testing.
>
> I assume all these errors are generated by the failing gfxstream. Hence,
> you may ignore them since you haven't enabled gfxstream.
>
>> Later versions of vkmark (2025.01) fail due to missing the
>> VK_KHR_display extension required as of
>> https://github.com/vkmark/vkmark/commit/7c3189c6482cb84c3c0e69d6dabb9d80e0c0092a
>
> This VK_KHR_display problem is only reproducible with your rootfs that
> you shared with me. It could be a problem with your build configs or a
> buggy package version used by your rootfs build, more likely the
> former.

So you have built the latest vkmark? This is a recent addition to
vkmark for the 2025.1 release.

Does vulkaninfo --summary show the extension available for you?
It is certainly available on the host side:

  VK_KHR_display : extension revision 23

>>> # Note about known performance problem in Qemu:
>>>
>>> DRM contexts are mapping host blobs extensively and these mapping
>>> operations work slowly in Qemu. Exact reason is unknown. Mappings work
>>> fast on Crosvm. For DRM contexts this problem is more visible than for
>>> Venus/Virgl.
>>
>> And how!
>>
>> With drm_native I get a lot of stutter while running and barely 100FPS
>> (compared to ~8000 on pure venus). IMHO we need to figure out why there
>> is such a discrepancy before merging because currently it makes more
>> sense to use
>
> If you'd run with Xorg/Wayland directly without a DE, then it should
> work okay. This should be a problem with unmapping performance that I'm
> thinking about.
>
> That unmapping problem is partially understood. Unmapping code works
> correctly, but we'll need to optimize the flatview code to perform
> unmapping immediately.

Why immediately? Surely if we are unmapping we can defer it. Or is this
a case of having stale mappings making the life of new allocations
harder?

> Meanwhile, you may apply the QEMU hack below; it
> should resolve most of the stutter. Please let me know if it helps.
>
> There is also a pending Mesa intel-virtio blob mapping optimization that
> currently isn't available in my gitlab code; I'll refresh that feature
> and then ask you to try it.
>
> Could be that there is more to the unmapping perf issue in QEMU. I'm
> investigating.
>
> AMDGPU nctx is less affected by the bad unmapping performance. I expect
> it will work well for you.
>
>
>
> diff --git a/util/rcu.c b/util/rcu.c
> index fa32c942e4bb..aac3522c323c 100644
> --- a/util/rcu.c
> +++ b/util/rcu.c
> @@ -174,7 +174,7 @@ void synchronize_rcu(void)
>  }
>
>
> -#define RCU_CALL_MIN_SIZE        30
> +#define RCU_CALL_MIN_SIZE        1
>
>  /* Multi-producer, single-consumer queue based on urcu/static/wfqueue.h
>   * from liburcu.  Note that head is only used by the consumer.
> @@ -267,7 +267,7 @@ static void *call_rcu_thread(void *opaque)
>           * added before synchronize_rcu() starts.
>           */
>          while (n == 0 || (n < RCU_CALL_MIN_SIZE && ++tries <= 5)) {
> -            g_usleep(10000);
> +            g_usleep(1000);
>              if (n == 0) {
>                  qemu_event_reset(&rcu_call_ready_event);
>                  n = qatomic_read(&rcu_call_count);
On 2/17/25 18:22, Alex Bennée wrote:
...
>> This VK_KHR_display problem is only reproducible with your rootfs that
>> you shared with me. It could be a problem with your build configs or a
>> buggy package version used by your rootfs build, more likely the
>> former.
> So you have built the latest vkmark? This is a recent addition to
> vkmark for the 2025.1 release.

Yes, latest 2025.1 from git/master.

> Does vulkaninfo --summary show the extension available for you? It is
> certainly available on the host side:
>
>   VK_KHR_display : extension revision 23
>

I have it on the guest with my rootfs, not with yours. I'd suspect the
problem is with your Mesa build flags; maybe you haven't enabled the
necessary flags related to WSI.

..
>>> With drm_native I get a lot of stutter while running and barely 100FPS
>>> (compared to ~8000 on pure venus). IMHO we need to figure out why there
>>> is such a discrepancy before merging because currently it makes more
>>> sense to use
>> If you'd run with Xorg/Wayland directly without a DE, then it should
>> work okay. This should be a problem with unmapping performance that I'm
>> thinking about.
>>
>> That unmapping problem is partially understood. Unmapping code works
>> correctly, but we'll need to optimize the flatview code to perform
>> unmapping immediately.
> Why immediately? Surely if we are unmapping we can defer it. Or is this
> a case of having stale mappings making the life of new allocations
> harder?

Unmapping currently works synchronously for virtio-gpu in QEMU, hence
deferring it blocks the whole virtio-gpu for up to 100+ ms. And if multiple
unmappings are done in a row, then it's 100 ms multiplied by the number
of unmappings.
Dmitry Osipenko <dmitry.osipenko@collabora.com> writes:

> On 2/17/25 18:22, Alex Bennée wrote:
> ...
>>> This VK_KHR_display problem is only reproducible with your rootfs that
>>> you shared with me. It could be a problem with your build configs or a
>>> buggy package version used by your rootfs build, more likely the
>>> former.
>> So you have built the latest vkmark? This is a recent addition to
>> vkmark for the 2025.1 release.
>
> Yes, latest 2025.1 from git/master.
>
>> Does vulkaninfo --summary show the extension available for you? It is
>> certainly available on the host side:
>>
>>   VK_KHR_display : extension revision 23
>>
>
> I have it on the guest with my rootfs, not with yours. I'd suspect the
> problem is with your Mesa build flags; maybe you haven't enabled the
> necessary flags related to WSI.

I can't see any reference in the buildroot recipes. What are your Mesa
build flags?

>
> ..
>>>> With drm_native I get a lot of stutter while running and barely 100FPS
>>>> (compared to ~8000 on pure venus). IMHO we need to figure out why there
>>>> is such a discrepancy before merging because currently it makes more
>>>> sense to use
>>> If you'd run with Xorg/Wayland directly without a DE, then it should
>>> work okay. This should be a problem with unmapping performance that I'm
>>> thinking about.
>>>
>>> That unmapping problem is partially understood. Unmapping code works
>>> correctly, but we'll need to optimize the flatview code to perform
>>> unmapping immediately.
>> Why immediately? Surely if we are unmapping we can defer it. Or is this
>> a case of having stale mappings making the life of new allocations
>> harder?
>
> Unmapping currently works synchronously for virtio-gpu in QEMU, hence
> deferring it blocks the whole virtio-gpu for up to 100+ ms. And if multiple
> unmappings are done in a row, then it's 100 ms multiplied by the number
> of unmappings.