Message ID | 20220213024222.3548-1-akihiko.odaki@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | ui/dbus: Share one listener for a console | expand |
Hi Akihiko On Sun, Feb 13, 2022 at 6:44 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: > ui/dbus required to have multiple DisplayChangeListeners (possibly with > OpenGL) > for a console but that caused several problems: > - It broke egl-headless, an unusual display which implements OpenGL > rendering > for non-OpenGL displays. The code to support multiple > DisplayChangeListeners > does not consider the case where non-OpenGL displays listens OpenGL > consoles. > Can you provide instructions on what broke? Even better write a test, please. "make check-avocado AVOCADO_TESTS=tests/avocado/virtio-gpu.py", which covers egl-headless usage, works. > - Multiple OpenGL DisplayChangeListeners of dbus shares one DisplaySurface > and > modifies its texture field, causing OpenGL texture leak and > use-after-free. > Again, please provide instructions. I have regularly run -display dbus with multiple clients and qemu compiled with sanitizers. I don't see any leak or use after free. > - Introduced extra code to check the compatibility of OpenGL context > providers > and OpenGL texture renderers where those are often inherently tightly > coupled > since they must share the same hardware. > So code checks are meant to prevent misusage. They might be too limited or broken in some ways, but reverting is likely going to introduce other regressions I was trying to fix. - Needed extra work to broadcast the same change to multiple dbus listeners. > > Compared to what? > This series solve them by implementing the change broadcast in ui/dbus, > which > knows exactly what is needed. Changes for the common code to support > multiple > OpenGL DisplayChangeListeners were reverted to fix egl-headless and reduce > the code size. > Thanks a lot for your work, I am going to take a look at your approach. But please help us understand better what the problem actually is, by giving examples & tests to avoid future regressions and document the expected behaviour. > Akihiko Odaki (6): > ui/dbus: Share one listener for a console > Revert "console: save current scanout details" > Revert "ui: split the GL context in a different object" > Revert "ui: dispatch GL events to all listeners" > Revert "ui: associate GL context outside of display listener > registration" > Revert "ui: factor out qemu_console_set_display_gl_ctx()" > > include/ui/console.h | 60 +----- > include/ui/egl-context.h | 6 +- > include/ui/gtk.h | 11 +- > include/ui/sdl2.h | 7 +- > include/ui/spice-display.h | 1 - > ui/console.c | 258 +++++++---------------- > ui/dbus-console.c | 109 ++-------- > ui/dbus-listener.c | 417 +++++++++++++++++++++++++++---------- > ui/dbus.c | 22 -- > ui/dbus.h | 36 +++- > ui/egl-context.c | 6 +- > ui/egl-headless.c | 20 +- > ui/gtk-egl.c | 10 +- > ui/gtk-gl-area.c | 8 +- > ui/gtk.c | 25 +-- > ui/sdl2-gl.c | 10 +- > ui/sdl2.c | 14 +- > ui/spice-display.c | 19 +- > 18 files changed, 498 insertions(+), 541 deletions(-) > > -- > 2.32.0 (Apple Git-132) > > >
On Mon, Feb 14, 2022 at 9:07 PM Marc-André Lureau <marcandre.lureau@gmail.com> wrote: > > Hi Akihiko > > On Sun, Feb 13, 2022 at 6:44 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: >> >> ui/dbus required to have multiple DisplayChangeListeners (possibly with OpenGL) >> for a console but that caused several problems: >> - It broke egl-headless, an unusual display which implements OpenGL rendering >> for non-OpenGL displays. The code to support multiple DisplayChangeListeners >> does not consider the case where non-OpenGL displays listens OpenGL consoles. > > > Can you provide instructions on what broke? Even better write a test, please. The following command segfaults. Adding a test would be nice, but it would need a binary which uses OpenGL. qemu-system-x86_64 -device virtio-gpu-gl-pci -display egl-headless -vnc :0 -m 8G -cdrom Fedora-Workstation-Live-x86_64-34-1.2.iso -accel kvm > > "make check-avocado AVOCADO_TESTS=tests/avocado/virtio-gpu.py", which covers egl-headless usage, works. > >> >> - Multiple OpenGL DisplayChangeListeners of dbus shares one DisplaySurface and >> modifies its texture field, causing OpenGL texture leak and use-after-free. > > > Again, please provide instructions. I have regularly run -display dbus with multiple clients and qemu compiled with sanitizers. I don't see any leak or use after free. I doubt sanitizers can find this because it is an OpenGL texture. You may add a probe around surface_gl_create_texture and surface_gl_destroy_texture. > >> >> - Introduced extra code to check the compatibility of OpenGL context providers >> and OpenGL texture renderers where those are often inherently tightly coupled >> since they must share the same hardware. > > > So code checks are meant to prevent misusage. They might be too limited or broken in some ways, but reverting is likely going to introduce other regressions I was trying to fix. The misuse will not occur because DisplayChangeListeners will be merged with OpenGL context providers. > >> - Needed extra work to broadcast the same change to multiple dbus listeners. >> > > Compared to what? Compared to sharing one DisplayChangeListener for multiple dbus listeners. > >> >> This series solve them by implementing the change broadcast in ui/dbus, which >> knows exactly what is needed. Changes for the common code to support multiple >> OpenGL DisplayChangeListeners were reverted to fix egl-headless and reduce >> the code size. > > > Thanks a lot for your work, I am going to take a look at your approach. But please help us understand better what the problem actually is, by giving examples & tests to avoid future regressions and document the expected behaviour. The thing is really complicated and I may miss details so please feel free to ask questions or provide suggestions. Regards, Akihiko Odaki > >> >> Akihiko Odaki (6): >> ui/dbus: Share one listener for a console >> Revert "console: save current scanout details" >> Revert "ui: split the GL context in a different object" >> Revert "ui: dispatch GL events to all listeners" >> Revert "ui: associate GL context outside of display listener >> registration" >> Revert "ui: factor out qemu_console_set_display_gl_ctx()" >> >> include/ui/console.h | 60 +----- >> include/ui/egl-context.h | 6 +- >> include/ui/gtk.h | 11 +- >> include/ui/sdl2.h | 7 +- >> include/ui/spice-display.h | 1 - >> ui/console.c | 258 +++++++---------------- >> ui/dbus-console.c | 109 ++-------- >> ui/dbus-listener.c | 417 +++++++++++++++++++++++++++---------- >> ui/dbus.c | 22 -- >> ui/dbus.h | 36 +++- >> ui/egl-context.c | 6 +- >> ui/egl-headless.c | 20 +- >> ui/gtk-egl.c | 10 +- >> ui/gtk-gl-area.c | 8 +- >> ui/gtk.c | 25 +-- >> ui/sdl2-gl.c | 10 +- >> ui/sdl2.c | 14 +- >> ui/spice-display.c | 19 +- >> 18 files changed, 498 insertions(+), 541 deletions(-) >> >> -- >> 2.32.0 (Apple Git-132) >> >> > > > -- > Marc-André Lureau
Hi On Mon, Feb 14, 2022 at 5:15 PM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: > On Mon, Feb 14, 2022 at 9:07 PM Marc-André Lureau > <marcandre.lureau@gmail.com> wrote: > > > > Hi Akihiko > > > > On Sun, Feb 13, 2022 at 6:44 AM Akihiko Odaki <akihiko.odaki@gmail.com> > wrote: > >> > >> ui/dbus required to have multiple DisplayChangeListeners (possibly with > OpenGL) > >> for a console but that caused several problems: > >> - It broke egl-headless, an unusual display which implements OpenGL > rendering > >> for non-OpenGL displays. The code to support multiple > DisplayChangeListeners > >> does not consider the case where non-OpenGL displays listens OpenGL > consoles. > > > > > > Can you provide instructions on what broke? Even better write a test, > please. > > The following command segfaults. Adding a test would be nice, but it > would need a binary which uses OpenGL. > qemu-system-x86_64 -device virtio-gpu-gl-pci -display egl-headless > -vnc :0 -m 8G -cdrom Fedora-Workstation-Live-x86_64-34-1.2.iso -accel > kvm > Thanks! This is clearly a mistake from commit 7cc712e98 ("ui: dispatch GL events to all listener"). It should have taken into account that some listeners do not have GL callbacks, and guard the call. We should wrap the missing ops->dpy_gl_call() with a if (ops->dpy_gl_call) ? I'll send a patch for that. > > > "make check-avocado AVOCADO_TESTS=tests/avocado/virtio-gpu.py", which > covers egl-headless usage, works. > > > >> > >> - Multiple OpenGL DisplayChangeListeners of dbus shares one > DisplaySurface and > >> modifies its texture field, causing OpenGL texture leak and > use-after-free. > > > > > > Again, please provide instructions. I have regularly run -display dbus > with multiple clients and qemu compiled with sanitizers. I don't see any > leak or use after free. > > I doubt sanitizers can find this because it is an OpenGL texture. You > may add a probe around surface_gl_create_texture and > surface_gl_destroy_texture. > Indeed, a surface is created on each frame, because we create a 2d surface on each qemu_console_resize(), which is called at each virgl scanout. This is a regression introduced by commit ebced09185 ("console: save current scanout details"). I can propose an easy fix, please check it. And the root of the leak is actually surface_gl_create_texutre(), it should be idempotent, just like destroy(). > > > >> > >> - Introduced extra code to check the compatibility of OpenGL context > providers > >> and OpenGL texture renderers where those are often inherently tightly > coupled > >> since they must share the same hardware. > > > > > > So code checks are meant to prevent misusage. They might be too limited > or broken in some ways, but reverting is likely going to introduce other > regressions I was trying to fix. > > The misuse will not occur because DisplayChangeListeners will be > merged with OpenGL context providers. > Ok, but aren't the checks enough to prevent it already? I have to check the use cases and differences of design, but you might be right that we don't need such a split after all. > > > >> - Needed extra work to broadcast the same change to multiple dbus > listeners. > >> > > > > Compared to what? > > Compared to sharing one DisplayChangeListener for multiple dbus listeners. > Well, you just moved the problem to the dbus display, not removed any work. What we have currently is more generic, you should be able to add/remove various listeners (in theory, we only really do it for dbus at this point). > > > > >> > >> This series solve them by implementing the change broadcast in ui/dbus, > which > >> knows exactly what is needed. Changes for the common code to support > multiple > >> OpenGL DisplayChangeListeners were reverted to fix egl-headless and > reduce > >> the code size. > > > > > > Thanks a lot for your work, I am going to take a look at your approach. > But please help us understand better what the problem actually is, by > giving examples & tests to avoid future regressions and document the > expected behaviour. > > The thing is really complicated and I may miss details so please feel > free to ask questions or provide suggestions. > > Reverting changes and proposing an alternative implementation requires detailed explanation and convincing arguments. It may take a while, but we will try to get through the problems and evaluate the alternative designs. Thanks again for your help! Regards, > Akihiko Odaki > > > > > >> > >> Akihiko Odaki (6): > >> ui/dbus: Share one listener for a console > >> Revert "console: save current scanout details" > >> Revert "ui: split the GL context in a different object" > >> Revert "ui: dispatch GL events to all listeners" > >> Revert "ui: associate GL context outside of display listener > >> registration" > >> Revert "ui: factor out qemu_console_set_display_gl_ctx()" > >> > >> include/ui/console.h | 60 +----- > >> include/ui/egl-context.h | 6 +- > >> include/ui/gtk.h | 11 +- > >> include/ui/sdl2.h | 7 +- > >> include/ui/spice-display.h | 1 - > >> ui/console.c | 258 +++++++---------------- > >> ui/dbus-console.c | 109 ++-------- > >> ui/dbus-listener.c | 417 +++++++++++++++++++++++++++---------- > >> ui/dbus.c | 22 -- > >> ui/dbus.h | 36 +++- > >> ui/egl-context.c | 6 +- > >> ui/egl-headless.c | 20 +- > >> ui/gtk-egl.c | 10 +- > >> ui/gtk-gl-area.c | 8 +- > >> ui/gtk.c | 25 +-- > >> ui/sdl2-gl.c | 10 +- > >> ui/sdl2.c | 14 +- > >> ui/spice-display.c | 19 +- > >> 18 files changed, 498 insertions(+), 541 deletions(-) > >> > >> -- > >> 2.32.0 (Apple Git-132) > >> > >> > > > > > > -- > > Marc-André Lureau >
On 2022/02/15 4:49, Marc-André Lureau wrote: > Hi > > On Mon, Feb 14, 2022 at 5:15 PM Akihiko Odaki <akihiko.odaki@gmail.com > <mailto:akihiko.odaki@gmail.com>> wrote: > > On Mon, Feb 14, 2022 at 9:07 PM Marc-André Lureau > <marcandre.lureau@gmail.com <mailto:marcandre.lureau@gmail.com>> wrote: > > > > Hi Akihiko > > > > On Sun, Feb 13, 2022 at 6:44 AM Akihiko Odaki > <akihiko.odaki@gmail.com <mailto:akihiko.odaki@gmail.com>> wrote: > >> > >> ui/dbus required to have multiple DisplayChangeListeners > (possibly with OpenGL) > >> for a console but that caused several problems: > >> - It broke egl-headless, an unusual display which implements > OpenGL rendering > >> for non-OpenGL displays. The code to support multiple > DisplayChangeListeners > >> does not consider the case where non-OpenGL displays listens > OpenGL consoles. > > > > > > Can you provide instructions on what broke? Even better write a > test, please. > > The following command segfaults. Adding a test would be nice, but it > would need a binary which uses OpenGL. > qemu-system-x86_64 -device virtio-gpu-gl-pci -display egl-headless > -vnc :0 -m 8G -cdrom Fedora-Workstation-Live-x86_64-34-1.2.iso -accel > kvm > > > Thanks! > > This is clearly a mistake from commit 7cc712e98 ("ui: dispatch GL events > to all listener"). > > It should have taken into account that some listeners do not have GL > callbacks, and guard the call. > > We should wrap the missing ops->dpy_gl_call() with a if > (ops->dpy_gl_call) ? I'll send a patch for that. The assumption that OpenGL DisplayChangeListener and non-OpenGL DisplayChangeListener are exclusive is now broken so we have to examine if the whole patch series works correct without the assumption. Other problem I have found (and forgot to mention) is: - that console_select and register_displaychangelistener may not call dpy_gfx_switch and call dpy_gl_scanout_texture instead. It is incompatible with non-OpenGL displays and - that the compatibility check breaks if egl-headless is present and a non-OpenGL DisplayChangeListener with con field is being added. By the way, dbus registers a DisplayChangeListener for the size detection, apparently it fails to set "con" field, making it an accidental user of console_select. > > > > > "make check-avocado AVOCADO_TESTS=tests/avocado/virtio-gpu.py", > which covers egl-headless usage, works. > > > >> > >> - Multiple OpenGL DisplayChangeListeners of dbus shares one > DisplaySurface and > >> modifies its texture field, causing OpenGL texture leak and > use-after-free. > > > > > > Again, please provide instructions. I have regularly run -display > dbus with multiple clients and qemu compiled with sanitizers. I > don't see any leak or use after free. > > I doubt sanitizers can find this because it is an OpenGL texture. You > may add a probe around surface_gl_create_texture and > surface_gl_destroy_texture. > > > Indeed, a surface is created on each frame, because we create a 2d > surface on each qemu_console_resize(), which is called at each virgl > scanout. This is a regression introduced by commit ebced09185 ("console: > save current scanout details"). I can propose an easy fix, please check it. > > And the root of the leak is actually surface_gl_create_texutre(), it > should be idempotent, just like destroy(). That is not enough since it may leave the texture present after unregister_displaychangelistener. And calling surface_gl_destroy_texture() before unregister_displaychangelistener may break the other listeners. > > > > > >> > >> - Introduced extra code to check the compatibility of OpenGL > context providers > >> and OpenGL texture renderers where those are often inherently > tightly coupled > >> since they must share the same hardware. > > > > > > So code checks are meant to prevent misusage. They might be too > limited or broken in some ways, but reverting is likely going to > introduce other regressions I was trying to fix. > > The misuse will not occur because DisplayChangeListeners will be > merged with OpenGL context providers. > > > Ok, but aren't the checks enough to prevent it already? I have to check > the use cases and differences of design, but you might be right that we > don't need such a split after all. Yes, the point is that it requires extra code. > > > > > >> - Needed extra work to broadcast the same change to multiple > dbus listeners. > >> > > > > Compared to what? > > Compared to sharing one DisplayChangeListener for multiple dbus > listeners. > > > Well, you just moved the problem to the dbus display, not removed any work. > > What we have currently is more generic, you should be able to add/remove > various listeners (in theory, we only really do it for dbus at this point). The each DisplayChangeListeners have to upload the DisplaySurface to the graphics accelerator, create a DMA-BUF file descriptor, and make it suitable for D-Bus delivery. The duplicate work can be just done once if we have only one DisplayChangeListener for one console. > > > > > >> > >> This series solve them by implementing the change broadcast in > ui/dbus, which > >> knows exactly what is needed. Changes for the common code to > support multiple > >> OpenGL DisplayChangeListeners were reverted to fix egl-headless > and reduce > >> the code size. > > > > > > Thanks a lot for your work, I am going to take a look at your > approach. But please help us understand better what the problem > actually is, by giving examples & tests to avoid future regressions > and document the expected behaviour. > > The thing is really complicated and I may miss details so please feel > free to ask questions or provide suggestions. > > > Reverting changes and proposing an alternative implementation requires > detailed explanation and convincing arguments. It may take a while, but > we will try to get through the problems and evaluate the alternative > designs. Thanks again for your help! Rather, I think proposing a large change to common console code requires thorough examination and it should be reverted before it reaches the release if it is doubtful that it is correct and reduces the complexity of a few displays (possibly in the future). "No regression" should come first before fix and feature. We can always revisit the change land it in a proper form even after reverting the change. Regards, Akihiko Odaki > > Regards, > Akihiko Odaki > > > > > >> > >> Akihiko Odaki (6): > >> ui/dbus: Share one listener for a console > >> Revert "console: save current scanout details" > >> Revert "ui: split the GL context in a different object" > >> Revert "ui: dispatch GL events to all listeners" > >> Revert "ui: associate GL context outside of display listener > >> registration" > >> Revert "ui: factor out qemu_console_set_display_gl_ctx()" > >> > >> include/ui/console.h | 60 +----- > >> include/ui/egl-context.h | 6 +- > >> include/ui/gtk.h | 11 +- > >> include/ui/sdl2.h | 7 +- > >> include/ui/spice-display.h | 1 - > >> ui/console.c | 258 +++++++---------------- > >> ui/dbus-console.c | 109 ++-------- > >> ui/dbus-listener.c | 417 > +++++++++++++++++++++++++++---------- > >> ui/dbus.c | 22 -- > >> ui/dbus.h | 36 +++- > >> ui/egl-context.c | 6 +- > >> ui/egl-headless.c | 20 +- > >> ui/gtk-egl.c | 10 +- > >> ui/gtk-gl-area.c | 8 +- > >> ui/gtk.c | 25 +-- > >> ui/sdl2-gl.c | 10 +- > >> ui/sdl2.c | 14 +- > >> ui/spice-display.c | 19 +- > >> 18 files changed, 498 insertions(+), 541 deletions(-) > >> > >> -- > >> 2.32.0 (Apple Git-132) > >> > >> > > > > > > -- > > Marc-André Lureau > > > > -- > Marc-André Lureau
Hi On Tue, Feb 15, 2022 at 7:09 AM Akihiko Odaki <akihiko.odaki@gmail.com> wrote: > > > On 2022/02/15 4:49, Marc-André Lureau wrote: > > Hi > > > > On Mon, Feb 14, 2022 at 5:15 PM Akihiko Odaki <akihiko.odaki@gmail.com > > <mailto:akihiko.odaki@gmail.com>> wrote: > > > > On Mon, Feb 14, 2022 at 9:07 PM Marc-André Lureau > > <marcandre.lureau@gmail.com <mailto:marcandre.lureau@gmail.com>> > wrote: > > > > > > Hi Akihiko > > > > > > On Sun, Feb 13, 2022 at 6:44 AM Akihiko Odaki > > <akihiko.odaki@gmail.com <mailto:akihiko.odaki@gmail.com>> wrote: > > >> > > >> ui/dbus required to have multiple DisplayChangeListeners > > (possibly with OpenGL) > > >> for a console but that caused several problems: > > >> - It broke egl-headless, an unusual display which implements > > OpenGL rendering > > >> for non-OpenGL displays. The code to support multiple > > DisplayChangeListeners > > >> does not consider the case where non-OpenGL displays listens > > OpenGL consoles. > > > > > > > > > Can you provide instructions on what broke? Even better write a > > test, please. > > > > The following command segfaults. Adding a test would be nice, but it > > would need a binary which uses OpenGL. > > qemu-system-x86_64 -device virtio-gpu-gl-pci -display egl-headless > > -vnc :0 -m 8G -cdrom Fedora-Workstation-Live-x86_64-34-1.2.iso -accel > > kvm > > > > > > Thanks! > > > > This is clearly a mistake from commit 7cc712e98 ("ui: dispatch GL events > > to all listener"). > > > > It should have taken into account that some listeners do not have GL > > callbacks, and guard the call. > > > > We should wrap the missing ops->dpy_gl_call() with a if > > (ops->dpy_gl_call) ? I'll send a patch for that. > > > The assumption that OpenGL DisplayChangeListener and non-OpenGL > DisplayChangeListener are exclusive is now broken so we have to examine > Before the changes, there was already such a GL & non-GL listener mixed situation. It's only because the first listener with GL would be registered in register_displaychangelistener() that it "worked". if the whole patch series works correct without the assumption. Other > problem I have found (and forgot to mention) is: > - that console_select and register_displaychangelistener may not call > dpy_gfx_switch and call dpy_gl_scanout_texture instead. It is > incompatible with non-OpenGL displays and > Can you translate that to a reproducible test with expected outcome? > - that the compatibility check breaks if egl-headless is present and a > non-OpenGL DisplayChangeListener with con field is being added. > > Same, thanks > By the way, dbus registers a DisplayChangeListener for the size > detection, apparently it fails to set "con" field, making it an > accidental user of console_select. > I need more details, please. > > > > > > > > > "make check-avocado AVOCADO_TESTS=tests/avocado/virtio-gpu.py", > > which covers egl-headless usage, works. > > > > > >> > > >> - Multiple OpenGL DisplayChangeListeners of dbus shares one > > DisplaySurface and > > >> modifies its texture field, causing OpenGL texture leak and > > use-after-free. > > > > > > > > > Again, please provide instructions. I have regularly run -display > > dbus with multiple clients and qemu compiled with sanitizers. I > > don't see any leak or use after free. > > > > I doubt sanitizers can find this because it is an OpenGL texture. You > > may add a probe around surface_gl_create_texture and > > surface_gl_destroy_texture. > > > > > > Indeed, a surface is created on each frame, because we create a 2d > > surface on each qemu_console_resize(), which is called at each virgl > > scanout. This is a regression introduced by commit ebced09185 ("console: > > save current scanout details"). I can propose an easy fix, please check > it. > > > > And the root of the leak is actually surface_gl_create_texutre(), it > > should be idempotent, just like destroy(). > > That is not enough since it may leave the texture present after > unregister_displaychangelistener. And calling > surface_gl_destroy_texture() before unregister_displaychangelistener may > break the other listeners. > The texture is shared, but owned by the console. It is not leaked. However, given that it is shared, it would be indeed better to refcount the users to avoid destroying the texture by one listener going away. > > > > > > > > > >> > > >> - Introduced extra code to check the compatibility of OpenGL > > context providers > > >> and OpenGL texture renderers where those are often inherently > > tightly coupled > > >> since they must share the same hardware. > > > > > > > > > So code checks are meant to prevent misusage. They might be too > > limited or broken in some ways, but reverting is likely going to > > introduce other regressions I was trying to fix. > > > > The misuse will not occur because DisplayChangeListeners will be > > merged with OpenGL context providers. > > > > > > Ok, but aren't the checks enough to prevent it already? I have to check > > the use cases and differences of design, but you might be right that we > > don't need such a split after all. > > Yes, the point is that it requires extra code. > > > > > > > > > > >> - Needed extra work to broadcast the same change to multiple > > dbus listeners. > > >> > > > > > > Compared to what? > > > > Compared to sharing one DisplayChangeListener for multiple dbus > > listeners. > > > > > > Well, you just moved the problem to the dbus display, not removed any > work. > > > > What we have currently is more generic, you should be able to add/remove > > various listeners (in theory, we only really do it for dbus at this > point). > > The each DisplayChangeListeners have to upload the DisplaySurface to the > graphics accelerator, create a DMA-BUF file descriptor, and make it > suitable for D-Bus delivery. The duplicate work can be just done once if > we have only one DisplayChangeListener for one console. > And we should avoid duplicating the texture/resources if possible. This is not specific to DBus, it's the reason why I would rather have the logic in the console code so we avoid code duplication and we can do further improvement at the lower-level. > > > > > > > > > > >> > > >> This series solve them by implementing the change broadcast in > > ui/dbus, which > > >> knows exactly what is needed. Changes for the common code to > > support multiple > > >> OpenGL DisplayChangeListeners were reverted to fix egl-headless > > and reduce > > >> the code size. > > > > > > > > > Thanks a lot for your work, I am going to take a look at your > > approach. But please help us understand better what the problem > > actually is, by giving examples & tests to avoid future regressions > > and document the expected behaviour. > > > > The thing is really complicated and I may miss details so please feel > > free to ask questions or provide suggestions. > > > > > > Reverting changes and proposing an alternative implementation requires > > detailed explanation and convincing arguments. It may take a while, but > > we will try to get through the problems and evaluate the alternative > > designs. Thanks again for your help! > > Rather, I think proposing a large change to common console code requires > thorough examination and it should be reverted before it reaches the > release if it is doubtful that it is correct and reduces the complexity > of a few displays (possibly in the future). "No regression" should come > first before fix and feature. We can always revisit the change land it > in a proper form even after reverting the change. > > That's not how we usually work, there was a long RFC&PATCH review period during which testing was done. It's true that Gerd, the maintainer, didn't do a thorough review though. If we have to revert, we can do it before the release. However, I would rather fix the current design since it is meant to be more generic & flexible. We still have some time. Thanks again for your help Regards, > Akihiko Odaki > > > > > Regards, > > Akihiko Odaki > > > > > > > > > >> > > >> Akihiko Odaki (6): > > >> ui/dbus: Share one listener for a console > > >> Revert "console: save current scanout details" > > >> Revert "ui: split the GL context in a different object" > > >> Revert "ui: dispatch GL events to all listeners" > > >> Revert "ui: associate GL context outside of display listener > > >> registration" > > >> Revert "ui: factor out qemu_console_set_display_gl_ctx()" > > >> > > >> include/ui/console.h | 60 +----- > > >> include/ui/egl-context.h | 6 +- > > >> include/ui/gtk.h | 11 +- > > >> include/ui/sdl2.h | 7 +- > > >> include/ui/spice-display.h | 1 - > > >> ui/console.c | 258 +++++++---------------- > > >> ui/dbus-console.c | 109 ++-------- > > >> ui/dbus-listener.c | 417 > > +++++++++++++++++++++++++++---------- > > >> ui/dbus.c | 22 -- > > >> ui/dbus.h | 36 +++- > > >> ui/egl-context.c | 6 +- > > >> ui/egl-headless.c | 20 +- > > >> ui/gtk-egl.c | 10 +- > > >> ui/gtk-gl-area.c | 8 +- > > >> ui/gtk.c | 25 +-- > > >> ui/sdl2-gl.c | 10 +- > > >> ui/sdl2.c | 14 +- > > >> ui/spice-display.c | 19 +- > > >> 18 files changed, 498 insertions(+), 541 deletions(-) > > >> > > >> -- > > >> 2.32.0 (Apple Git-132) > > >> > > >> > > > > > > > > > -- > > > Marc-André Lureau > > > > > > > > -- > > Marc-André Lureau >
On 2022/02/17 1:51, Marc-André Lureau wrote: > Hi Akihiko, > > (I suppose you meant to reply-all, feel free to add back qemu-devel) My bad, restored Ccs. > > On Tue, Feb 15, 2022 at 5:56 PM Akihiko Odaki <akihiko.odaki@gmail.com > <mailto:akihiko.odaki@gmail.com>> wrote: > > On Tue, Feb 15, 2022 at 10:32 PM Marc-André Lureau > <marcandre.lureau@gmail.com <mailto:marcandre.lureau@gmail.com>> wrote: > > > > Hi > > > > On Tue, Feb 15, 2022 at 7:09 AM Akihiko Odaki > <akihiko.odaki@gmail.com <mailto:akihiko.odaki@gmail.com>> wrote: > >> > >> > >> > >> On 2022/02/15 4:49, Marc-André Lureau wrote: > >> > Hi > >> > > >> > On Mon, Feb 14, 2022 at 5:15 PM Akihiko Odaki > <akihiko.odaki@gmail.com <mailto:akihiko.odaki@gmail.com> > >> > <mailto:akihiko.odaki@gmail.com > <mailto:akihiko.odaki@gmail.com>>> wrote: > >> > > >> > On Mon, Feb 14, 2022 at 9:07 PM Marc-André Lureau > >> > <marcandre.lureau@gmail.com > <mailto:marcandre.lureau@gmail.com> > <mailto:marcandre.lureau@gmail.com > <mailto:marcandre.lureau@gmail.com>>> wrote: > >> > > > >> > > Hi Akihiko > >> > > > >> > > On Sun, Feb 13, 2022 at 6:44 AM Akihiko Odaki > >> > <akihiko.odaki@gmail.com <mailto:akihiko.odaki@gmail.com> > <mailto:akihiko.odaki@gmail.com <mailto:akihiko.odaki@gmail.com>>> > wrote: > >> > >> > >> > >> ui/dbus required to have multiple DisplayChangeListeners > >> > (possibly with OpenGL) > >> > >> for a console but that caused several problems: > >> > >> - It broke egl-headless, an unusual display which > implements > >> > OpenGL rendering > >> > >> for non-OpenGL displays. The code to support multiple > >> > DisplayChangeListeners > >> > >> does not consider the case where non-OpenGL displays > listens > >> > OpenGL consoles. > >> > > > >> > > > >> > > Can you provide instructions on what broke? Even better > write a > >> > test, please. > >> > > >> > The following command segfaults. Adding a test would be > nice, but it > >> > would need a binary which uses OpenGL. > >> > qemu-system-x86_64 -device virtio-gpu-gl-pci -display > egl-headless > >> > -vnc :0 -m 8G -cdrom > Fedora-Workstation-Live-x86_64-34-1.2.iso -accel > >> > kvm > >> > > >> > > >> > Thanks! > >> > > >> > This is clearly a mistake from commit 7cc712e98 ("ui: dispatch > GL events > >> > to all listener"). > >> > > >> > It should have taken into account that some listeners do not > have GL > >> > callbacks, and guard the call. > >> > > >> > We should wrap the missing ops->dpy_gl_call() with a if > >> > (ops->dpy_gl_call) ? I'll send a patch for that. > >> > >> > >> The assumption that OpenGL DisplayChangeListener and non-OpenGL > >> DisplayChangeListener are exclusive is now broken so we have to > examine > > > > > > Before the changes, there was already such a GL & non-GL listener > mixed situation. It's only because the first listener with GL would > be registered in register_displaychangelistener() that it "worked". > > I mean the dbus patch series should be reexamined since it does not > seem to consider the case. > > > That's what I am doing, with your very good help. > > > > > >> if the whole patch series works correct without the assumption. > Other > >> problem I have found (and forgot to mention) is: > >> - that console_select and register_displaychangelistener may not > call > >> dpy_gfx_switch and call dpy_gl_scanout_texture instead. It is > >> incompatible with non-OpenGL displays and > > > > > > Can you translate that to a reproducible test with expected outcome? > > vnc has the feature to switch consoles with Ctrl+Alt+[1-9] if it is > not bound to a particular console. Invoking the functionality with > egl-headless enabled would trigger the problem. > > > That doesn't seem to happen after the fixes ([PATCH 0/3] GL console > related fixes) I posted. > > > > >> > >> - that the compatibility check breaks if egl-headless is present > and a > >> non-OpenGL DisplayChangeListener with con field is being added. > >> > > > > Same, thanks > > The following command should do: > qemu-system-x86_64 -device virtio-gpu-gl-pci,id=d -display > egl-headless -vnc :0,display=d -m 8G -cdrom > Fedora-Workstation-Live-x86_64-34-1.2.iso -accel kvm > > > Thanks, that helps! I didn't realize DCL had "dynamic" consoles and the > compatibility checks were bypassed without specifying a specific display=d. > > I'll address this in the v2 of my patch fixes series. Please check it. > > > > >> > >> By the way, dbus registers a DisplayChangeListener for the size > >> detection, apparently it fails to set "con" field, making it an > >> accidental user of console_select. > > > > > > I need more details, please. > > A DisplayChangeListener without con field set is passed to > register_displaychangelistener in dbus_display_console_new. > > > Good catch, should be fixed too. > > > > > >> > >> > >> > > >> > > > >> > > "make check-avocado > AVOCADO_TESTS=tests/avocado/virtio-gpu.py", > >> > which covers egl-headless usage, works. > >> > > > >> > >> > >> > >> - Multiple OpenGL DisplayChangeListeners of dbus > shares one > >> > DisplaySurface and > >> > >> modifies its texture field, causing OpenGL texture > leak and > >> > use-after-free. > >> > > > >> > > > >> > > Again, please provide instructions. I have regularly > run -display > >> > dbus with multiple clients and qemu compiled with > sanitizers. I > >> > don't see any leak or use after free. > >> > > >> > I doubt sanitizers can find this because it is an OpenGL > texture. You > >> > may add a probe around surface_gl_create_texture and > >> > surface_gl_destroy_texture. > >> > > >> > > >> > Indeed, a surface is created on each frame, because we create a 2d > >> > surface on each qemu_console_resize(), which is called at each > virgl > >> > scanout. This is a regression introduced by commit ebced09185 > ("console: > >> > save current scanout details"). I can propose an easy fix, > please check it. > >> > > >> > And the root of the leak is actually > surface_gl_create_texutre(), it > >> > should be idempotent, just like destroy(). > >> > >> That is not enough since it may leave the texture present after > >> unregister_displaychangelistener. And calling > >> surface_gl_destroy_texture() before > unregister_displaychangelistener may > >> break the other listeners. > > > > > > The texture is shared, but owned by the console. It is not leaked. > > > > However, given that it is shared, it would be indeed better to > refcount the users to avoid destroying the texture by one listener > going away. > > Nobody destroys the texture in the case so it is a leak. Particularly > after dpy_gfx_replace_surface, even the reference which remains in > DisplaySurface will be gone. > > > We need a QemuConsole finalizer to call qemu_free_displaysurface(). > The UI code needs to evolve to become more robust in dynamic situations. > Let's do that. QemuConsole gets never gone (if it does it would also break other displays), so adding a code to call surface_gl_destroy_texutre before unregister_displaychangelistener should be enough. However, doing so would break the other OpenGL DisplayChangeListeners of the same console if they exist. Allowing to add/remove QemuConsole would be nice for hotplugging and it should also make the code clear overall by being more explicit about the lifetime of the resources, but that needs to update most (probably all) displays. > > > > > >> > >> > > >> > > >> > > > >> > >> > >> > >> - Introduced extra code to check the compatibility of > OpenGL > >> > context providers > >> > >> and OpenGL texture renderers where those are often > inherently > >> > tightly coupled > >> > >> since they must share the same hardware. > >> > > > >> > > > >> > > So code checks are meant to prevent misusage. They > might be too > >> > limited or broken in some ways, but reverting is likely > going to > >> > introduce other regressions I was trying to fix. > >> > > >> > The misuse will not occur because DisplayChangeListeners > will be > >> > merged with OpenGL context providers. > >> > > >> > > >> > Ok, but aren't the checks enough to prevent it already? I have > to check > >> > the use cases and differences of design, but you might be > right that we > >> > don't need such a split after all. > >> > >> Yes, the point is that it requires extra code. > >> > >> > > >> > > >> > > > >> > >> - Needed extra work to broadcast the same change to > multiple > >> > dbus listeners. > >> > >> > >> > > > >> > > Compared to what? > >> > > >> > Compared to sharing one DisplayChangeListener for multiple > dbus > >> > listeners. > >> > > >> > > >> > Well, you just moved the problem to the dbus display, not > removed any work. > >> > > >> > What we have currently is more generic, you should be able to > add/remove > >> > various listeners (in theory, we only really do it for dbus at > this point). > >> > >> The each DisplayChangeListeners have to upload the > DisplaySurface to the > >> graphics accelerator, create a DMA-BUF file descriptor, and make it > >> suitable for D-Bus delivery. The duplicate work can be just done > once if > >> we have only one DisplayChangeListener for one console. > > > > > > And we should avoid duplicating the texture/resources if > possible. This is not specific to DBus, it's the reason why I would > rather have the logic in the console code so we avoid code > duplication and we can do further improvement at the lower-level. > > The resources created are specific to D-Bus. Not all displays have > GUnixFDList, DMA-BUF file descriptors and OpenGL textures. > > > GUnixFDList is in dbus code. dmabuf fd & opengl textures are not > specific to dbus, they are used by spice as well, for example. > You have to consider about the case where DMA-BUF is not used or OpenGL is not enabled if we try to remove the duplicate work in the common code. It would result in much less code if you reuse GUnixFDList in dbus, and that is what this patch series do. > > Regards, > Akihiko Odaki > > > > >> > >> > >> > > >> > > >> > > > >> > >> > >> > >> This series solve them by implementing the change > broadcast in > >> > ui/dbus, which > >> > >> knows exactly what is needed. Changes for the common > code to > >> > support multiple > >> > >> OpenGL DisplayChangeListeners were reverted to fix > egl-headless > >> > and reduce > >> > >> the code size. > >> > > > >> > > > >> > > Thanks a lot for your work, I am going to take a look > at your > >> > approach. But please help us understand better what the > problem > >> > actually is, by giving examples & tests to avoid future > regressions > >> > and document the expected behaviour. > >> > > >> > The thing is really complicated and I may miss details so > please feel > >> > free to ask questions or provide suggestions. > >> > > >> > > >> > Reverting changes and proposing an alternative implementation > requires > >> > detailed explanation and convincing arguments. It may take a > while, but > >> > we will try to get through the problems and evaluate the > alternative > >> > designs. Thanks again for your help! > >> > >> Rather, I think proposing a large change to common console code > requires > >> thorough examination and it should be reverted before it reaches the > >> release if it is doubtful that it is correct and reduces the > complexity > >> of a few displays (possibly in the future). "No regression" > should come > >> first before fix and feature. We can always revisit the change > land it > >> in a proper form even after reverting the change. > >> > > > > That's not how we usually work, there was a long RFC&PATCH review > period during which testing was done. It's true that Gerd, the > maintainer, didn't do a thorough review though. If we have to > revert, we can do it before the release. However, I would rather fix > the current design since it is meant to be more generic & flexible. > We still have some time. > > > > Thanks again for your help > > > >> Regards, > >> Akihiko Odaki > >> > >> > > >> > Regards, > >> > Akihiko Odaki > >> > > >> > > >> > > > >> > >> > >> > >> Akihiko Odaki (6): > >> > >> ui/dbus: Share one listener for a console > >> > >> Revert "console: save current scanout details" > >> > >> Revert "ui: split the GL context in a different object" > >> > >> Revert "ui: dispatch GL events to all listeners" > >> > >> Revert "ui: associate GL context outside of display > listener > >> > >> registration" > >> > >> Revert "ui: factor out > qemu_console_set_display_gl_ctx()" > >> > >> > >> > >> include/ui/console.h | 60 +----- > >> > >> include/ui/egl-context.h | 6 +- > >> > >> include/ui/gtk.h | 11 +- > >> > >> include/ui/sdl2.h | 7 +- > >> > >> include/ui/spice-display.h | 1 - > >> > >> ui/console.c | 258 +++++++---------------- > >> > >> ui/dbus-console.c | 109 ++-------- > >> > >> ui/dbus-listener.c | 417 > >> > +++++++++++++++++++++++++++---------- > >> > >> ui/dbus.c | 22 -- > >> > >> ui/dbus.h | 36 +++- > >> > >> ui/egl-context.c | 6 +- > >> > >> ui/egl-headless.c | 20 +- > >> > >> ui/gtk-egl.c | 10 +- > >> > >> ui/gtk-gl-area.c | 8 +- > >> > >> ui/gtk.c | 25 +-- > >> > >> ui/sdl2-gl.c | 10 +- > >> > >> ui/sdl2.c | 14 +- > >> > >> ui/spice-display.c | 19 +- > >> > >> 18 files changed, 498 insertions(+), 541 deletions(-) > >> > >> > >> > >> -- > >> > >> 2.32.0 (Apple Git-132) > >> > >> > >> > >> > >> > > > >> > > > >> > > -- > >> > > Marc-André Lureau > >> > > >> > > >> > > >> > -- > >> > Marc-André Lureau > > > > > > > > -- > > Marc-André Lureau > > > > -- > Marc-André Lureau