Message ID | 20240312140216.313618-6-marcandre.lureau@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [PULL,1/5] ui/vnc: Respect bound console | expand |
Am 12.03.24 um 15:02 schrieb marcandre.lureau@redhat.com: > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > The current post-loading code for scanout has a FIXME: it doesn't take > the resource region/rect into account. But there is more, when adding > blob migration support in commit f66767f75c9, I didn't realize that blob > resources could be used for scanouts. This situationn leads to a crash > during post-load, as they don't have an associated res->image. > > virtio_gpu_do_set_scanout() handle all cases, but requires the > associated virtio_gpu_framebuffer, which is currently not saved during > migration. > > Add a v2 of "virtio-gpu-one-scanout" with the framebuffer fields, so we > can restore blob scanouts, as well as fixing the existing FIXME. > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> > Reviewed-by: Sebastian Ott <sebott@redhat.com> Hi, unfortunately, this broke migration from pre-9.0 to 9.0: > vmstate_load_state_field virtio-gpu:virtio-gpu > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.enable > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.conf.max_outputs > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.scanout > vmstate_load_state_field virtio-gpu-one-scanout:resource_id > vmstate_load_state_field virtio-gpu-one-scanout:width > vmstate_load_state_field virtio-gpu-one-scanout:height > vmstate_load_state_field virtio-gpu-one-scanout:x > vmstate_load_state_field virtio-gpu-one-scanout:y > vmstate_load_state_field virtio-gpu-one-scanout:cursor.resource_id > vmstate_load_state_field virtio-gpu-one-scanout:cursor.hot_x > vmstate_load_state_field virtio-gpu-one-scanout:cursor.hot_y > vmstate_load_state_field virtio-gpu-one-scanout:cursor.pos.x > vmstate_load_state_field virtio-gpu-one-scanout:cursor.pos.y > vmstate_load_state_field virtio-gpu-one-scanout:fb.format > vmstate_load_state_field virtio-gpu-one-scanout:fb.bytes_pp > vmstate_load_state_field virtio-gpu-one-scanout:fb.width > vmstate_load_state_field virtio-gpu-one-scanout:fb.height > vmstate_load_state_field virtio-gpu-one-scanout:fb.stride > vmstate_load_state_field virtio-gpu-one-scanout:fb.offset > qemu-system-x86_64: Missing section footer for 0000:00:02.0/virtio-gpu > qemu-system-x86_64: Error -22 while loading VM state It wrongly tries to load the fb fields even though they should be guarded by version 2. Looking at it with GDB, in vmstate_load_state(), when we come to field->name == parent_obj.scanout, the > } else if (field->flags & VMS_STRUCT) { > ret = vmstate_load_state(f, field->vmsd, curr_elem, > field->vmsd->version_id); branch will be taken and suddenly we'll have a call to vmstate_load_state() for vmsd==vmstate_virtio_gpu_scanout with version_id==2 rather than version_id==1, because that is field->vmsd->version_id (i.e. the .version_id in VMStateDescription vmstate_virtio_gpu_scanout). Would it have been necessary to version the VMStateDescription vmstate_virtio_gpu_scanouts too using VMS_VSTRUCT (or am I misinterpreting the use case for that)? Best Regards, Fiona
On Tue, Apr 30, 2024 at 02:30:19PM +0200, Fiona Ebner wrote: > Am 12.03.24 um 15:02 schrieb marcandre.lureau@redhat.com: > > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > > > The current post-loading code for scanout has a FIXME: it doesn't take > > the resource region/rect into account. But there is more, when adding > > blob migration support in commit f66767f75c9, I didn't realize that blob > > resources could be used for scanouts. This situationn leads to a crash > > during post-load, as they don't have an associated res->image. > > > > virtio_gpu_do_set_scanout() handle all cases, but requires the > > associated virtio_gpu_framebuffer, which is currently not saved during > > migration. > > > > Add a v2 of "virtio-gpu-one-scanout" with the framebuffer fields, so we > > can restore blob scanouts, as well as fixing the existing FIXME. > > > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> > > Reviewed-by: Sebastian Ott <sebott@redhat.com> > > Hi, > unfortunately, this broke migration from pre-9.0 to 9.0: > > > vmstate_load_state_field virtio-gpu:virtio-gpu > > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.enable > > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.conf.max_outputs > > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.scanout > > vmstate_load_state_field virtio-gpu-one-scanout:resource_id > > vmstate_load_state_field virtio-gpu-one-scanout:width > > vmstate_load_state_field virtio-gpu-one-scanout:height > > vmstate_load_state_field virtio-gpu-one-scanout:x > > vmstate_load_state_field virtio-gpu-one-scanout:y > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.resource_id > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.hot_x > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.hot_y > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.pos.x > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.pos.y > > vmstate_load_state_field virtio-gpu-one-scanout:fb.format > > vmstate_load_state_field virtio-gpu-one-scanout:fb.bytes_pp > > vmstate_load_state_field virtio-gpu-one-scanout:fb.width > > vmstate_load_state_field virtio-gpu-one-scanout:fb.height > > vmstate_load_state_field virtio-gpu-one-scanout:fb.stride > > vmstate_load_state_field virtio-gpu-one-scanout:fb.offset > > qemu-system-x86_64: Missing section footer for 0000:00:02.0/virtio-gpu > > qemu-system-x86_64: Error -22 while loading VM state > > It wrongly tries to load the fb fields even though they should be > guarded by version 2. > > Looking at it with GDB, in vmstate_load_state(), when we come to > field->name == parent_obj.scanout, the > > > } else if (field->flags & VMS_STRUCT) { > > ret = vmstate_load_state(f, field->vmsd, curr_elem, > > field->vmsd->version_id); > > branch will be taken and suddenly we'll have a call to > vmstate_load_state() for vmsd==vmstate_virtio_gpu_scanout with > version_id==2 rather than version_id==1, because that is > field->vmsd->version_id (i.e. the .version_id in VMStateDescription > vmstate_virtio_gpu_scanout). > > Would it have been necessary to version the VMStateDescription > vmstate_virtio_gpu_scanouts too using VMS_VSTRUCT (or am I > misinterpreting the use case for that)? Looks right. And there's only one such user which is when it's introduced in 2018. It's sad we can't simply already use vmsd subsections even if that was there before this VSTRUCT thing, and that should work with internal versioning. Maybe we introduced that because we can't replace a VMS_STRUCT to subsections? https://lore.kernel.org/qemu-devel/1524670052-28373-1-git-send-email-minyard@acm.org/#t OTOH, I don't think vmsd versioning would work for ping-pong migrations. Migrating backwards should fail with 'not supported' with vmsd versionings. Depending on the requirement (in this virtio-gpu case, it looks like applicable to be used in a cluster and doing back-and-forth moves?), we may want to support bi-directional migrations which should be superior. That will need to stick with machine type check (compat fields in hw_compat_*, then conditionally save/load fields) to guarantee migration can work back and forth. Thanks,
Hi On Tue, Apr 30, 2024 at 4:31 PM Fiona Ebner <f.ebner@proxmox.com> wrote: > > Am 12.03.24 um 15:02 schrieb marcandre.lureau@redhat.com: > > From: Marc-André Lureau <marcandre.lureau@redhat.com> > > > > The current post-loading code for scanout has a FIXME: it doesn't take > > the resource region/rect into account. But there is more, when adding > > blob migration support in commit f66767f75c9, I didn't realize that blob > > resources could be used for scanouts. This situationn leads to a crash > > during post-load, as they don't have an associated res->image. > > > > virtio_gpu_do_set_scanout() handle all cases, but requires the > > associated virtio_gpu_framebuffer, which is currently not saved during > > migration. > > > > Add a v2 of "virtio-gpu-one-scanout" with the framebuffer fields, so we > > can restore blob scanouts, as well as fixing the existing FIXME. > > > > Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> > > Reviewed-by: Sebastian Ott <sebott@redhat.com> > > Hi, > unfortunately, this broke migration from pre-9.0 to 9.0: > > > vmstate_load_state_field virtio-gpu:virtio-gpu > > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.enable > > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.conf.max_outputs > > vmstate_load_state_field virtio-gpu-scanouts:parent_obj.scanout > > vmstate_load_state_field virtio-gpu-one-scanout:resource_id > > vmstate_load_state_field virtio-gpu-one-scanout:width > > vmstate_load_state_field virtio-gpu-one-scanout:height > > vmstate_load_state_field virtio-gpu-one-scanout:x > > vmstate_load_state_field virtio-gpu-one-scanout:y > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.resource_id > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.hot_x > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.hot_y > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.pos.x > > vmstate_load_state_field virtio-gpu-one-scanout:cursor.pos.y > > vmstate_load_state_field virtio-gpu-one-scanout:fb.format > > vmstate_load_state_field virtio-gpu-one-scanout:fb.bytes_pp > > vmstate_load_state_field virtio-gpu-one-scanout:fb.width > > vmstate_load_state_field virtio-gpu-one-scanout:fb.height > > vmstate_load_state_field virtio-gpu-one-scanout:fb.stride > > vmstate_load_state_field virtio-gpu-one-scanout:fb.offset > > qemu-system-x86_64: Missing section footer for 0000:00:02.0/virtio-gpu > > qemu-system-x86_64: Error -22 while loading VM state > > It wrongly tries to load the fb fields even though they should be > guarded by version 2. > > Looking at it with GDB, in vmstate_load_state(), when we come to > field->name == parent_obj.scanout, the > > > } else if (field->flags & VMS_STRUCT) { > > ret = vmstate_load_state(f, field->vmsd, curr_elem, > > field->vmsd->version_id); > > branch will be taken and suddenly we'll have a call to > vmstate_load_state() for vmsd==vmstate_virtio_gpu_scanout with > version_id==2 rather than version_id==1, because that is > field->vmsd->version_id (i.e. the .version_id in VMStateDescription > vmstate_virtio_gpu_scanout). > > Would it have been necessary to version the VMStateDescription > vmstate_virtio_gpu_scanouts too using VMS_VSTRUCT (or am I > misinterpreting the use case for that)? > Sigh.. this is embarrassing. This patch didn't receive enough testing and breaks pre-9.0/9.0 forward and backward migrations. I think it would be reasonable to revert the patch, but it's already in 9.0. Is that acceptable? There are various issues about versioning: - VMStateDescription (vmsd) version_id are not in the stream (only start & full section headers), so any nested version bump should somehow be bubbled up/down.. - machine versions don't have specific vmsd version_id associations (that I can see, or is there any way to do that?) - virtio-gpu uses even lower-level vmstate API and hardcodes version to 1 in general Ideally, I wished adding versionized fields the way done in this patch would simply work, but this is really, sadly, not supported atm. And it would be great to have some testing for back/forward migrations. Any thoughts, corrections, encouragement perhaps? :) thanks -- Marc-André Lureau
diff --git a/include/hw/virtio/virtio-gpu.h b/include/hw/virtio/virtio-gpu.h index b28e7ef0d2..ed44cdad6b 100644 --- a/include/hw/virtio/virtio-gpu.h +++ b/include/hw/virtio/virtio-gpu.h @@ -81,6 +81,7 @@ struct virtio_gpu_scanout { uint32_t resource_id; struct virtio_gpu_update_cursor cursor; QEMUCursor *current_cursor; + struct virtio_gpu_framebuffer fb; }; struct virtio_gpu_requested_state { diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index ccbe31d759..78d5a4f164 100644 --- a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -600,6 +600,7 @@ static void virtio_unref_resource(pixman_image_t *image, void *data) static void virtio_gpu_update_scanout(VirtIOGPU *g, uint32_t scanout_id, struct virtio_gpu_simple_resource *res, + struct virtio_gpu_framebuffer *fb, struct virtio_gpu_rect *r) { struct virtio_gpu_simple_resource *ores; @@ -617,9 +618,10 @@ static void virtio_gpu_update_scanout(VirtIOGPU *g, scanout->y = r->y; scanout->width = r->width; scanout->height = r->height; + scanout->fb = *fb; } -static void virtio_gpu_do_set_scanout(VirtIOGPU *g, +static bool virtio_gpu_do_set_scanout(VirtIOGPU *g, uint32_t scanout_id, struct virtio_gpu_framebuffer *fb, struct virtio_gpu_simple_resource *res, @@ -645,7 +647,7 @@ static void virtio_gpu_do_set_scanout(VirtIOGPU *g, r->x, r->y, r->width, r->height, fb->width, fb->height); *error = VIRTIO_GPU_RESP_ERR_INVALID_PARAMETER; - return; + return false; } g->parent_obj.enable = 1; @@ -653,11 +655,12 @@ static void virtio_gpu_do_set_scanout(VirtIOGPU *g, if (res->blob) { if (console_has_gl(scanout->con)) { if (!virtio_gpu_update_dmabuf(g, scanout_id, res, fb, r)) { - virtio_gpu_update_scanout(g, scanout_id, res, r); + virtio_gpu_update_scanout(g, scanout_id, res, fb, r); } else { *error = VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY; + return false; } - return; + return true; } data = res->blob; @@ -693,7 +696,8 @@ static void virtio_gpu_do_set_scanout(VirtIOGPU *g, scanout->ds); } - virtio_gpu_update_scanout(g, scanout_id, res, r); + virtio_gpu_update_scanout(g, scanout_id, res, fb, r); + return true; } static void virtio_gpu_set_scanout(VirtIOGPU *g, @@ -1164,7 +1168,8 @@ static void virtio_gpu_cursor_bh(void *opaque) static const VMStateDescription vmstate_virtio_gpu_scanout = { .name = "virtio-gpu-one-scanout", - .version_id = 1, + .version_id = 2, + .minimum_version_id = 1, .fields = (const VMStateField[]) { VMSTATE_UINT32(resource_id, struct virtio_gpu_scanout), VMSTATE_UINT32(width, struct virtio_gpu_scanout), @@ -1176,6 +1181,12 @@ static const VMStateDescription vmstate_virtio_gpu_scanout = { VMSTATE_UINT32(cursor.hot_y, struct virtio_gpu_scanout), VMSTATE_UINT32(cursor.pos.x, struct virtio_gpu_scanout), VMSTATE_UINT32(cursor.pos.y, struct virtio_gpu_scanout), + VMSTATE_UINT32_V(fb.format, struct virtio_gpu_scanout, 2), + VMSTATE_UINT32_V(fb.bytes_pp, struct virtio_gpu_scanout, 2), + VMSTATE_UINT32_V(fb.width, struct virtio_gpu_scanout, 2), + VMSTATE_UINT32_V(fb.height, struct virtio_gpu_scanout, 2), + VMSTATE_UINT32_V(fb.stride, struct virtio_gpu_scanout, 2), + VMSTATE_UINT32_V(fb.offset, struct virtio_gpu_scanout, 2), VMSTATE_END_OF_LIST() }, }; @@ -1347,6 +1358,7 @@ static int virtio_gpu_blob_save(QEMUFile *f, void *opaque, size_t size, if (!res->blob_size) { continue; } + assert(!res->image); qemu_put_be32(f, res->resource_id); qemu_put_be32(f, res->blob_size); qemu_put_be32(f, res->iov_cnt); @@ -1409,21 +1421,40 @@ static int virtio_gpu_post_load(void *opaque, int version_id) int i; for (i = 0; i < g->parent_obj.conf.max_outputs; i++) { - /* FIXME: should take scanout.r.{x,y} into account */ scanout = &g->parent_obj.scanout[i]; if (!scanout->resource_id) { continue; } + res = virtio_gpu_find_resource(g, scanout->resource_id); if (!res) { return -EINVAL; } - scanout->ds = qemu_create_displaysurface_pixman(res->image); + + if (scanout->fb.format != 0) { + uint32_t error = 0; + struct virtio_gpu_rect r = { + .x = scanout->x, + .y = scanout->y, + .width = scanout->width, + .height = scanout->height + }; + + if (!virtio_gpu_do_set_scanout(g, i, &scanout->fb, res, &r, &error)) { + return -EINVAL; + } + } else { + /* legacy v1 migration support */ + if (!res->image) { + return -EINVAL; + } + scanout->ds = qemu_create_displaysurface_pixman(res->image); #ifdef WIN32 - qemu_displaysurface_win32_set_handle(scanout->ds, res->handle, 0); + qemu_displaysurface_win32_set_handle(scanout->ds, res->handle, 0); #endif + dpy_gfx_replace_surface(scanout->con, scanout->ds); + } - dpy_gfx_replace_surface(scanout->con, scanout->ds); dpy_gfx_update_full(scanout->con); if (scanout->cursor.resource_id) { update_cursor(g, &scanout->cursor);