Message ID | 20230706075612.67404-5-david@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Series | virtio-mem: Support "x-ignore-shared" migration |
David Hildenbrand <david@redhat.com> wrote:
> To achieve desired "x-ignore-shared" functionality, we should not
> discard all RAM when realizing the device and not mess with
> preallocation/postcopy when loading device state. In essence, we should
> not touch RAM content.
>
> As "x-ignore-shared" gets set after realizing the device, we cannot
> rely on that. Let's simply skip discarding of RAM on incoming migration.
> Note that virtio_mem_post_load() will call
> virtio_mem_restore_unplugged() -- unless "x-ignore-shared" is set. So
> once migration finished we'll have a consistent state.
>
> The initial system reset will also not discard any RAM, because
> virtio_mem_unplug_all() will not call virtio_mem_unplug_all() when no
> memory is plugged (which is the case before loading the device state).
>
> Note that something like VM templating -- see commit b17fbbe55cba
> ("migration: allow private destination ram with x-ignore-shared")

And here I am, I reviewed the patch, and 4 years later I don't remember
anything about it O:-)

> -- is
> currently incompatible with virtio-mem and ram_block_discard_range() will
> warn in case a private file mapping is supplied by virtio-mem.

If it is incompatible, only a warning is not enough.

>
> For VM templating with virtio-mem, it makes more sense to either
> (a) Create the template without the virtio-mem device and hotplug a
>     virtio-mem device to the new VM instances using proper own memory
>     backend.
> (b) Use a virtio-mem device that doesn't provide any memory in the
>     template (requested-size=0) and use private anonymous memory.
>
> Tested-by: Mario Casquero <mcasquer@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/virtio/virtio-mem.c | 47 ++++++++++++++++++++++++++++++++++--------
>  1 file changed, 38 insertions(+), 9 deletions(-)
>
> diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
> index a922c21380..3f41e00e74 100644
> --- a/hw/virtio/virtio-mem.c
> +++ b/hw/virtio/virtio-mem.c
> @@ -18,6 +18,7 @@
>  #include "sysemu/numa.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/reset.h"
> +#include "sysemu/runstate.h"
>  #include "hw/virtio/virtio.h"
>  #include "hw/virtio/virtio-bus.h"
>  #include "hw/virtio/virtio-mem.h"
> @@ -901,11 +902,23 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
>          return;
>      }
>
> -    ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb));
> -    if (ret) {
> -        error_setg_errno(errp, -ret, "Unexpected error discarding RAM");
> -        ram_block_coordinated_discard_require(false);
> -        return;
> +    /*
> +     * We don't know at this point whether shared RAM is migrated using
> +     * QEMU or migrated using the file content. "x-ignore-shared" will be
> +     * configured after realizing the device. So in case we have an
> +     * incoming migration, simply always skip the discard step.
> +     *
> +     * Otherwise, make sure that we start with a clean slate: either the
> +     * memory backend might get reused or the shared file might still have
> +     * memory allocated.
> +     */
> +    if (!runstate_check(RUN_STATE_INMIGRATE)) {
> +        ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb));
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Unexpected error discarding RAM");
> +            ram_block_coordinated_discard_require(false);
> +            return;
> +        }
>      }

Makes sense.
>
>      virtio_mem_resize_usable_region(vmem, vmem->requested_size, true);
> @@ -977,10 +990,6 @@ static int virtio_mem_post_load(void *opaque, int version_id)
>      RamDiscardListener *rdl;
>      int ret;
>
> -    if (vmem->prealloc && !vmem->early_migration) {
> -        warn_report("Proper preallocation with migration requires a newer QEMU machine");
> -    }
> -
>      /*
>       * We started out with all memory discarded and our memory region is mapped
>       * into an address space. Replay, now that we updated the bitmap.
> @@ -993,6 +1002,18 @@ static int virtio_mem_post_load(void *opaque, int version_id)
>          }
>      }
>
> +    /*
> +     * If shared RAM is migrated using the file content and not using QEMU,
> +     * don't mess with preallocation and postcopy.
> +     */
> +    if (migrate_ram_is_ignored(vmem->memdev->mr.ram_block)) {
> +        return 0;
> +    }
> +
> +    if (vmem->prealloc && !vmem->early_migration) {
> +        warn_report("Proper preallocation with migration requires a newer QEMU machine");
> +    }
> +

Could you explain why you are putting the check after calling
virtio_mem_notify_populate_cb()?

What is it expected to do for file memory backed RAM? I got lost when I
saw that it just calls:

static int virtio_mem_notify_populate_cb(MemoryRegionSection *s, void *arg)
{
    RamDiscardListener *rdl = arg;

    return rdl->notify_populate(rdl, s);
}

I ended up in vfio, and got completely confused about what is going on there.

>      if (migration_in_incoming_postcopy()) {
>          return 0;
>      }
> @@ -1025,6 +1046,14 @@ static int virtio_mem_post_load_early(void *opaque, int version_id)
>          return 0;
>      }
>
> +    /*
> +     * If shared RAM is migrated using the file content and not using QEMU,
> +     * don't mess with preallocation and postcopy.
> +     */
> +    if (migrate_ram_is_ignored(rb)) {
> +        return 0;
> +    }
> +
>      /*
>       * We restored the bitmap and verified that the basic properties
>       * match on source and destination, so we can go ahead and preallocate

OK.

Thanks, Juan.
On 06.07.23 13:06, Juan Quintela wrote:
> David Hildenbrand <david@redhat.com> wrote:
>> To achieve desired "x-ignore-shared" functionality, we should not
>> discard all RAM when realizing the device and not mess with
>> preallocation/postcopy when loading device state. In essence, we should
>> not touch RAM content.
>>
>> As "x-ignore-shared" gets set after realizing the device, we cannot
>> rely on that. Let's simply skip discarding of RAM on incoming migration.
>> Note that virtio_mem_post_load() will call
>> virtio_mem_restore_unplugged() -- unless "x-ignore-shared" is set. So
>> once migration finished we'll have a consistent state.
>>
>> The initial system reset will also not discard any RAM, because
>> virtio_mem_unplug_all() will not call virtio_mem_unplug_all() when no
>> memory is plugged (which is the case before loading the device state).
>>
>> Note that something like VM templating -- see commit b17fbbe55cba
>> ("migration: allow private destination ram with x-ignore-shared")
>
> And here I am, I reviewed the patch, and 4 years later I don't remember
> anything about it O:-)

:)

[...]

>> +    /*
>> +     * If shared RAM is migrated using the file content and not using QEMU,
>> +     * don't mess with preallocation and postcopy.
>> +     */
>> +    if (migrate_ram_is_ignored(vmem->memdev->mr.ram_block)) {
>> +        return 0;
>> +    }
>> +
>> +    if (vmem->prealloc && !vmem->early_migration) {
>> +        warn_report("Proper preallocation with migration requires a newer QEMU machine");
>> +    }
>> +
>
> Could you explain why you are putting the check after calling
> virtio_mem_notify_populate_cb()?
>
> What is it expected to do for file memory backed RAM? I got lost when I
> saw that it just calls:
>
> static int virtio_mem_notify_populate_cb(MemoryRegionSection *s, void *arg)
> {
>     RamDiscardListener *rdl = arg;
>
>     return rdl->notify_populate(rdl, s);
> }
>
>
> I ended up in vfio, and got completely confused about what is going on there.
:)

Once we reached virtio_mem_post_load(), we restored the bitmap that
contains the state of all device blocks (plugged vs. unplugged).

Whenever we modify the bitmap (plug / unplug), we have to notify the
(RamDiscardManager) listeners, such that they are aware of the state
change and can take the appropriate action.

For example, vfio will go ahead and register the newly plugged blocks
with the kernel (DMA-map them through vfio), where the kernel will end up
long-term pinning these pages. Effectively, we only end up DMA-mapping
plugged memory blocks, so only these get pinned by the kernel (and we can
actually release the memory of unplugged blocks).

So here (virtio_mem_post_load()), we just restored the bitmap from the
migration stream and effectively went from 0 plugged blocks (bitmap
empty) before migration to "maybe some plugged blocks in the bitmap".

So we go over the bitmap and tell the world (vfio) to go ahead and
DMA-map these blocks that are suddenly plugged.

And that part is independent of the actual RAM migration /
x-ignore-shared, so we have to do it unconditionally.

Thanks for the thorough review!
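The replay described above can be pictured with a small standalone model. This is only a sketch under stated assumptions, not QEMU code: `replay_plugged`, `populate_fn`, `replay_stats` and `count_populate` are hypothetical names, a plain `bool` array stands in for the device bitmap, and the callback stands in for a listener's `notify_populate`. It walks the per-block state and reports each contiguous run of plugged blocks to the listener, regardless of how the RAM content itself is migrated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a listener's populate callback. */
typedef int (*populate_fn)(void *opaque, uint64_t offset, uint64_t size);

/*
 * Walk a per-block "plugged" map and report every contiguous run of
 * plugged blocks to the listener. Returns 0 on success, or the first
 * non-zero value returned by the callback.
 */
static int replay_plugged(const bool *plugged, size_t nblocks,
                          uint64_t block_size, populate_fn cb, void *opaque)
{
    size_t i = 0;

    assert(plugged != NULL && cb != NULL && block_size > 0);

    while (i < nblocks) {
        if (!plugged[i]) {
            i++;
            continue;
        }

        /* Found the start of a plugged run; extend it as far as possible. */
        size_t start = i;
        while (i < nblocks && plugged[i]) {
            i++;
        }

        int ret = cb(opaque, start * block_size, (i - start) * block_size);
        if (ret) {
            return ret;
        }
    }
    return 0;
}

/* Example listener: counts notifications and sums the reported bytes. */
struct replay_stats {
    unsigned calls;
    uint64_t bytes;
};

static int count_populate(void *opaque, uint64_t offset, uint64_t size)
{
    struct replay_stats *st = opaque;

    (void)offset; /* a real listener would map [offset, offset + size) */
    st->calls++;
    st->bytes += size;
    return 0;
}
```

In this model, a listener such as vfio would DMA-map the reported range inside its callback; unplugged blocks are never reported, so they are never mapped or pinned.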
David Hildenbrand <david@redhat.com> wrote:
> To achieve desired "x-ignore-shared" functionality, we should not
> discard all RAM when realizing the device and not mess with
> preallocation/postcopy when loading device state. In essence, we should
> not touch RAM content.
>
> As "x-ignore-shared" gets set after realizing the device, we cannot
> rely on that. Let's simply skip discarding of RAM on incoming migration.
> Note that virtio_mem_post_load() will call
> virtio_mem_restore_unplugged() -- unless "x-ignore-shared" is set. So
> once migration finished we'll have a consistent state.
>
> The initial system reset will also not discard any RAM, because
> virtio_mem_unplug_all() will not call virtio_mem_unplug_all() when no
> memory is plugged (which is the case before loading the device state).
>
> Note that something like VM templating -- see commit b17fbbe55cba
> ("migration: allow private destination ram with x-ignore-shared") -- is
> currently incompatible with virtio-mem and ram_block_discard_range() will
> warn in case a private file mapping is supplied by virtio-mem.
>
> For VM templating with virtio-mem, it makes more sense to either
> (a) Create the template without the virtio-mem device and hotplug a
>     virtio-mem device to the new VM instances using proper own memory
>     backend.
> (b) Use a virtio-mem device that doesn't provide any memory in the
>     template (requested-size=0) and use private anonymous memory.
>
> Tested-by: Mario Casquero <mcasquer@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

After the very nice explanation:

Reviewed-by: Juan Quintela <quintela@redhat.com>
diff --git a/hw/virtio/virtio-mem.c b/hw/virtio/virtio-mem.c
index a922c21380..3f41e00e74 100644
--- a/hw/virtio/virtio-mem.c
+++ b/hw/virtio/virtio-mem.c
@@ -18,6 +18,7 @@
 #include "sysemu/numa.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/reset.h"
+#include "sysemu/runstate.h"
 #include "hw/virtio/virtio.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-mem.h"
@@ -901,11 +902,23 @@ static void virtio_mem_device_realize(DeviceState *dev, Error **errp)
         return;
     }
 
-    ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb));
-    if (ret) {
-        error_setg_errno(errp, -ret, "Unexpected error discarding RAM");
-        ram_block_coordinated_discard_require(false);
-        return;
+    /*
+     * We don't know at this point whether shared RAM is migrated using
+     * QEMU or migrated using the file content. "x-ignore-shared" will be
+     * configured after realizing the device. So in case we have an
+     * incoming migration, simply always skip the discard step.
+     *
+     * Otherwise, make sure that we start with a clean slate: either the
+     * memory backend might get reused or the shared file might still have
+     * memory allocated.
+     */
+    if (!runstate_check(RUN_STATE_INMIGRATE)) {
+        ret = ram_block_discard_range(rb, 0, qemu_ram_get_used_length(rb));
+        if (ret) {
+            error_setg_errno(errp, -ret, "Unexpected error discarding RAM");
+            ram_block_coordinated_discard_require(false);
+            return;
+        }
     }
 
     virtio_mem_resize_usable_region(vmem, vmem->requested_size, true);
@@ -977,10 +990,6 @@ static int virtio_mem_post_load(void *opaque, int version_id)
     RamDiscardListener *rdl;
     int ret;
 
-    if (vmem->prealloc && !vmem->early_migration) {
-        warn_report("Proper preallocation with migration requires a newer QEMU machine");
-    }
-
     /*
      * We started out with all memory discarded and our memory region is mapped
      * into an address space. Replay, now that we updated the bitmap.
@@ -993,6 +1002,18 @@ static int virtio_mem_post_load(void *opaque, int version_id)
         }
     }
 
+    /*
+     * If shared RAM is migrated using the file content and not using QEMU,
+     * don't mess with preallocation and postcopy.
+     */
+    if (migrate_ram_is_ignored(vmem->memdev->mr.ram_block)) {
+        return 0;
+    }
+
+    if (vmem->prealloc && !vmem->early_migration) {
+        warn_report("Proper preallocation with migration requires a newer QEMU machine");
+    }
+
     if (migration_in_incoming_postcopy()) {
         return 0;
     }
@@ -1025,6 +1046,14 @@ static int virtio_mem_post_load_early(void *opaque, int version_id)
         return 0;
     }
 
+    /*
+     * If shared RAM is migrated using the file content and not using QEMU,
+     * don't mess with preallocation and postcopy.
+     */
+    if (migrate_ram_is_ignored(rb)) {
+        return 0;
+    }
+
     /*
      * We restored the bitmap and verified that the basic properties
      * match on source and destination, so we can go ahead and preallocate
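
The ordering of checks that the patch establishes in virtio_mem_post_load() can be summarized with a small standalone model. This is a sketch, not QEMU code: `post_load_steps`, `struct post_load_input` and the `STEP_*` flags are hypothetical names, the boolean fields stand in for migrate_ram_is_ignored(), vmem->prealloc, vmem->early_migration and migration_in_incoming_postcopy(), and placing the restore of unplugged memory after the postcopy check is an assumption based on the commit message, since that call is not visible in the hunks above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical step flags; illustrative names, not QEMU identifiers. */
enum {
    STEP_REPLAY_PLUGGED    = 1u << 0, /* notify listeners about plugged blocks */
    STEP_PREALLOC_WARNING  = 1u << 1, /* warn about legacy preallocation */
    STEP_RESTORE_UNPLUGGED = 1u << 2, /* discard RAM of unplugged blocks */
};

/* Hypothetical inputs modelling the conditions checked by the patch. */
struct post_load_input {
    bool ram_ignored;       /* models migrate_ram_is_ignored() */
    bool prealloc;          /* models vmem->prealloc */
    bool early_migration;   /* models vmem->early_migration */
    bool incoming_postcopy; /* models migration_in_incoming_postcopy() */
};

/* Return the set of steps the modelled post-load path would perform. */
static unsigned post_load_steps(const struct post_load_input *in)
{
    unsigned steps;

    assert(in != NULL);

    /* The replay is independent of how RAM content is migrated. */
    steps = STEP_REPLAY_PLUGGED;

    /* RAM migrated via the file content: do not touch RAM any further. */
    if (in->ram_ignored) {
        return steps;
    }

    if (in->prealloc && !in->early_migration) {
        steps |= STEP_PREALLOC_WARNING;
    }

    if (in->incoming_postcopy) {
        return steps;
    }

    /* Assumption: the restore of unplugged memory happens last. */
    return steps | STEP_RESTORE_UNPLUGGED;
}
```

With `ram_ignored` set, the model performs only the replay, which matches the point made in the thread: listeners are notified unconditionally, while preallocation and postcopy handling are skipped.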