Message ID | 1733145611-62315-5-git-send-email-steven.sistare@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Series | Live update: cpr-transfer |
Steve Sistare <steven.sistare@oracle.com> writes:

> Allocate auxiliary guest RAM as an anonymous file that is shareable
> with an external process. This option applies to memory allocated as
> a side effect of creating various devices. It does not apply to
> memory-backend-objects, whether explicitly specified on the command
> line, or implicitly created by the -m command line option.
>
> This option is intended to support new migration modes, in which the
> memory region can be transferred in place to a new QEMU process, by sending
> the memfd file descriptor to the process. Memory contents are preserved,
> and if the mode also transfers device descriptors, then pages that are
> locked in memory for DMA remain locked. This behavior is a pre-requisite
> for supporting vfio, vdpa, and iommufd devices with the new modes.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

[...]

> diff --git a/qemu-options.hx b/qemu-options.hx
> index dacc979..02b9118 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -38,6 +38,9 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>      " nvdimm=on|off controls NVDIMM support (default=off)\n"
>      " memory-encryption=@var{} memory encryption object to use (default=none)\n"
>      " hmat=on|off controls ACPI HMAT support (default=off)\n"
> +#ifdef CONFIG_POSIX
> +    " aux-ram-share=on|off allocate auxiliary guest RAM as shared (default: off)\n"
> +#endif
>      " memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n"
>      " cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n",
>      QEMU_ARCH_ALL)
> @@ -101,6 +104,18 @@ SRST
>      Enables or disables ACPI Heterogeneous Memory Attribute Table
>      (HMAT) support. The default is off.
>
> +#ifdef CONFIG_POSIX
> +    ``aux-ram-share=on|off``
> +        Allocate auxiliary guest RAM as an anonymous file that is
> +        shareable with an external process. This option applies to
> +        memory allocated as a side effect of creating various devices.
> +        It does not apply to memory-backend-objects, whether explicitly
> +        specified on the command line, or implicitly created by the -m
> +        command line option.
> +
> +        Some migration modes require aux-ram-share=on.

This leaves the one thing users really need to know unsaid: when exactly
should users enable it.

"Some migration modes require aux-ram-share=on": do they enable it by
default, or is that left to the user? If the latter, why?

Please document the default, whatever it is.

> +#endif
> +
>      ``memory-backend='id'``
>          An alternative to legacy ``-mem-path`` and ``mem-prealloc`` options.
>          Allows to use a memory backend as main RAM.

[...]
Steve Sistare <steven.sistare@oracle.com> writes:

> Allocate auxiliary guest RAM as an anonymous file that is shareable
> with an external process. This option applies to memory allocated as
> a side effect of creating various devices. It does not apply to
> memory-backend-objects, whether explicitly specified on the command
> line, or implicitly created by the -m command line option.
>
> This option is intended to support new migration modes, in which the
> memory region can be transferred in place to a new QEMU process, by sending
> the memfd file descriptor to the process. Memory contents are preserved,
> and if the mode also transfers device descriptors, then pages that are
> locked in memory for DMA remain locked. This behavior is a pre-requisite
> for supporting vfio, vdpa, and iommufd devices with the new modes.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>

[...]

> diff --git a/qemu-options.hx b/qemu-options.hx
> index dacc979..02b9118 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -38,6 +38,9 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>      " nvdimm=on|off controls NVDIMM support (default=off)\n"
>      " memory-encryption=@var{} memory encryption object to use (default=none)\n"
>      " hmat=on|off controls ACPI HMAT support (default=off)\n"
> +#ifdef CONFIG_POSIX
> +    " aux-ram-share=on|off allocate auxiliary guest RAM as shared (default: off)\n"
> +#endif
>      " memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n"
>      " cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n",
>      QEMU_ARCH_ALL)
> @@ -101,6 +104,18 @@ SRST
>      Enables or disables ACPI Heterogeneous Memory Attribute Table
>      (HMAT) support. The default is off.
>
> +#ifdef CONFIG_POSIX
> +    ``aux-ram-share=on|off``
> +        Allocate auxiliary guest RAM as an anonymous file that is
> +        shareable with an external process. This option applies to
> +        memory allocated as a side effect of creating various devices.
> +        It does not apply to memory-backend-objects, whether explicitly
> +        specified on the command line, or implicitly created by the -m
> +        command line option.
> +
> +        Some migration modes require aux-ram-share=on.
> +#endif
> +

I get

    Warning, treated as error:
    .../qemu-options.hx:117:Definition list ends without a blank line; unexpected unindent.

Putting the blank line before #endif works for me.

>      ``memory-backend='id'``
>          An alternative to legacy ``-mem-path`` and ``mem-prealloc`` options.
>          Allows to use a memory backend as main RAM.

[...]
Markus Armbruster <armbru@redhat.com> writes:

> Steve Sistare <steven.sistare@oracle.com> writes:
>
>> Allocate auxiliary guest RAM as an anonymous file that is shareable
>> with an external process. This option applies to memory allocated as
>> a side effect of creating various devices. It does not apply to
>> memory-backend-objects, whether explicitly specified on the command
>> line, or implicitly created by the -m command line option.
>>
>> This option is intended to support new migration modes, in which the
>> memory region can be transferred in place to a new QEMU process, by sending
>> the memfd file descriptor to the process. Memory contents are preserved,
>> and if the mode also transfers device descriptors, then pages that are
>> locked in memory for DMA remain locked. This behavior is a pre-requisite
>> for supporting vfio, vdpa, and iommufd devices with the new modes.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>
> [...]
>
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index dacc979..02b9118 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -38,6 +38,9 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>>      " nvdimm=on|off controls NVDIMM support (default=off)\n"
>>      " memory-encryption=@var{} memory encryption object to use (default=none)\n"
>>      " hmat=on|off controls ACPI HMAT support (default=off)\n"
>> +#ifdef CONFIG_POSIX
>> +    " aux-ram-share=on|off allocate auxiliary guest RAM as shared (default: off)\n"
>> +#endif
>>      " memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n"
>>      " cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n",
>>      QEMU_ARCH_ALL)
>> @@ -101,6 +104,18 @@ SRST
>>      Enables or disables ACPI Heterogeneous Memory Attribute Table
>>      (HMAT) support. The default is off.
>>
>> +#ifdef CONFIG_POSIX
>> +    ``aux-ram-share=on|off``
>> +        Allocate auxiliary guest RAM as an anonymous file that is
>> +        shareable with an external process. This option applies to
>> +        memory allocated as a side effect of creating various devices.
>> +        It does not apply to memory-backend-objects, whether explicitly
>> +        specified on the command line, or implicitly created by the -m
>> +        command line option.
>> +
>> +        Some migration modes require aux-ram-share=on.
>> +#endif
>> +
>
> I get
>
>     Warning, treated as error:
>     .../qemu-options.hx:117:Definition list ends without a blank line; unexpected unindent.
>
> Putting the blank line before #endif works for me.

Actually, #ifdef does not work within SRST ... ERST.

Elsewhere, we document build-time optional features unconditionally.
Simply drop the #ifdef here.

>>      ``memory-backend='id'``
>>          An alternative to legacy ``-mem-path`` and ``mem-prealloc`` options.
>>          Allows to use a memory backend as main RAM.
>
> [...]
On 12/5/2024 7:19 AM, Markus Armbruster wrote:
> Markus Armbruster <armbru@redhat.com> writes:
>
>> Steve Sistare <steven.sistare@oracle.com> writes:
>>
>>> Allocate auxiliary guest RAM as an anonymous file that is shareable
>>> with an external process. This option applies to memory allocated as
>>> a side effect of creating various devices. It does not apply to
>>> memory-backend-objects, whether explicitly specified on the command
>>> line, or implicitly created by the -m command line option.
>>>
>>> This option is intended to support new migration modes, in which the
>>> memory region can be transferred in place to a new QEMU process, by sending
>>> the memfd file descriptor to the process. Memory contents are preserved,
>>> and if the mode also transfers device descriptors, then pages that are
>>> locked in memory for DMA remain locked. This behavior is a pre-requisite
>>> for supporting vfio, vdpa, and iommufd devices with the new modes.
>>>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>
>> [...]
>>
>>> diff --git a/qemu-options.hx b/qemu-options.hx
>>> index dacc979..02b9118 100644
>>> --- a/qemu-options.hx
>>> +++ b/qemu-options.hx
>>> @@ -38,6 +38,9 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>>>      " nvdimm=on|off controls NVDIMM support (default=off)\n"
>>>      " memory-encryption=@var{} memory encryption object to use (default=none)\n"
>>>      " hmat=on|off controls ACPI HMAT support (default=off)\n"
>>> +#ifdef CONFIG_POSIX
>>> +    " aux-ram-share=on|off allocate auxiliary guest RAM as shared (default: off)\n"
>>> +#endif
>>>      " memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n"
>>>      " cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n",
>>>      QEMU_ARCH_ALL)
>>> @@ -101,6 +104,18 @@ SRST
>>>      Enables or disables ACPI Heterogeneous Memory Attribute Table
>>>      (HMAT) support. The default is off.
>>>
>>> +#ifdef CONFIG_POSIX
>>> +    ``aux-ram-share=on|off``
>>> +        Allocate auxiliary guest RAM as an anonymous file that is
>>> +        shareable with an external process. This option applies to
>>> +        memory allocated as a side effect of creating various devices.
>>> +        It does not apply to memory-backend-objects, whether explicitly
>>> +        specified on the command line, or implicitly created by the -m
>>> +        command line option.
>>> +
>>> +        Some migration modes require aux-ram-share=on.
>>> +#endif
>>> +
>>
>> I get
>>
>>     Warning, treated as error:
>>     .../qemu-options.hx:117:Definition list ends without a blank line; unexpected unindent.
>>
>> Putting the blank line before #endif works for me.
>
> Actually, #ifdef does not work within SRST ... ERST.
>
> Elsewhere, we document build-time optional features unconditionally.
> Simply drop the #ifdef here.

Thanks Markus. I see the "#ifdef" literal emitted in the man page.
I'll delete it.

- Steve
On 12/5/2024 3:25 AM, Markus Armbruster wrote:
> Steve Sistare <steven.sistare@oracle.com> writes:
>
>> Allocate auxiliary guest RAM as an anonymous file that is shareable
>> with an external process. This option applies to memory allocated as
>> a side effect of creating various devices. It does not apply to
>> memory-backend-objects, whether explicitly specified on the command
>> line, or implicitly created by the -m command line option.
>>
>> This option is intended to support new migration modes, in which the
>> memory region can be transferred in place to a new QEMU process, by sending
>> the memfd file descriptor to the process. Memory contents are preserved,
>> and if the mode also transfers device descriptors, then pages that are
>> locked in memory for DMA remain locked. This behavior is a pre-requisite
>> for supporting vfio, vdpa, and iommufd devices with the new modes.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>
> [...]
>
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index dacc979..02b9118 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -38,6 +38,9 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
>>      " nvdimm=on|off controls NVDIMM support (default=off)\n"
>>      " memory-encryption=@var{} memory encryption object to use (default=none)\n"
>>      " hmat=on|off controls ACPI HMAT support (default=off)\n"
>> +#ifdef CONFIG_POSIX
>> +    " aux-ram-share=on|off allocate auxiliary guest RAM as shared (default: off)\n"
>> +#endif
>>      " memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n"
>>      " cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n",
>>      QEMU_ARCH_ALL)
>> @@ -101,6 +104,18 @@ SRST
>>      Enables or disables ACPI Heterogeneous Memory Attribute Table
>>      (HMAT) support. The default is off.
>>
>> +#ifdef CONFIG_POSIX
>> +    ``aux-ram-share=on|off``
>> +        Allocate auxiliary guest RAM as an anonymous file that is
>> +        shareable with an external process. This option applies to
>> +        memory allocated as a side effect of creating various devices.
>> +        It does not apply to memory-backend-objects, whether explicitly
>> +        specified on the command line, or implicitly created by the -m
>> +        command line option.
>> +
>> +        Some migration modes require aux-ram-share=on.
>
> This leaves the one thing users really need to know unsaid: when exactly
> should users enable it.
>
> "Some migration modes require aux-ram-share=on": do they enable it by
> default, or is that left to the user? If the latter, why?
>
> Please document the default, whatever it is.

How about:

    ``aux-ram-share=on|off``
        ...
        command line option. The default is off.

        To use the cpr-transfer migration mode, you must set aux-ram-share=on.

cpr-transfer is a forward reference at this point in the series, so I will
move that last line to the "cpr-transfer mode" patch.

- Steve
On Mon, Dec 02, 2024 at 05:19:56AM -0800, Steve Sistare wrote:
> diff --git a/system/physmem.c b/system/physmem.c
> index 36f0811..0bcb2cc 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2164,6 +2164,9 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
>      new_block->flags = ram_flags;
>
>      if (!host && !xen_enabled()) {
> +        if (!share_flags && current_machine->aux_ram_share) {
> +            new_block->flags |= RAM_SHARED;
> +        }

Just to mention that if you agree with what I said in patch 2, here it will
need some trivial rebase change. IOW, IMO we shouldn't special case xen
either here, so it should also apply to xen if one chose to, changing aux
alloc to RAM_SHARED.

Frankly I don't know whether xen respects RAM_SHARED at all for anonymous,
but it's a separate question to ask..

Basically what will happen later is in cpr-transfer migrate cmd, it'll fail
for xen properly seeing fd==-1. That'll be fine, IMHO.

>          if ((new_block->flags & RAM_SHARED) &&
>              !qemu_ram_alloc_shared(new_block, &local_err)) {
>              goto err;
On 12/9/2024 2:54 PM, Peter Xu wrote:
> On Mon, Dec 02, 2024 at 05:19:56AM -0800, Steve Sistare wrote:
>> diff --git a/system/physmem.c b/system/physmem.c
>> index 36f0811..0bcb2cc 100644
>> --- a/system/physmem.c
>> +++ b/system/physmem.c
>> @@ -2164,6 +2164,9 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
>>      new_block->flags = ram_flags;
>>
>>      if (!host && !xen_enabled()) {
>> +        if (!share_flags && current_machine->aux_ram_share) {
>> +            new_block->flags |= RAM_SHARED;
>> +        }
>
> Just to mention that if you agree with what I said in patch 2, here it will
> need some trivial rebase change. IOW, IMO we shouldn't special case xen
> either here, so it should also apply to xen if one chose to, changing aux
> alloc to RAM_SHARED.

OK.

So, if this only requires a trivial change, do I get your RB?

- Steve

>
> Frankly I don't know whether xen respects RAM_SHARED at all for anonymous,
> but it's a separate question to ask..
>
> Basically what will happen later is in cpr-transfer migrate cmd, it'll fail
> for xen properly seeing fd==-1. That'll be fine, IMHO.
>
>>          if ((new_block->flags & RAM_SHARED) &&
>>              !qemu_ram_alloc_shared(new_block, &local_err)) {
>>              goto err;
On Thu, Dec 12, 2024 at 03:38:07PM -0500, Steven Sistare wrote:
> On 12/9/2024 2:54 PM, Peter Xu wrote:
> > On Mon, Dec 02, 2024 at 05:19:56AM -0800, Steve Sistare wrote:
> > > diff --git a/system/physmem.c b/system/physmem.c
> > > index 36f0811..0bcb2cc 100644
> > > --- a/system/physmem.c
> > > +++ b/system/physmem.c
> > > @@ -2164,6 +2164,9 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
> > >      new_block->flags = ram_flags;
> > >      if (!host && !xen_enabled()) {
> > > +        if (!share_flags && current_machine->aux_ram_share) {
> > > +            new_block->flags |= RAM_SHARED;
> > > +        }
> >
> > Just to mention that if you agree with what I said in patch 2, here it will
> > need some trivial rebase change. IOW, IMO we shouldn't special case xen
> > either here, so it should also apply to xen if one chose to, changing aux
> > alloc to RAM_SHARED.
>
> OK.
>
> So, if this only requires a trivial change, do I get your RB?

Yes please.
diff --git a/hw/core/machine.c b/hw/core/machine.c
index a35c4a8..b299b40 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -437,6 +437,20 @@ static void machine_set_mem_merge(Object *obj, bool value, Error **errp)
     ms->mem_merge = value;
 }
 
+static bool machine_get_aux_ram_share(Object *obj, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    return ms->aux_ram_share;
+}
+
+static void machine_set_aux_ram_share(Object *obj, bool value, Error **errp)
+{
+    MachineState *ms = MACHINE(obj);
+
+    ms->aux_ram_share = value;
+}
+
 static bool machine_get_usb(Object *obj, Error **errp)
 {
     MachineState *ms = MACHINE(obj);
@@ -1129,6 +1143,10 @@ static void machine_class_init(ObjectClass *oc, void *data)
     object_class_property_set_description(oc, "mem-merge",
         "Enable/disable memory merge support");
 
+    object_class_property_add_bool(oc, "aux-ram-share",
+                                   machine_get_aux_ram_share,
+                                   machine_set_aux_ram_share);
+
     object_class_property_add_bool(oc, "usb",
         machine_get_usb, machine_set_usb);
     object_class_property_set_description(oc, "usb",
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 36fbb9b..922ecd4 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -410,6 +410,7 @@ struct MachineState {
     bool enable_graphics;
     ConfidentialGuestSupport *cgs;
     HostMemoryBackend *memdev;
+    bool aux_ram_share;
     /*
      * convenience alias to ram_memdev_id backend memory region
      * or to numa container memory region
diff --git a/qemu-options.hx b/qemu-options.hx
index dacc979..02b9118 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -38,6 +38,9 @@ DEF("machine", HAS_ARG, QEMU_OPTION_machine, \
     " nvdimm=on|off controls NVDIMM support (default=off)\n"
     " memory-encryption=@var{} memory encryption object to use (default=none)\n"
     " hmat=on|off controls ACPI HMAT support (default=off)\n"
+#ifdef CONFIG_POSIX
+    " aux-ram-share=on|off allocate auxiliary guest RAM as shared (default: off)\n"
+#endif
     " memory-backend='backend-id' specifies explicitly provided backend for main RAM (default=none)\n"
     " cxl-fmw.0.targets.0=firsttarget,cxl-fmw.0.targets.1=secondtarget,cxl-fmw.0.size=size[,cxl-fmw.0.interleave-granularity=granularity]\n",
     QEMU_ARCH_ALL)
@@ -101,6 +104,18 @@ SRST
     Enables or disables ACPI Heterogeneous Memory Attribute Table
     (HMAT) support. The default is off.
 
+#ifdef CONFIG_POSIX
+    ``aux-ram-share=on|off``
+        Allocate auxiliary guest RAM as an anonymous file that is
+        shareable with an external process. This option applies to
+        memory allocated as a side effect of creating various devices.
+        It does not apply to memory-backend-objects, whether explicitly
+        specified on the command line, or implicitly created by the -m
+        command line option.
+
+        Some migration modes require aux-ram-share=on.
+#endif
+
     ``memory-backend='id'``
         An alternative to legacy ``-mem-path`` and ``mem-prealloc`` options.
         Allows to use a memory backend as main RAM.
diff --git a/system/physmem.c b/system/physmem.c
index 36f0811..0bcb2cc 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -2164,6 +2164,9 @@ RAMBlock *qemu_ram_alloc_internal(ram_addr_t size, ram_addr_t max_size,
     new_block->flags = ram_flags;
 
     if (!host && !xen_enabled()) {
+        if (!share_flags && current_machine->aux_ram_share) {
+            new_block->flags |= RAM_SHARED;
+        }
         if ((new_block->flags & RAM_SHARED) &&
             !qemu_ram_alloc_shared(new_block, &local_err)) {
             goto err;
Allocate auxiliary guest RAM as an anonymous file that is shareable
with an external process. This option applies to memory allocated as
a side effect of creating various devices. It does not apply to
memory-backend-objects, whether explicitly specified on the command
line, or implicitly created by the -m command line option.

This option is intended to support new migration modes, in which the
memory region can be transferred in place to a new QEMU process, by sending
the memfd file descriptor to the process. Memory contents are preserved,
and if the mode also transfers device descriptors, then pages that are
locked in memory for DMA remain locked. This behavior is a pre-requisite
for supporting vfio, vdpa, and iommufd devices with the new modes.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 hw/core/machine.c   | 18 ++++++++++++++++++
 include/hw/boards.h |  1 +
 qemu-options.hx     | 15 +++++++++++++++
 system/physmem.c    |  3 +++
 4 files changed, 37 insertions(+)
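
The mechanism the commit message describes, shared anonymous memory handed
to another process by passing a memfd descriptor, can be sketched in plain C.
This is only an illustration under stated assumptions and is not QEMU code:
send_fd(), recv_fd() and memfd_roundtrip() are hypothetical helper names, both
socket ends live in one process for brevity (a real receiver would be the new
QEMU process), and the sketch assumes Linux with a glibc that provides
memfd_create() (2.27 or later).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Hypothetical helper: pass one descriptor over a Unix socket (SCM_RIGHTS). */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor, or return -1 on failure. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg = { 0 };
    struct cmsghdr *cmsg;
    int fd = -1;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/*
 * Write through one shared mapping, pass the memfd over a socket, map the
 * received descriptor, and check that the same bytes are visible.
 * Returns 0 on success, -1 otherwise. Error paths leak for brevity.
 */
int memfd_roundtrip(void)
{
    const size_t size = 4096;
    const char text[] = "guest ram";
    int sv[2], fd, fd2, same;
    char *a, *b;

    assert(sizeof(text) <= size);

    fd = memfd_create("aux-ram-demo", MFD_CLOEXEC);
    if (fd < 0 || ftruncate(fd, size) < 0) {
        return -1;
    }
    a = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED) {
        return -1;
    }
    memcpy(a, text, sizeof(text));

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0 ||
        send_fd(sv[0], fd) < 0) {
        return -1;
    }
    fd2 = recv_fd(sv[1]);
    if (fd2 < 0) {
        return -1;
    }
    b = mmap(NULL, size, PROT_READ, MAP_SHARED, fd2, 0);
    if (b == MAP_FAILED) {
        return -1;
    }
    same = memcmp(b, text, sizeof(text)) == 0;

    munmap(a, size);
    munmap(b, size);
    close(fd);
    close(fd2);
    close(sv[0]);
    close(sv[1]);
    return same ? 0 : -1;
}
```

A zero return from memfd_roundtrip() means the mapping made from the received
descriptor sees the bytes written through the first mapping, i.e. both refer
to the same pages. Memory that is private and anonymous has no descriptor to
pass, which is presumably why the option forces RAM_SHARED for auxiliary RAM.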