Message ID | 1719776648-435073-1-git-send-email-steven.sistare@oracle.com (mailing list archive) |
---|---|
Series | Live update: cpr-transfer |
On Sun, Jun 30, 2024 at 12:44:02PM -0700, Steve Sistare wrote:
> What?
>
> This patch series adds the live migration cpr-transfer mode, which
> allows the user to transfer a guest to a new QEMU instance on the same
> host. It is identical to cpr-exec in most respects, except as described
> below.

I definitely prefer this one more than the exec solution, thanks for trying
this out. It's a matter of whether we'll need both, my answer would be
no..

>
> The new user-visible interfaces are:
>   * cpr-transfer (MigMode migration parameter)
>   * cpr-uri (migration parameter)

I wonder whether this parameter can be avoided already, maybe we can let
cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
in the same channel?

>   * cpr-uri (command-line argument)
>
> In this mode, the user starts new QEMU on the same host as old QEMU, with
> the same arguments as old QEMU, plus the -incoming and the -cpr-uri options.
> The user issues the migrate command to old QEMU, which stops the VM, saves
> state to the migration channels, and enters the postmigrate state. Execution
> resumes in new QEMU.
>
> This mode requires a second migration channel, specified by the cpr-uri
> migration property on the outgoing side, and by the cpr-uri QEMU command-line
> option on the incoming side. The channel must be a type, such as unix socket,
> that supports SCM_RIGHTS.
>
> This series depends on the series "Live update: cpr-exec mode".
>
> Why?
>
> cpr-transfer offers the same benefits as cpr-exec mode, but with a model
> for launching new QEMU that may be more natural for some management packages.
>
> How?
>
> The file descriptors are kept open by sending them to new QEMU via the
> cpr-uri, which must support SCM_RIGHTS.
>
> Example:
>
> In this example, we simply restart the same version of QEMU, but in
> a real scenario one would use a new QEMU binary path in terminal 2.
>
> Terminal 1: start old QEMU
> # qemu-kvm -monitor stdio -object
> memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on
> -m 4G -machine anon-alloc=memfd ...
>
> Terminal 2: start new QEMU
> # qemu-kvm ... -incoming unix:vm.sock -cpr-uri unix:cpr.sock
>
> Terminal 1:
> QEMU 9.1.50 monitor - type 'help' for more information
> (qemu) info status
> VM status: running
> (qemu) migrate_set_parameter mode cpr-transfer
> (qemu) migrate_set_parameter cpr-uri unix:cpr.sock
> (qemu) migrate -d unix:vm.sock
> (qemu) info status
> VM status: paused (postmigrate)
>
> Terminal 2:
> QEMU 9.1.50 monitor - type 'help' for more information
> (qemu) info status
> VM status: running
>
> Steve Sistare (6):
>   migration: SCM_RIGHTS for QEMUFile
>   migration: VMSTATE_FD
>   migration: cpr-transfer save and load
>   migration: cpr-uri parameter
>   migration: cpr-uri option
>   migration: cpr-transfer mode
>
>  include/migration/cpr.h        |  4 ++
>  include/migration/vmstate.h    |  9 +++++
>  migration/cpr-transfer.c       | 81 +++++++++++++++++++++++++++++++++++++++++
>  migration/cpr.c                | 16 +++++++-
>  migration/meson.build          |  1 +
>  migration/migration-hmp-cmds.c | 10 +++++
>  migration/migration.c          | 37 +++++++++++++++++++
>  migration/options.c            | 29 +++++++++++++++
>  migration/options.h            |  1 +
>  migration/qemu-file.c          | 83 ++++++++++++++++++++++++++++++++++++++++--
>  migration/qemu-file.h          |  2 +
>  migration/ram.c                |  1 +
>  migration/trace-events         |  2 +
>  migration/vmstate-types.c      | 33 +++++++++++++++++
>  qapi/migration.json            | 38 ++++++++++++++++++-
>  qemu-options.hx                |  8 ++++
>  stubs/vmstate.c                |  7 ++++
>  system/vl.c                    |  3 ++
>  18 files changed, 359 insertions(+), 6 deletions(-)
>  create mode 100644 migration/cpr-transfer.c
>
> --
> 1.8.3.1
>
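The cover letter requires that the cpr-uri channel support SCM_RIGHTS. The following is a minimal sketch of that mechanism only, not of the series' code: it assumes Python 3.9+ on a Unix host, uses a socketpair as a stand-in for the cpr-uri channel and a pipe as a stand-in for a descriptor that must stay open, and the function name `pass_fd_demo` is hypothetical.

```python
import os
import socket


def pass_fd_demo() -> bytes:
    """Send one open descriptor across an AF_UNIX socket and use the copy."""
    # Stand-in for the cpr-uri channel between old QEMU and new QEMU.
    old_side, new_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Stand-in for any descriptor that must survive the update.
    read_end, write_end = os.pipe()
    try:
        # SCM_RIGHTS ancillary data travels with at least one byte of payload.
        socket.send_fds(old_side, [b"fd"], [read_end])
        _payload, fds, _flags, _addr = socket.recv_fds(new_side, 16, 1)
        received = fds[0]
        # The receiver gets a new descriptor number that refers to the
        # same open pipe, so data written now is readable through it.
        os.write(write_end, b"still open")
        data = os.read(received, 64)
        os.close(received)
        return data
    finally:
        os.close(read_end)
        os.close(write_end)
        old_side.close()
        new_side.close()


if __name__ == "__main__":
    print(pass_fd_demo())
```

In a real handoff the sender and receiver are separate processes; the sketch keeps both ends in one process only to stay self-contained.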
On 7/18/2024 11:36 AM, Peter Xu wrote:
> On Sun, Jun 30, 2024 at 12:44:02PM -0700, Steve Sistare wrote:
>> What?
>>
>> This patch series adds the live migration cpr-transfer mode, which
>> allows the user to transfer a guest to a new QEMU instance on the same
>> host. It is identical to cpr-exec in most respects, except as described
>> below.
>
> I definitely prefer this one more than the exec solution, thanks for trying
> this out. It's a matter of whether we'll need both, my answer would be
> no..
>
>>
>> The new user-visible interfaces are:
>>   * cpr-transfer (MigMode migration parameter)
>>   * cpr-uri (migration parameter)
>
> I wonder whether this parameter can be avoided already, maybe we can let
> cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
> in the same channel?

You saw the answer in another thread, but I repeat it here for others benefit:

  "CPR state cannot be sent over the normal migration channel, because devices
  and backends are created prior to reading the channel, so this mode sends
  CPR state over a second migration channel that is not visible to the user.
  New QEMU reads the second channel prior to creating devices or backends."

- Steve

>>   * cpr-uri (command-line argument)
>>
>> In this mode, the user starts new QEMU on the same host as old QEMU, with
>> the same arguments as old QEMU, plus the -incoming and the -cpr-uri options.
>> The user issues the migrate command to old QEMU, which stops the VM, saves
>> state to the migration channels, and enters the postmigrate state. Execution
>> resumes in new QEMU.
>>
>> This mode requires a second migration channel, specified by the cpr-uri
>> migration property on the outgoing side, and by the cpr-uri QEMU command-line
>> option on the incoming side. The channel must be a type, such as unix socket,
>> that supports SCM_RIGHTS.
>>
>> This series depends on the series "Live update: cpr-exec mode".
>>
>> Why?
>>
>> cpr-transfer offers the same benefits as cpr-exec mode, but with a model
>> for launching new QEMU that may be more natural for some management packages.
>>
>> How?
>>
>> The file descriptors are kept open by sending them to new QEMU via the
>> cpr-uri, which must support SCM_RIGHTS.
>>
>> Example:
>>
>> In this example, we simply restart the same version of QEMU, but in
>> a real scenario one would use a new QEMU binary path in terminal 2.
>>
>> Terminal 1: start old QEMU
>> # qemu-kvm -monitor stdio -object
>> memory-backend-file,id=ram0,size=4G,mem-path=/dev/shm/ram0,share=on
>> -m 4G -machine anon-alloc=memfd ...
>>
>> Terminal 2: start new QEMU
>> # qemu-kvm ... -incoming unix:vm.sock -cpr-uri unix:cpr.sock
>>
>> Terminal 1:
>> QEMU 9.1.50 monitor - type 'help' for more information
>> (qemu) info status
>> VM status: running
>> (qemu) migrate_set_parameter mode cpr-transfer
>> (qemu) migrate_set_parameter cpr-uri unix:cpr.sock
>> (qemu) migrate -d unix:vm.sock
>> (qemu) info status
>> VM status: paused (postmigrate)
>>
>> Terminal 2:
>> QEMU 9.1.50 monitor - type 'help' for more information
>> (qemu) info status
>> VM status: running
>>
>> Steve Sistare (6):
>>   migration: SCM_RIGHTS for QEMUFile
>>   migration: VMSTATE_FD
>>   migration: cpr-transfer save and load
>>   migration: cpr-uri parameter
>>   migration: cpr-uri option
>>   migration: cpr-transfer mode
>>
>>  include/migration/cpr.h        |  4 ++
>>  include/migration/vmstate.h    |  9 +++++
>>  migration/cpr-transfer.c       | 81 +++++++++++++++++++++++++++++++++++++++++
>>  migration/cpr.c                | 16 +++++++-
>>  migration/meson.build          |  1 +
>>  migration/migration-hmp-cmds.c | 10 +++++
>>  migration/migration.c          | 37 +++++++++++++++++++
>>  migration/options.c            | 29 +++++++++++++++
>>  migration/options.h            |  1 +
>>  migration/qemu-file.c          | 83 ++++++++++++++++++++++++++++++++++++++++--
>>  migration/qemu-file.h          |  2 +
>>  migration/ram.c                |  1 +
>>  migration/trace-events         |  2 +
>>  migration/vmstate-types.c      | 33 +++++++++++++++++
>>  qapi/migration.json            | 38 ++++++++++++++++++-
>>  qemu-options.hx                |  8 ++++
>>  stubs/vmstate.c                |  7 ++++
>>  system/vl.c                    |  3 ++
>>  18 files changed, 359 insertions(+), 6 deletions(-)
>>  create mode 100644 migration/cpr-transfer.c
>>
>> --
>> 1.8.3.1
>>
>
On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
> > > The new user-visible interfaces are:
> > >   * cpr-transfer (MigMode migration parameter)
> > >   * cpr-uri (migration parameter)
> >
> > I wonder whether this parameter can be avoided already, maybe we can let
> > cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
> > in the same channel?
>
> You saw the answer in another thread, but I repeat it here for others benefit:
>
>   "CPR state cannot be sent over the normal migration channel, because devices
>   and backends are created prior to reading the channel, so this mode sends
>   CPR state over a second migration channel that is not visible to the user.
>   New QEMU reads the second channel prior to creating devices or backends."

Today when looking again, I wonder about the other way round: can we make
the new parameter called "-incoming-cpr", working exactly the same as
"cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
be reused for migration incoming ports?

After all, cpr needs to happen already with unix sockets. Having separate
cmdline options grants user to make the other one to be non-unix, but that
doesn't seem to buy us anything.. then it seems easier to always reuse it,
and restrict cpr-transfer to only work with unix sockets for incoming too?
On Thu, Aug 15, 2024 at 04:28:59PM -0400, Peter Xu wrote:
> On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
> > > > The new user-visible interfaces are:
> > > >   * cpr-transfer (MigMode migration parameter)
> > > >   * cpr-uri (migration parameter)
> > >
> > > I wonder whether this parameter can be avoided already, maybe we can let
> > > cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
> > > in the same channel?
> >
> > You saw the answer in another thread, but I repeat it here for others benefit:
> >
> >   "CPR state cannot be sent over the normal migration channel, because devices
> >   and backends are created prior to reading the channel, so this mode sends
> >   CPR state over a second migration channel that is not visible to the user.
> >   New QEMU reads the second channel prior to creating devices or backends."
>
> Today when looking again, I wonder about the other way round: can we make
> the new parameter called "-incoming-cpr", working exactly the same as
> "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
> be reused for migration incoming ports?
>
> After all, cpr needs to happen already with unix sockets. Having separate
> cmdline options grants user to make the other one to be non-unix, but that
> doesn't seem to buy us anything.. then it seems easier to always reuse it,
> and restrict cpr-transfer to only work with unix sockets for incoming too?

IMHO we should not be adding any new command line parameter at all,
and in fact we should actually deprecate the existing "-incoming",
except when used with "defer".

An application managing migration should be doing all the configuration
via QMP

With regards,
Daniel
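For readers who think in QMP rather than HMP, the outgoing-side monitor commands from the cover letter's example might translate roughly as sketched below. This is an illustration under assumptions: `migrate-set-parameters` and `migrate` are existing QMP commands, but the `cpr-transfer` mode and the `cpr-uri` parameter are what this series proposes and may change before merge. The helper name `outgoing_qmp_commands` is hypothetical.

```python
import json


def outgoing_qmp_commands(cpr_uri: str, migrate_uri: str) -> list[str]:
    """Build the QMP messages old QEMU would receive, one JSON text each."""
    commands = [
        {
            "execute": "migrate-set-parameters",
            # "cpr-transfer" and "cpr-uri" are proposed by this series.
            "arguments": {"mode": "cpr-transfer", "cpr-uri": cpr_uri},
        },
        {"execute": "migrate", "arguments": {"uri": migrate_uri}},
    ]
    return [json.dumps(command) for command in commands]


if __name__ == "__main__":
    for line in outgoing_qmp_commands("unix:cpr.sock", "unix:vm.sock"):
        print(line)
```

The sketch covers only the outgoing side, which is already configurable from the monitor. The open question in the thread is the incoming side, where the channel must be known before the monitor is usable.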
On 8/15/2024 4:28 PM, Peter Xu wrote:
> On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
>>>> The new user-visible interfaces are:
>>>>   * cpr-transfer (MigMode migration parameter)
>>>>   * cpr-uri (migration parameter)
>>>
>>> I wonder whether this parameter can be avoided already, maybe we can let
>>> cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
>>> in the same channel?
>>
>> You saw the answer in another thread, but I repeat it here for others benefit:
>>
>>   "CPR state cannot be sent over the normal migration channel, because devices
>>   and backends are created prior to reading the channel, so this mode sends
>>   CPR state over a second migration channel that is not visible to the user.
>>   New QEMU reads the second channel prior to creating devices or backends."
>
> Today when looking again, I wonder about the other way round: can we make
> the new parameter called "-incoming-cpr", working exactly the same as
> "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
> be reused for migration incoming ports?
>
> After all, cpr needs to happen already with unix sockets. Having separate
> cmdline options grants user to make the other one to be non-unix, but that
> doesn't seem to buy us anything.. then it seems easier to always reuse it,
> and restrict cpr-transfer to only work with unix sockets for incoming too?

This idea also occurred to me, but I dislike the loss of flexibility for
the incoming socket type. The exec URI in particular can do anything, and
we would be eliminating it.

I also think -incoming-cpr has equal or greater "specification complexity" than
-cpr-uri. They both add a new option, and the former also modifies the behavior
of an existing option (disallows it and subsumes it).

- Steve
On 8/16/2024 4:42 AM, Daniel P. Berrangé wrote:
> On Thu, Aug 15, 2024 at 04:28:59PM -0400, Peter Xu wrote:
>> On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
>>>>> The new user-visible interfaces are:
>>>>>   * cpr-transfer (MigMode migration parameter)
>>>>>   * cpr-uri (migration parameter)
>>>>
>>>> I wonder whether this parameter can be avoided already, maybe we can let
>>>> cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
>>>> in the same channel?
>>>
>>> You saw the answer in another thread, but I repeat it here for others benefit:
>>>
>>>   "CPR state cannot be sent over the normal migration channel, because devices
>>>   and backends are created prior to reading the channel, so this mode sends
>>>   CPR state over a second migration channel that is not visible to the user.
>>>   New QEMU reads the second channel prior to creating devices or backends."
>>
>> Today when looking again, I wonder about the other way round: can we make
>> the new parameter called "-incoming-cpr", working exactly the same as
>> "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
>> be reused for migration incoming ports?
>>
>> After all, cpr needs to happen already with unix sockets. Having separate
>> cmdline options grants user to make the other one to be non-unix, but that
>> doesn't seem to buy us anything.. then it seems easier to always reuse it,
>> and restrict cpr-transfer to only work with unix sockets for incoming too?
>
> IMHO we should not be adding any new command line parameter at all,
> and in fact we should actually deprecate the existing "-incoming",
> except when used with "defer".
>
> An application managing migration should be doing all the configuration
> via QMP

This is devilish hard to implement for cpr-uri, because it must be known
before any backends or devices are created. The existing preconfig phase
occurs too late.

One must define a new precreate phase which occurs before any backends or
devices are created. Requires a new -precreate option and a precreate-exit
qmp command.

Untangle catch-22 dependencies amongst properties, machine, and accelerator,
so that migration_object_init() can be called early, so that migration
commands are supported in the monitor.

Extract monitor specific options and start a monitor (and first create
monitor chardevs, an exception to the "no creation" rule).

If/when someone tackles the "all configuration via QMP" project, I would
be happy to advise, but right now a cpr-uri command-line parameter is
a sane and simple option.

- Steve
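The ordering constraint behind this reply can be shown with a toy model. It is a sketch only: `ToyCprState`, `create_backend`, and `boot` are hypothetical names, the descriptor number is made up, and nothing here reflects QEMU's actual startup code. The point is that a backend can reuse a preserved descriptor only if the CPR state was loaded before the backend was created.

```python
class ToyCprState:
    """Hypothetical table of preserved descriptors, keyed by name."""

    def __init__(self) -> None:
        self.fds: dict[str, int] = {}

    def load(self, channel: dict[str, int]) -> None:
        # Stand-in for reading CPR state from the second channel.
        self.fds.update(channel)


def create_backend(name: str, cpr: ToyCprState) -> str:
    # A backend looks for a preserved descriptor at creation time.
    fd = cpr.fds.get(name)
    if fd is None:
        return f"{name}: allocated fresh memory"
    return f"{name}: reused preserved fd {fd}"


def boot(load_cpr_first: bool) -> str:
    channel = {"ram0": 42}  # made-up descriptor sent by old QEMU
    cpr = ToyCprState()
    if load_cpr_first:
        cpr.load(channel)
    result = create_backend("ram0", cpr)
    if not load_cpr_first:
        cpr.load(channel)  # too late, the backend already exists
    return result


if __name__ == "__main__":
    print(boot(load_cpr_first=True))
    print(boot(load_cpr_first=False))
```

In this model, a monitor that becomes available only after backend creation corresponds to `load_cpr_first=False`, which is why the reply argues that the channel has to be named on the command line or in a new, earlier phase.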
On Fri, Aug 16, 2024 at 11:13:36AM -0400, Steven Sistare wrote:
> On 8/15/2024 4:28 PM, Peter Xu wrote:
> > On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
> > > > > The new user-visible interfaces are:
> > > > >   * cpr-transfer (MigMode migration parameter)
> > > > >   * cpr-uri (migration parameter)
> > > >
> > > > I wonder whether this parameter can be avoided already, maybe we can let
> > > > cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
> > > > in the same channel?
> > >
> > > You saw the answer in another thread, but I repeat it here for others benefit:
> > >
> > >   "CPR state cannot be sent over the normal migration channel, because devices
> > >   and backends are created prior to reading the channel, so this mode sends
> > >   CPR state over a second migration channel that is not visible to the user.
> > >   New QEMU reads the second channel prior to creating devices or backends."
> >
> > Today when looking again, I wonder about the other way round: can we make
> > the new parameter called "-incoming-cpr", working exactly the same as
> > "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
> > be reused for migration incoming ports?
> >
> > After all, cpr needs to happen already with unix sockets. Having separate
> > cmdline options grants user to make the other one to be non-unix, but that
> > doesn't seem to buy us anything.. then it seems easier to always reuse it,
> > and restrict cpr-transfer to only work with unix sockets for incoming too?
>
> This idea also occurred to me, but I dislike the loss of flexibility for
> the incoming socket type. The exec URI in particular can do anything, and
> we would be eliminating it.

Ah, I would be guessing that if Juan is still around then exec URI should
already been marked deprecated and prone to removal soon.. while I tend to
agree that exec does introduce some complexity meanwhile iiuc nobody uses
that in production systems.

What's the exec use case you're picturing? Would that mostly for debugging
purpose, and would that be easily replaceable with another tunnelling like
"ncat" or so?

>
> I also think -incoming-cpr has equal or greater "specification complexity" than
> -cpr-uri. They both add a new option, and the former also modifies the behavior
> of an existing option (disallows it and subsumes it).

Please see Dan's comment elsewhere. I wonder whether we can do it without
new qemu cmdlines if that's what we're looking for for the long term. If
QMP is accessible before cpr early loads, then we can set cpr mode and
reuse migrate-incoming with no new parameter added.
On Fri, Aug 16, 2024 at 11:23:01AM -0400, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 11:13:36AM -0400, Steven Sistare wrote:
> > On 8/15/2024 4:28 PM, Peter Xu wrote:
> > > On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
> > > > > > The new user-visible interfaces are:
> > > > > >   * cpr-transfer (MigMode migration parameter)
> > > > > >   * cpr-uri (migration parameter)
> > > > >
> > > > > I wonder whether this parameter can be avoided already, maybe we can let
> > > > > cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
> > > > > in the same channel?
> > > >
> > > > You saw the answer in another thread, but I repeat it here for others benefit:
> > > >
> > > >   "CPR state cannot be sent over the normal migration channel, because devices
> > > >   and backends are created prior to reading the channel, so this mode sends
> > > >   CPR state over a second migration channel that is not visible to the user.
> > > >   New QEMU reads the second channel prior to creating devices or backends."
> > >
> > > Today when looking again, I wonder about the other way round: can we make
> > > the new parameter called "-incoming-cpr", working exactly the same as
> > > "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
> > > be reused for migration incoming ports?
> > >
> > > After all, cpr needs to happen already with unix sockets. Having separate
> > > cmdline options grants user to make the other one to be non-unix, but that
> > > doesn't seem to buy us anything.. then it seems easier to always reuse it,
> > > and restrict cpr-transfer to only work with unix sockets for incoming too?
> >
> > This idea also occurred to me, but I dislike the loss of flexibility for
> > the incoming socket type. The exec URI in particular can do anything, and
> > we would be eliminating it.
>
> Ah, I would be guessing that if Juan is still around then exec URI should
> already been marked deprecated and prone to removal soon.. while I tend to
> agree that exec does introduce some complexity meanwhile iiuc nobody uses
> that in production systems.
>
> What's the exec use case you're picturing? Would that mostly for debugging
> purpose, and would that be easily replaceable with another tunnelling like
> "ncat" or so?

Conceptually "exec:" is a nice thing, but from a practical POV it
introduces difficulties for QEMU. QEMU doesn't know if the exec'd
command will provide a unidirectional channel or bidirectional
channel, so has to assume the worst - unidirectional. It also can't
know if it is safe to run the exec multiple times, or is only valid
to run it once - so again has to assume once only.

We could fix those by adding further flags in the migration address
to indicate if its bi-directional & multi-channel safe.

Technically "exec" is obsolete given "fd", but then that applies to
literally all protocols. Implementing them in QEMU is a more user
friendly thing.

Exec was more compelling when QEMU's other protocols were less
mature, lacking TLS for example, but I still find it interesting
as a facility.

With regards,
Daniel
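The unidirectional versus bidirectional distinction can be made concrete with a small sketch, assuming Linux semantics. A pipe stands in for the kind of one-way channel a spawned command might offer, and a socketpair for a two-way channel. Both function names are hypothetical, and the sketch says nothing about how QEMU itself wires up "exec:".

```python
import os
import socket


def pipe_is_one_way() -> bool:
    """Return True if data can only flow from the write end to the read end."""
    read_end, write_end = os.pipe()
    try:
        os.write(write_end, b"ping")
        assert os.read(read_end, 4) == b"ping"
        try:
            # The reverse direction fails on Linux: the read end is
            # not open for writing.
            os.write(read_end, b"pong")
        except OSError:
            return True
        return False
    finally:
        os.close(read_end)
        os.close(write_end)


def socketpair_is_two_way() -> bool:
    """Return True if both ends can send and receive."""
    a, b = socket.socketpair()
    try:
        a.sendall(b"ping")
        forward = b.recv(4)
        b.sendall(b"pong")
        backward = a.recv(4)
        return (forward, backward) == (b"ping", b"pong")
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(pipe_is_one_way(), socketpair_is_two_way())
```

A caller that cannot tell which of the two it was handed has to plan for the pipe case, which matches the "assume the worst" reasoning in the message.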
On Fri, Aug 16, 2024 at 04:36:58PM +0100, Daniel P. Berrangé wrote:
> On Fri, Aug 16, 2024 at 11:23:01AM -0400, Peter Xu wrote:
> > On Fri, Aug 16, 2024 at 11:13:36AM -0400, Steven Sistare wrote:
> > > On 8/15/2024 4:28 PM, Peter Xu wrote:
> > > > On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
> > > > > > > The new user-visible interfaces are:
> > > > > > >   * cpr-transfer (MigMode migration parameter)
> > > > > > >   * cpr-uri (migration parameter)
> > > > > >
> > > > > > I wonder whether this parameter can be avoided already, maybe we can let
> > > > > > cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
> > > > > > in the same channel?
> > > > >
> > > > > You saw the answer in another thread, but I repeat it here for others benefit:
> > > > >
> > > > >   "CPR state cannot be sent over the normal migration channel, because devices
> > > > >   and backends are created prior to reading the channel, so this mode sends
> > > > >   CPR state over a second migration channel that is not visible to the user.
> > > > >   New QEMU reads the second channel prior to creating devices or backends."
> > > >
> > > > Today when looking again, I wonder about the other way round: can we make
> > > > the new parameter called "-incoming-cpr", working exactly the same as
> > > > "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
> > > > be reused for migration incoming ports?
> > > >
> > > > After all, cpr needs to happen already with unix sockets. Having separate
> > > > cmdline options grants user to make the other one to be non-unix, but that
> > > > doesn't seem to buy us anything.. then it seems easier to always reuse it,
> > > > and restrict cpr-transfer to only work with unix sockets for incoming too?
> > >
> > > This idea also occurred to me, but I dislike the loss of flexibility for
> > > the incoming socket type. The exec URI in particular can do anything, and
> > > we would be eliminating it.
> >
> > Ah, I would be guessing that if Juan is still around then exec URI should
> > already been marked deprecated and prone to removal soon.. while I tend to
> > agree that exec does introduce some complexity meanwhile iiuc nobody uses
> > that in production systems.
> >
> > What's the exec use case you're picturing? Would that mostly for debugging
> > purpose, and would that be easily replaceable with another tunnelling like
> > "ncat" or so?
>
> Conceptually "exec:" is a nice thing, but from a practical POV it
> introduces difficulties for QEMU. QEMU doesn't know if the exec'd
> command will provide a unidirectional channel or bidirectional
> channel, so has to assume the worst - unidirectional. It also can't
> know if it is safe to run the exec multiple times, or is only valid
> to run it once - so again has to assume once only.
>
> We could fix those by adding further flags in the migration address
> to indicate if its bi-directional & multi-channel safe.
>
> Technically "exec" is obsolete given "fd", but then that applies to
> literally all protocols. Implementing them in QEMU is a more user
> friendly thing.
>
> Exec was more compelling when QEMU's other protocols were less
> mature, lacking TLS for example, but I still find it interesting
> as a facility.

Right, it's an interesting idea on its own. It's just that when QEMU grows
into not only a tool anymore it adds burden on top as you discussed, in
which case we consider dropping things as wins (and we already started
doing so at least in migration, but iiuc it's not limited to migration).

Again, it looks reasonable to drop because I think it's too easy to tool-up
the same "exec:" function with ncat or similar things. E.g. kubevirt does
TLS even today without qemu's TLS, and AFAIU that's based on unix sockets
not exec, and it tunnels to the daemon for TLS encryption (which is prone
of removal, though). So even that is not leveraged as we thought.

Thanks,
On Fri, Aug 16, 2024 at 11:14:38AM -0400, Steven Sistare wrote:
> On 8/16/2024 4:42 AM, Daniel P. Berrangé wrote:
> > On Thu, Aug 15, 2024 at 04:28:59PM -0400, Peter Xu wrote:
> > > On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
> > > > > > The new user-visible interfaces are:
> > > > > >   * cpr-transfer (MigMode migration parameter)
> > > > > >   * cpr-uri (migration parameter)
> > > > >
> > > > > I wonder whether this parameter can be avoided already, maybe we can let
> > > > > cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
> > > > > in the same channel?
> > > >
> > > > You saw the answer in another thread, but I repeat it here for others benefit:
> > > >
> > > >   "CPR state cannot be sent over the normal migration channel, because devices
> > > >   and backends are created prior to reading the channel, so this mode sends
> > > >   CPR state over a second migration channel that is not visible to the user.
> > > >   New QEMU reads the second channel prior to creating devices or backends."
> > >
> > > Today when looking again, I wonder about the other way round: can we make
> > > the new parameter called "-incoming-cpr", working exactly the same as
> > > "cpr-uri" qemu cmdline, but then after cpr is loaded it'll be automatically
> > > be reused for migration incoming ports?
> > >
> > > After all, cpr needs to happen already with unix sockets. Having separate
> > > cmdline options grants user to make the other one to be non-unix, but that
> > > doesn't seem to buy us anything.. then it seems easier to always reuse it,
> > > and restrict cpr-transfer to only work with unix sockets for incoming too?
> >
> > IMHO we should not be adding any new command line parameter at all,
> > and in fact we should actually deprecate the existing "-incoming",
> > except when used with "defer".
> >
> > An application managing migration should be doing all the configuration
> > via QMP
>
> This is devilish hard to implement for cpr-uri, because it must be known
> before any backends or devices are created. The existing preconfig phase
> occurs too late.
>
> One must define a new precreate phase which occurs before any backends or
> devices are created. Requires a new -precreate option and a precreate-exit
> qmp command.
>
> Untangle catch-22 dependencies amongst properties, machine, and accelerator,
> so that migration_object_init() can be called early, so that migration
> commands are supported in the monitor.
>
> Extract monitor specific options and start a monitor (and first create
> monitor chardevs, an exception to the "no creation" rule).
>
> If/when someone tackles the "all configuration via QMP" project, I would
> be happy to advise, but right now a cpr-uri command-line parameter is
> a sane and simple option.

So far it looks ok to me from migration POV, but that definitely sounds
like relevant to whoever cares about this go-via-QMP-only project. Let me
at least copy Paolo as I know he's in that, maybe anyone else too just to
collect feedbacks, or just to let people know one more exception might be
needed (if no NACK will come later).

Thanks,
On 8/16/2024 11:59 AM, Peter Xu wrote:
> On Fri, Aug 16, 2024 at 04:36:58PM +0100, Daniel P. Berrangé wrote:
>> On Fri, Aug 16, 2024 at 11:23:01AM -0400, Peter Xu wrote:
>>> On Fri, Aug 16, 2024 at 11:13:36AM -0400, Steven Sistare wrote:
>>>> On 8/15/2024 4:28 PM, Peter Xu wrote:
>>>>> On Sat, Jul 20, 2024 at 04:07:50PM -0400, Steven Sistare wrote:
>>>>>>>> The new user-visible interfaces are:
>>>>>>>> * cpr-transfer (MigMode migration parameter)
>>>>>>>> * cpr-uri (migration parameter)
>>>>>>>
>>>>>>> I wonder whether this parameter can be avoided already, maybe we can let
>>>>>>> cpr-transfer depend on unix socket in -incoming, then integrate fd sharing
>>>>>>> in the same channel?
>>>>>>
>>>>>> You saw the answer in another thread, but I repeat it here for others' benefit:
>>>>>>
>>>>>> "CPR state cannot be sent over the normal migration channel, because devices
>>>>>> and backends are created prior to reading the channel, so this mode sends
>>>>>> CPR state over a second migration channel that is not visible to the user.
>>>>>> New QEMU reads the second channel prior to creating devices or backends."
>>>>>
>>>>> Today when looking again, I wonder about the other way round: can we make
>>>>> the new parameter called "-incoming-cpr", working exactly the same as
>>>>> "cpr-uri" qemu cmdline, but then after cpr is loaded it'll automatically
>>>>> be reused for migration incoming ports?
>>>>>
>>>>> After all, cpr needs to happen already with unix sockets. Having separate
>>>>> cmdline options grants user to make the other one to be non-unix, but that
>>>>> doesn't seem to buy us anything.. then it seems easier to always reuse it,
>>>>> and restrict cpr-transfer to only work with unix sockets for incoming too?
>>>>
>>>> This idea also occurred to me, but I dislike the loss of flexibility for
>>>> the incoming socket type. The exec URI in particular can do anything, and
>>>> we would be eliminating it.
>>>
>>> Ah, I would be guessing that if Juan is still around then exec URI should
>>> already have been marked deprecated and prone to removal soon.. while I tend to
>>> agree that exec does introduce some complexity meanwhile iiuc nobody uses
>>> that in production systems.
>>>
>>> What's the exec use case you're picturing? Would that be mostly for debugging
>>> purposes, and would that be easily replaceable with another tunnelling like
>>> "ncat" or so?
>>
>> Conceptually "exec:" is a nice thing, but from a practical POV it
>> introduces difficulties for QEMU. QEMU doesn't know if the exec'd
>> command will provide a unidirectional channel or bidirectional
>> channel, so has to assume the worst - unidirectional. It also can't
>> know if it is safe to run the exec multiple times, or is only valid
>> to run it once - so again has to assume once only.
>>
>> We could fix those by adding further flags in the migration address
>> to indicate if it's bi-directional & multi-channel safe.
>>
>> Technically "exec" is obsolete given "fd", but then that applies to
>> literally all protocols. Implementing them in QEMU is a more user
>> friendly thing.
>>
>> Exec was more compelling when QEMU's other protocols were less
>> mature, lacking TLS for example, but I still find it interesting
>> as a facility.
>
> Right, it's an interesting idea on its own. It's just that when QEMU grows
> into not only a tool anymore it adds burden on top as you discussed, in
> which case we consider dropping things as wins (and we already started
> doing so at least in migration, but iiuc it's not limited to migration).
>
> Again, it looks reasonable to drop because I think it's too easy to tool-up
> the same "exec:" function with ncat or similar things. E.g. kubevirt does
> TLS even today without qemu's TLS, and AFAIU that's based on unix sockets
> not exec, and it tunnels to the daemon for TLS encryption (which is prone
> to removal, though). So even that is not leveraged as we thought.

Also, the "fd" URI would not work. We could not read from it once for cpr state,
reopen it, and read again for migration state.

Nor multifd.

- Steve
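For readers following the socket-type argument: the constraint that the CPR channel be a unix socket comes from SCM_RIGHTS, which hands open file descriptors to the peer process as ancillary data. The sketch below is not QEMU code. It is a minimal Python illustration of the mechanism, assuming Python 3.9 or later on a Unix host; the helper names, the label, and the payload are all made up for the example, and a pipe stands in for whatever descriptor old QEMU would preserve.

```python
import os
import socket


def send_descriptor(sock, fd, label):
    """Send one open descriptor plus a short label using SCM_RIGHTS."""
    socket.send_fds(sock, [label], [fd])


def recv_descriptor(sock):
    """Receive the label and one descriptor from the peer."""
    data, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    return data, fds[0]


def demo():
    # A connected AF_UNIX pair stands in for the old and new QEMU
    # ends of the CPR channel (hypothetical roles, same process here).
    old_side, new_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # A pipe plays the part of a descriptor that must stay open
    # across the update.
    read_end, write_end = os.pipe()
    os.write(write_end, b"guest-ram")
    os.close(write_end)

    # The sender passes the descriptor, then drops its own copy.
    send_descriptor(old_side, read_end, b"memfd")
    os.close(read_end)

    # The receiver gets a new descriptor number that refers to the
    # same open file, so the buffered data is still readable.
    label, received_fd = recv_descriptor(new_side)
    payload = os.read(received_fd, 64)

    os.close(received_fd)
    old_side.close()
    new_side.close()
    return label, payload


if __name__ == "__main__":
    print(demo())
```

TCP and exec-style pipes carry no ancillary data, which is why descriptor passing, and therefore the CPR channel in this discussion, is tied to unix sockets.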
On 8/16/2024 2:34 PM, Steven Sistare wrote:
> [...]
> Also, the "fd" URI would not work. We could not read from it once for cpr state,
> reopen it, and read again for migration state.
>
> Nor multifd.

Am I wrong?

I still go back to my original statement: -incoming-cpr has equal or greater
"specification complexity" than -cpr-uri. It is not simpler, and comes with
restrictions.

- Steve
On 8/20/2024 12:29 PM, Steven Sistare wrote:
> [...]
> Am I wrong?
>
> I still go back to my original statement: -incoming-cpr has equal or greater
> "specification complexity" than -cpr-uri. It is not simpler, and comes with
> restrictions.

Hi Peter, before I post V2 of this series, I would like to reach agreement on this
interface. I cannot tell if you have gone quiet on this thread because you agree,
disagree, or are on vacation!

- Steve
On Wed, Sep 04, 2024 at 05:14:37PM -0400, Steven Sistare wrote:
> Hi Peter, before I post V2 of this series, I would like to reach agreement on this
> interface. I cannot tell if you have gone quiet on this thread because you agree,
> disagree, or are on vacation!

Hey Steve,

I'm not on vacation. :) But indeed I had some other stuff to look at
recently, so can be slower (than my usual slow..).

I think I finished reading this one, but I'll double check tomorrow and let
you know if there's something.
On Tue, Aug 20, 2024 at 12:29:07PM -0400, Steven Sistare wrote:
> I still go back to my original statement: -incoming-cpr has equal or greater
> "specification complexity" than -cpr-uri. It is not simpler, and comes with
> restrictions.

I'm ok with it, as long as Dan and others are ok.

I remember I should have copied Paolo but I didn't see him in the loop, so
doing it again. Some context: it's about introducing yet another qemu cmdline
option for cpr-transfer to specify a URI port for early stage loading of QEMU
internal stuff, on which Dan raised a concern that it goes against the efforts
to manage the VM using QMP only and to remove cmdline options.

For a generic follow up to yesterday's email: I was trying to re-read the
thread but I found that there can be a lot of changes, so I figured maybe I
should wait for the new version. IOW nothing else I can think of yet to
comment on the series.

Thanks,