Message ID | 20200116180321.24968-1-olaf@aepfle.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] hw/i386: disable smbus migration for xenfv | expand |
On 16/01/20 19:03, Olaf Hering wrote: > With commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea a new member > smbus_no_migration_support was added, and enabled in two places. > With commit 4ab2f2a8aabfea95cc53c64e13b3f67960b27fdf the vmstate_acpi > got new elements, which are conditionally filled. As a result, an > incoming migration expected smbus related data unless smbus migration > was disabled for a given MachineClass. > > Since commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea forgot to handle > xenfv, live migration to receiving hosts using qemu-4.0 and later is broken. > > Therefore this patch must be applied to stable-4.x as well. > > Signed-off-by: Olaf Hering <olaf@aepfle.de> > --- > > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c > index fa12203079..e19716d0d3 100644 > --- a/hw/i386/pc_piix.c > +++ b/hw/i386/pc_piix.c > @@ -952,6 +952,7 @@ static void xenfv_machine_options(MachineClass *m) > m->desc = "Xen Fully-virtualized PC"; > m->max_cpus = HVM_MAX_VCPUS; > m->default_machine_opts = "accel=xen"; > + m->smbus_no_migration_support = true; > } > > DEFINE_PC_MACHINE(xenfv, "xenfv", pc_xen_hvm_init, > This patch is wrong; xenfv does not support cross-version migration compatibility. Even if the migration stream does not change, the hardware exposed to the guest will. My understanding is that Xen is able to use "-M pc-i440fx-VERSION,accel=xen". The presence of the version in the machine type guarantees that the migration stream is compatible and that the hardware exposed to the guest is the same on the source and destination. Is this incorrect? Paolo
Am Thu, 16 Jan 2020 19:26:39 +0100
schrieb Paolo Bonzini <pbonzini@redhat.com>:
> xenfv does not support cross-version migration compatibility.
Wait, what does that mean?
So far live migration of a running domU must be possible from Xen N to Xen N+1.
It would be more than unexpected if the target host running "Xen N+1" must have the very same qemu version as "Xen N".
Olaf
Il gio 16 gen 2020, 19:33 Olaf Hering <olaf@aepfle.de> ha scritto: > Am Thu, 16 Jan 2020 19:26:39 +0100 > schrieb Paolo Bonzini <pbonzini@redhat.com>: > > > xenfv does not support cross-version migration compatibility. > > Wait, what does that mean? > So far live migration of a running domU must be possible from Xen N to Xen > N+1. > The Xen N+1 must know that the source is running Xen N, and use a machine type corresponding to the QEMU version that was shipped with Xen N. For new virtual machines Xen must also use pc-i440fx-M.N and not just PC. This is not new, versioned machine types were introduced in QEMU 0.10 or something like that for exactly this purpose. Paolo It would be more than unexpected if the target host running "Xen N+1" must > have the very same qemu version as "Xen N". > > Olaf >
Am Thu, 16 Jan 2020 19:26:39 +0100 schrieb Paolo Bonzini <pbonzini@redhat.com>: > My understanding is that Xen is able to use "-M > pc-i440fx-VERSION,accel=xen". The presence of the version in the > machine type guarantees that the migration stream is compatible and that > the hardware exposed to the guest is the same on the source and destination. It seems 'xenfv' is not a 'pc-i440fx-X.X', even with accel=xen. The "xen-platform" will not be created and as a result no PV driver can be used. I do not know what a 'xenfv' is supposed to be, and how it is supposed to behave compatible for all existing and future guests. Olaf
On 17/01/20 10:22, Olaf Hering wrote: > Am Thu, 16 Jan 2020 19:26:39 +0100 > schrieb Paolo Bonzini <pbonzini@redhat.com>: > >> My understanding is that Xen is able to use "-M >> pc-i440fx-VERSION,accel=xen". The presence of the version in the >> machine type guarantees that the migration stream is compatible and that >> the hardware exposed to the guest is the same on the source and destination. > > It seems 'xenfv' is not a 'pc-i440fx-X.X', even with accel=xen. > The "xen-platform" will not be created and as a result no PV driver can be used. Just add "-device xen-platform" to the command line. > I do not know what a 'xenfv' is supposed to be, and how it is supposed to > behave compatible for all existing and future guests. It's nothing more than a synonym for "-machine pc -device xen-platform -accel xen" (plus some magic to support igd passthrough, which we could and should move to the pc machine type as well). It doesn't even try to be compatible for all existing and future guests. Paolo
Am Fri, 17 Jan 2020 11:27:59 +0100
schrieb Paolo Bonzini <pbonzini@redhat.com>:
> It doesn't even try to be compatible for all existing and future guests.
This looks like the underlying bug.
What would future domUs lose if 'xenfv' would be locked to 'pc-i440fx-3.0'?
Olaf
On Fri, 17 Jan 2020 at 13:06, Olaf Hering <olaf@aepfle.de> wrote: > > Am Fri, 17 Jan 2020 11:27:59 +0100 > schrieb Paolo Bonzini <pbonzini@redhat.com>: > > > It doesn't even try to be compatible for all existing and future guests. > > This looks like the underlying bug. > > What would future domUs lose if 'xenfv' would be locked to 'pc-i440fx-3.0'? > I guess eventually that pc type would be removed and then we'd have a compat issue. Ideally I think libxl should simply not use xenfv and then it can be deprecated and removed, and then such issues can be dealt with directly in xl/libxl. Paul > > Olaf
Am Mon, 20 Jan 2020 11:18:41 +0000 schrieb Paul Durrant <pdurrant@gmail.com>: > On Fri, 17 Jan 2020 at 13:06, Olaf Hering <olaf@aepfle.de> wrote: > > What would future domUs lose if 'xenfv' would be locked to 'pc-i440fx-3.0'? > I guess eventually that pc type would be removed and then we'd have a > compat issue. Ideally I think libxl should simply not use xenfv and > then it can be deprecated and removed, and then such issues can be > dealt with directly in xl/libxl. I think this does not answer the question at all. What future versions of libxl do is one thing. What existing versions of libxl do with future versions of qemu is another. IMO it was wrong to map "xenfv" to "pc", simply because it entirely ignores live migration. We were just lucky until qemu-3.1. Maybe the creators of 'xenfv' just meant it to be "do everything to make it compatible with HVM". What about a variant of this change, to lock 'xenfv' to 'qemu-3.0'? --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -490,6 +490,13 @@ static void pc_i440fx_3_0_machine_options(MachineClass *m) pc_i440fx_3_1_machine_options(m); compat_props_add(m->compat_props, hw_compat_3_0, hw_compat_3_0_len); compat_props_add(m->compat_props, pc_compat_3_0, pc_compat_3_0_len); + + m->alias = "xenfv"; + if (xen_enabled()) { + m->desc = "Xen Fully-virtualized PC"; + m->max_cpus = HVM_MAX_VCPUS; + m->default_machine_opts = "accel=xen"; + } } DEFINE_I440FX_MACHINE(v3_0, "pc-i440fx-3.0", NULL, @@ -500,6 +507,7 @@ static void pc_i440fx_2_12_machine_options(MachineClass *m) pc_i440fx_3_0_machine_options(m); compat_props_add(m->compat_props, hw_compat_2_12, hw_compat_2_12_len); compat_props_add(m->compat_props, pc_compat_2_12, pc_compat_2_12_len); + m->alias = NULL; } DEFINE_I440FX_MACHINE(v2_12, "pc-i440fx-2.12", NULL, @@ -946,14 +954,3 @@ DEFINE_PC_MACHINE(isapc, "isapc", pc_init_isa, isapc_machine_options); -#ifdef CONFIG_XEN -static void xenfv_machine_options(MachineClass *m) -{ - m->desc = "Xen Fully-virtualized PC"; - m->max_cpus = HVM_MAX_VCPUS; - m->default_machine_opts = "accel=xen"; -} - -DEFINE_PC_MACHINE(xenfv, "xenfv", pc_xen_hvm_init, - xenfv_machine_options); -#endif Olaf
On 20/01/20 12:18, Paul Durrant wrote: >> >> What would future domUs lose if 'xenfv' would be locked to 'pc-i440fx-3.0'? > > I guess eventually that pc type would be removed and then we'd have a > compat issue. That's years away, so ideally libxl would have migrated away from xenfv before that. For now, sticking to a fixed version as in Olaf's patch is a good stopgap measure. Paolo > Ideally I think libxl should simply not use xenfv and > then it can be deprecated and removed, and then such issues can be > dealt with directly in xl/libxl.
Am Mon, 27 Jan 2020 10:35:59 +0100 schrieb Paolo Bonzini <pbonzini@redhat.com>: > That's years away, so ideally libxl would have migrated away from xenfv > before that. For now, sticking to a fixed version as in Olaf's patch is > a good stopgap measure. Is there a way to inspect a running qemu process to see what version it is? I assume one thing is to poke at /proc/$PID/cmdline and make some guesses. Would a running qemu report what pc-i440fx it supports? With such info an enlightened libxl might be able construct a compatible commandline for the receiving host. Olaf
On 27/01/20 14:26, Olaf Hering wrote: >> That's years away, so ideally libxl would have migrated away from >> xenfv before that. For now, sticking to a fixed version as in >> Olaf's patch is a good stopgap measure. > Is there a way to inspect a running qemu process to see what version > it is? I assume one thing is to poke at /proc/$PID/cmdline and make > some guesses. Would a running qemu report what pc-i440fx it supports? Yes, via QMP. For example on QEMU 3.1 with "-M pc" you would get: {"execute":"qom-get", "arguments":{"path":"/machine", "property":"type"}} {"return": "pc-i440fx-3.1-machine"} So libxl would start QEMU with "-M pc,accel=xen -device xen-platform" when _not_ migrating, but on the destination of live migration it would query the machine type and use "-Mpc-i440fx-3.1,accel=xen -device xen-platform". A cleaner possibility is to do {"execute": "query-machines"} and search the result for an entry like {"hotpluggable-cpus": true, "name": "pc-i440fx-3.1", "is-default": true, "cpu-max": 255, "alias": "pc"} i.e. the name corresponding to the entry with "alias": "pc" would be used on the destination. Thanks, Paolo > With such info an enlightened libxl might be able construct a > compatible commandline for the receiving host.
The approach below (making 'xenfv' an alias of 'pc') does not work: xen_enabled() is false when pc_i440fx_3_1_machine_options runs. So, how is this incompatibility between qemu2/3 and qemu4+ supposed to be fixed? Using '-machine pc,accel=xen -device xen-platform' is incompatible with '-machine xenpv' because the platform device has a different PCI address. As such it is not migrateable. Olaf Am Mon, 27 Jan 2020 10:09:51 +0100 schrieb Olaf Hering <olaf@aepfle.de>: > Am Mon, 20 Jan 2020 11:18:41 +0000 > schrieb Paul Durrant <pdurrant@gmail.com>: > > > On Fri, 17 Jan 2020 at 13:06, Olaf Hering <olaf@aepfle.de> wrote: > > > What would future domUs lose if 'xenfv' would be locked to 'pc-i440fx-3.0'? > > > I guess eventually that pc type would be removed and then we'd have a > > compat issue. Ideally I think libxl should simply not use xenfv and > > then it can be deprecated and removed, and then such issues can be > > dealt with directly in xl/libxl. > > I think this does not answer the question at all. > What future versions of libxl do is one thing. > What existing versions of libxl do with future versions of qemu is another. > > > IMO it was wrong to map "xenfv" to "pc", simply because it entirely > ignores live migration. We were just lucky until qemu-3.1. Maybe the > creators of 'xenfv' just meant it to be "do everything to make it > compatible with HVM". > > What about a variant of this change, to lock 'xenfv' to 'qemu-3.0'? > > > --- a/hw/i386/pc_piix.c > +++ b/hw/i386/pc_piix.c > @@ -490,6 +490,13 @@ static void pc_i440fx_3_0_machine_options(MachineClass *m) > pc_i440fx_3_1_machine_options(m); > compat_props_add(m->compat_props, hw_compat_3_0, hw_compat_3_0_len); > compat_props_add(m->compat_props, pc_compat_3_0, pc_compat_3_0_len); > + > + m->alias = "xenfv"; > + if (xen_enabled()) { > + m->desc = "Xen Fully-virtualized PC"; > + m->max_cpus = HVM_MAX_VCPUS; > + m->default_machine_opts = "accel=xen"; > + } > } > > DEFINE_I440FX_MACHINE(v3_0, "pc-i440fx-3.0", NULL, > @@ -500,6 +507,7 @@ static void pc_i440fx_2_12_machine_options(MachineClass *m) > pc_i440fx_3_0_machine_options(m); > compat_props_add(m->compat_props, hw_compat_2_12, hw_compat_2_12_len); > compat_props_add(m->compat_props, pc_compat_2_12, pc_compat_2_12_len); > + m->alias = NULL; > } > > DEFINE_I440FX_MACHINE(v2_12, "pc-i440fx-2.12", NULL, > @@ -946,14 +954,3 @@ DEFINE_PC_MACHINE(isapc, "isapc", pc_init_isa, > isapc_machine_options); > > > -#ifdef CONFIG_XEN > -static void xenfv_machine_options(MachineClass *m) > -{ > - m->desc = "Xen Fully-virtualized PC"; > - m->max_cpus = HVM_MAX_VCPUS; > - m->default_machine_opts = "accel=xen"; > -} > - > -DEFINE_PC_MACHINE(xenfv, "xenfv", pc_xen_hvm_init, > - xenfv_machine_options); > -#endif > > > Olaf
On 18/02/20 18:27, Olaf Hering wrote: > The approach below (making 'xenfv' an alias of 'pc') does not work: > xen_enabled() is false when pc_i440fx_3_1_machine_options runs. Don't use an alias, copy the 3.1 code into the xenfv machine type and/or call the 3.1 functions from the xenfv machine type. Paolo > So, how is this incompatibility between qemu2/3 and qemu4+ supposed to be fixed? > Using '-machine pc,accel=xen -device xen-platform' is incompatible > with '-machine xenpv' because the platform device has a different PCI > address. As such it is not migrateable.
Am Tue, 18 Feb 2020 18:37:09 +0100 schrieb Paolo Bonzini <pbonzini@redhat.com>: > On 18/02/20 18:27, Olaf Hering wrote: > > The approach below (making 'xenfv' an alias of 'pc') does not work: > > xen_enabled() is false when pc_i440fx_3_1_machine_options runs. > Don't use an alias, copy the 3.1 code into the xenfv machine type and/or > call the 3.1 functions from the xenfv machine type. Since pci_create_simple must be called after pc_init1, the change appears to be as simple as this: @@ -949,6 +953,7 @@ DEFINE_PC_MACHINE(isapc, "isapc", pc_init_isa, #ifdef CONFIG_XEN static void xenfv_machine_options(MachineClass *m) { + pc_i440fx_3_1_machine_options(m); m->desc = "Xen Fully-virtualized PC"; m->max_cpus = HVM_MAX_VCPUS; m->default_machine_opts = "accel=xen"; There will likely be an argument about making xenfv compatible with 3.1 or 4.2. I guess the only consensus will be configure option. Olaf
Am Tue, 18 Feb 2020 18:37:09 +0100 schrieb Paolo Bonzini <pbonzini@redhat.com>: > On 18/02/20 18:27, Olaf Hering wrote: > > The approach below (making 'xenfv' an alias of 'pc') does not work: > > xen_enabled() is false when pc_i440fx_3_1_machine_options runs. > Don't use an alias, copy the 3.1 code into the xenfv machine type and/or > call the 3.1 functions from the xenfv machine type. In the end it may look like this. Let me know about any preferences regarding the naming of configure options and variables. Olaf diff --git a/configure b/configure index 6f5d850949..65ca345fd6 100755 --- a/configure +++ b/configure @@ -368,6 +368,7 @@ vnc_jpeg="" vnc_png="" xkbcommon="" xen="" +xen_hvm_pc_i440fx_version_3_1="" xen_ctrl_version="" xen_pci_passthrough="" linux_aio="" @@ -1162,6 +1163,10 @@ for opt do ;; --enable-xen-pci-passthrough) xen_pci_passthrough="yes" ;; + --disable-xenfv-i440fx-version-3_1) xen_hvm_pc_i440fx_version_3_1="no" + ;; + --enable-xenfv-i440fx-version-3_1) xen_hvm_pc_i440fx_version_3_1="yes" + ;; --disable-brlapi) brlapi="no" ;; --enable-brlapi) brlapi="yes" @@ -7836,6 +7841,9 @@ if supported_xen_target $target; then if test "$xen_pci_passthrough" = yes; then echo "CONFIG_XEN_PCI_PASSTHROUGH=y" >> "$config_target_mak" fi + if test "$xen_hvm_pc_i440fx_version_3_1" = yes; then + echo "CONFIG_XEN_HVM_PC_I440FX_VERSION_3_1=y" >> "$config_target_mak" + fi else echo "$target/config-devices.mak: CONFIG_XEN=n" >> $config_host_mak fi diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index fa12203079..83d1fcc0ba 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -949,6 +949,11 @@ DEFINE_PC_MACHINE(isapc, "isapc", pc_init_isa, #ifdef CONFIG_XEN static void xenfv_machine_options(MachineClass *m) { +#ifdef CONFIG_XEN_HVM_PC_I440FX_VERSION_3_1 + pc_i440fx_3_1_machine_options(m); +#else + pc_i440fx_4_2_machine_options(m); +#endif m->desc = "Xen Fully-virtualized PC"; m->max_cpus = HVM_MAX_VCPUS; m->default_machine_opts = "accel=xen";
On 18/02/20 20:44, Olaf Hering wrote: > Am Tue, 18 Feb 2020 18:37:09 +0100 > schrieb Paolo Bonzini <pbonzini@redhat.com>: > >> On 18/02/20 18:27, Olaf Hering wrote: >>> The approach below (making 'xenfv' an alias of 'pc') does not work: >>> xen_enabled() is false when pc_i440fx_3_1_machine_options runs. >> Don't use an alias, copy the 3.1 code into the xenfv machine type and/or >> call the 3.1 functions from the xenfv machine type. > > In the end it may look like this. > > Let me know about any preferences regarding the naming of configure options and variables. Has any version of Xen been released with a QEMU version above 3.1? Paolo > > Olaf > > diff --git a/configure b/configure > index 6f5d850949..65ca345fd6 100755 > --- a/configure > +++ b/configure > @@ -368,6 +368,7 @@ vnc_jpeg="" > vnc_png="" > xkbcommon="" > xen="" > +xen_hvm_pc_i440fx_version_3_1="" > xen_ctrl_version="" > xen_pci_passthrough="" > linux_aio="" > @@ -1162,6 +1163,10 @@ for opt do > ;; > --enable-xen-pci-passthrough) xen_pci_passthrough="yes" > ;; > + --disable-xenfv-i440fx-version-3_1) xen_hvm_pc_i440fx_version_3_1="no" > + ;; > + --enable-xenfv-i440fx-version-3_1) xen_hvm_pc_i440fx_version_3_1="yes" > + ;; > --disable-brlapi) brlapi="no" > ;; > --enable-brlapi) brlapi="yes" > @@ -7836,6 +7841,9 @@ if supported_xen_target $target; then > if test "$xen_pci_passthrough" = yes; then > echo "CONFIG_XEN_PCI_PASSTHROUGH=y" >> "$config_target_mak" > fi > + if test "$xen_hvm_pc_i440fx_version_3_1" = yes; then > + echo "CONFIG_XEN_HVM_PC_I440FX_VERSION_3_1=y" >> "$config_target_mak" > + fi > else > echo "$target/config-devices.mak: CONFIG_XEN=n" >> $config_host_mak > fi > diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c > index fa12203079..83d1fcc0ba 100644 > --- a/hw/i386/pc_piix.c > +++ b/hw/i386/pc_piix.c > @@ -949,6 +949,11 @@ DEFINE_PC_MACHINE(isapc, "isapc", pc_init_isa, > #ifdef CONFIG_XEN > static void xenfv_machine_options(MachineClass *m) > { > +#ifdef CONFIG_XEN_HVM_PC_I440FX_VERSION_3_1 > + pc_i440fx_3_1_machine_options(m); > +#else > + pc_i440fx_4_2_machine_options(m); > +#endif > m->desc = "Xen Fully-virtualized PC"; > m->max_cpus = HVM_MAX_VCPUS; > m->default_machine_opts = "accel=xen"; >
Am Wed, 19 Feb 2020 09:05:49 +0100
schrieb Paolo Bonzini <pbonzini@redhat.com>:
> Has any version of Xen been released with a QEMU version above 3.1?
Xen 4.13 has a copy of qemu4. But, Xen can use an external qemu. It is unknown how many supposed-to-be-migrated domUs with qemu4+ are out there. But there is a six digit number of running domUs with qemu3 out there.
Olaf
Am Thu, 16 Jan 2020 19:26:39 +0100 schrieb Paolo Bonzini <pbonzini@redhat.com>: > On 16/01/20 19:03, Olaf Hering wrote: > [...] > > This patch is wrong; xenfv does not support cross-version migration > compatibility. Even if the migration stream does not change, the > hardware exposed to the guest will. > > My understanding is that Xen is able to use "-M > pc-i440fx-VERSION,accel=xen". The presence of the version in the > machine type guarantees that the migration stream is compatible and that > the hardware exposed to the guest is the same on the source and destination. The current idea is to make 'xenfv' a copy of 'pc-i440fx-3.1'. But is this actually the desired behavior? Lets assume xenfv_machine_options calls pc_i440fx_5_0_machine_options. What impact that that have on the result of pc_init1()? Is any of the things done by pc_i440fx_5_0_machine_options and pc_i440fx_machine_options a desired, or even breaking, change for the current result of pc_xen_hvm_init? Also, do the calls to compat_props_add have negative impact on compatibility for running domUs? Olaf
Am Wed, 19 Feb 2020 12:35:30 +0100 schrieb Olaf Hering <olaf@aepfle.de>: > Is any of the things done by pc_i440fx_5_0_machine_options and > pc_i440fx_machine_options a desired, or even breaking, change for the > current result of pc_xen_hvm_init? I tried to follow a few of the initialized members: default_nic_model, perhaps the involved code paths are not called, so the NULL pointer does not matter. pvh_enabled, does this mean the PVH domU type? If yes, this would be lost when xenfv is locked at v3.1. Olaf
On 19/02/20 15:14, Olaf Hering wrote: >> Is any of the things done by pc_i440fx_5_0_machine_options and >> pc_i440fx_machine_options a desired, or even breaking, change for the >> current result of pc_xen_hvm_init? > I tried to follow a few of the initialized members: > > default_nic_model, perhaps the involved code paths are not called, so the NULL pointer does not matter. > > pvh_enabled, does this mean the PVH domU type? If yes, this would be lost when xenfv is locked at v3.1. No, pvh_enabled means recognizing uncompressed kernels and setting up the pvh.bin ROM to boot them. On Xen, this is done by the domain loader. Paolo
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index fa12203079..e19716d0d3 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -952,6 +952,7 @@ static void xenfv_machine_options(MachineClass *m) m->desc = "Xen Fully-virtualized PC"; m->max_cpus = HVM_MAX_VCPUS; m->default_machine_opts = "accel=xen"; + m->smbus_no_migration_support = true; } DEFINE_PC_MACHINE(xenfv, "xenfv", pc_xen_hvm_init,
With commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea a new member smbus_no_migration_support was added, and enabled in two places. With commit 4ab2f2a8aabfea95cc53c64e13b3f67960b27fdf the vmstate_acpi got new elements, which are conditionally filled. As a result, an incoming migration expected smbus related data unless smbus migration was disabled for a given MachineClass. Since commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea forgot to handle xenfv, live migration to receiving hosts using qemu-4.0 and later is broken. Therefore this patch must be applied to stable-4.x as well. Signed-off-by: Olaf Hering <olaf@aepfle.de> ---