Message ID | 1475147192-31228-2-git-send-email-lizhijian@cn.fujitsu.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Hi, On (Thu) 29 Sep 2016 [19:06:32], Li Zhijian wrote: > Priviously, if the source and distination have different devices, source could goto > the status "paused (postmigrate)", and the distination will exit that means no qemu > is alive. > > After this patch, at above case, source can dectect the some error early from distination > and stop the migration, source keep in status "running". How would incoming migrations from previous versions work? Amit
On 09/30/2016 02:15 PM, Amit Shah wrote: > Hi, > > On (Thu) 29 Sep 2016 [19:06:32], Li Zhijian wrote: >> Priviously, if the source and distination have different devices, source could goto >> the status "paused (postmigrate)", and the distination will exit that means no qemu >> is alive. >> >> After this patch, at above case, source can dectect the some error early from distination >> and stop the migration, source keep in status "running". > > How would incoming migrations from previous versions work? You are right. we need to consider more. How about that: we need to introduce a new section type(e.g: QEMU_VM_SECTION_DEVICE_LIST). source side: - at the beginning of qemu_savevm_state_begin(), send QEMU_VM_SECTION_DEVICE_LIST first - original path dst side: - if we got the QEMU_VM_SECTION_DEVICE_LIST, have a check with the devices(name,version) - otherwise original path Please correct me. Thanks Zhijian > > > Amit > > >
* Li Zhijian (lizhijian@cn.fujitsu.com) wrote: > > > On 09/30/2016 02:15 PM, Amit Shah wrote: > > Hi, > > > > On (Thu) 29 Sep 2016 [19:06:32], Li Zhijian wrote: > > > Priviously, if the source and distination have different devices, source could goto > > > the status "paused (postmigrate)", and the distination will exit that means no qemu > > > is alive. > > > > > > After this patch, at above case, source can dectect the some error early from distination > > > and stop the migration, source keep in status "running". > > > > How would incoming migrations from previous versions work? > > You are right. we need to consider more. > > How about that: > we need to introduce a new section type(e.g: QEMU_VM_SECTION_DEVICE_LIST). > > source side: > - at the beginning of qemu_savevm_state_begin(), send QEMU_VM_SECTION_DEVICE_LIST first > - original path > > dst side: > - if we got the QEMU_VM_SECTION_DEVICE_LIST, have a check with the devices(name,version) > - otherwise original path Yes, and only send it on new machine types. I think a list could be a good idea; I don't worry too much about the 'paused (postmigrate)' case, because libvirt can spot that the destination failed and then restart the source, however a list would also fix the opposite case; where the destination has an extra device that the source does not have, in most cases I think that doesn't cause a failure at the moment but the device has an unusual state. A QEMU_VM_SECTION_DEVICE_LIST would work, but perhaps a subsection of vmstate_globalstate would work? Dave > > Please correct me. > > Thanks > Zhijian > > > > > > > Amit > > > > > > > > > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 09/30/2016 05:58 PM, Dr. David Alan Gilbert wrote: > * Li Zhijian (lizhijian@cn.fujitsu.com) wrote: >> >> >> On 09/30/2016 02:15 PM, Amit Shah wrote: >>> Hi, >>> >>> On (Thu) 29 Sep 2016 [19:06:32], Li Zhijian wrote: >>>> Priviously, if the source and distination have different devices, source could goto >>>> the status "paused (postmigrate)", and the distination will exit that means no qemu >>>> is alive. >>>> >>>> After this patch, at above case, source can dectect the some error early from distination >>>> and stop the migration, source keep in status "running". >>> >>> How would incoming migrations from previous versions work? >> >> You are right. we need to consider more. >> >> How about that: >> we need to introduce a new section type(e.g: QEMU_VM_SECTION_DEVICE_LIST). >> >> source side: >> - at the beginning of qemu_savevm_state_begin(), send QEMU_VM_SECTION_DEVICE_LIST first >> - original path >> >> dst side: >> - if we got the QEMU_VM_SECTION_DEVICE_LIST, have a check with the devices(name,version) >> - otherwise original path > > Yes, and only send it on new machine types. > I think a list could be a good idea; I don't worry too much about the 'paused (postmigrate)' > case, because libvirt can spot that the destination failed and then restart the source, Oh, that's what I didn't know before. Thank for your information. > however a list would also fix the opposite case; where the destination has an extra device > that the source does not have, in most cases I think that doesn't cause a failure at the moment What's our expected behavior? I think in most cases(where the destination has an extra device that the source does not have), destination seems fine after migration. If we expect migration failure, it needs more check. > but the device has an unusual state. > > A QEMU_VM_SECTION_DEVICE_LIST would work, but perhaps a subsection of vmstate_globalstate > would work? Currently, the vmstate_globalstate was saved/restored at the migration completion stage while I expect the beginning of migration. And i cook the V2, with a new section type called SECTION_HEADER, any comment is welcomed. Thanks > > Dave > >> >> Please correct me. >> >> Thanks >> Zhijian >> >>> >>> >>> Amit >>> >>> >>> >> >> >> > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK > > > . >
* Li Zhijian (lizhijian@cn.fujitsu.com) wrote: > > > On 09/30/2016 05:58 PM, Dr. David Alan Gilbert wrote: > > * Li Zhijian (lizhijian@cn.fujitsu.com) wrote: > > > > > > > > > On 09/30/2016 02:15 PM, Amit Shah wrote: > > > > Hi, > > > > > > > > On (Thu) 29 Sep 2016 [19:06:32], Li Zhijian wrote: > > > > > Priviously, if the source and distination have different devices, source could goto > > > > > the status "paused (postmigrate)", and the distination will exit that means no qemu > > > > > is alive. > > > > > > > > > > After this patch, at above case, source can dectect the some error early from distination > > > > > and stop the migration, source keep in status "running". > > > > > > > > How would incoming migrations from previous versions work? > > > > > > You are right. we need to consider more. > > > > > > How about that: > > > we need to introduce a new section type(e.g: QEMU_VM_SECTION_DEVICE_LIST). > > > > > > source side: > > > - at the beginning of qemu_savevm_state_begin(), send QEMU_VM_SECTION_DEVICE_LIST first > > > - original path > > > > > > dst side: > > > - if we got the QEMU_VM_SECTION_DEVICE_LIST, have a check with the devices(name,version) > > > - otherwise original path > > > > Yes, and only send it on new machine types. > > I think a list could be a good idea; I don't worry too much about the 'paused (postmigrate)' > > case, because libvirt can spot that the destination failed and then restart the source, > Oh, that's what I didn't know before. Thank for your information. > > > > however a list would also fix the opposite case; where the destination has an extra device > > that the source does not have, in most cases I think that doesn't cause a failure at the moment > What's our expected behavior? I think in most cases(where the destination has an extra device > that the source does not have), destination seems fine after migration. > > If we expect migration failure, it needs more check. I think it's a check we could add that would detect more misconfigurations. (you'd have to be careful about devices that just didn't need to send any migration state; and make sure it didn't break existing migrations). > > but the device has an unusual state. > > > > A QEMU_VM_SECTION_DEVICE_LIST would work, but perhaps a subsection of vmstate_globalstate > > would work? > > Currently, the vmstate_globalstate was saved/restored at the migration completion stage while I expect > the beginning of migration. Ah sorry, I was confused; it's vmstate_configuration (see migration/savevm.c) that's saved at the beginning. Dave > And i cook the V2, with a new section type called SECTION_HEADER, any comment is welcomed. > > Thanks > > > > > Dave > > > > > > > > Please correct me. > > > > > > Thanks > > > Zhijian > > > > > > > > > > > > > > > Amit > > > > > > > > > > > > > > > > > > > > > > > -- > > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK > > > > > > . > > > > -- > Best regards. > Li Zhijian (8555) > > -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
diff --git a/migration/savevm.c b/migration/savevm.c index 09d8e99..0c07671 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -887,6 +887,7 @@ void qemu_savevm_state_begin(QEMUFile *f, trace_savevm_state_begin(); QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + save_section_header(f, se, QEMU_VM_SECTION_START); if (!se->ops || !se->ops->set_params) { continue; } @@ -902,7 +903,7 @@ void qemu_savevm_state_begin(QEMUFile *f, continue; } } - save_section_header(f, se, QEMU_VM_SECTION_START); + save_section_header(f, se, QEMU_VM_SECTION_FULL); ret = se->ops->save_live_setup(f, se->opaque); save_section_footer(f, se); @@ -1775,7 +1776,7 @@ qemu_loadvm_section_start(QEMUFile *f, MigrationIncomingState *mis, } static int -qemu_loadvm_section_full(QEMUFile *f, MigrationIncomingState *mis) +qemu_loadvm_section_start_full(QEMUFile *f, MigrationIncomingState *mis) { uint32_t version_id, section_id; SaveStateEntry *se; @@ -1846,14 +1847,21 @@ static int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis) { uint8_t section_type; int ret; + SaveStateEntry *se; + uint32_t instance, version; while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) { trace_qemu_loadvm_state_section(section_type); switch (section_type) { case QEMU_VM_SECTION_START: + ret = qemu_loadvm_section_start(f, mis, &se, &instance, &version); + if (ret < 0) { + return ret; + } + break; case QEMU_VM_SECTION_FULL: - ret = qemu_loadvm_section_full(f, mis); + ret = qemu_loadvm_section_start_full(f, mis); if (ret < 0) { return ret; }
Priviously, if the source and distination have different devices, source could goto the status "paused (postmigrate)", and the distination will exit that means no qemu is alive. After this patch, at above case, source can dectect the some error early from distination and stop the migration, source keep in status "running". Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> --- migration/savevm.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)