Message ID | 1534243693-9560-1-git-send-email-jianjay.zhou@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] vl: fix migration when watchdog expires | expand |
On 14/08/2018 12:48, Jay Zhou wrote: > I got the following error when migrating a VM with watchdog > device: > > {"timestamp": {"seconds": 1533884471, "microseconds": 668099}, > "event": "WATCHDOG", "data": {"action": "reset"}} > {"timestamp": {"seconds": 1533884471, "microseconds": 677658}, > "event": "RESET", "data": {"guest": true}} > {"timestamp": {"seconds": 1533884471, "microseconds": 677874}, > "event": "STOP"} > qemu-system-x86_64: invalid runstate transition: 'prelaunch' -> 'postmigrate' > Aborted > > The run state transition is RUN_STATE_FINISH_MIGRATE to RUN_STATE_PRELAUNCH, > then the migration thread aborted when it tries to set RUN_STATE_POSTMIGRATE. > There is a race between the main loop thread and the migration thread I think. In that case I think you shouldn't go to POSTMIGRATE at all, because the VM has been reset. Alternatively, when the watchdog fires in RUN_STATE_FINISH_MIGRATE state, it might delay the action until after the "cont" command is invoked on the source, but I'm not sure what's the best way to achieve that... Paolo
* Paolo Bonzini (pbonzini@redhat.com) wrote: > On 14/08/2018 12:48, Jay Zhou wrote: > > I got the following error when migrating a VM with watchdog > > device: > > > > {"timestamp": {"seconds": 1533884471, "microseconds": 668099}, > > "event": "WATCHDOG", "data": {"action": "reset"}} > > {"timestamp": {"seconds": 1533884471, "microseconds": 677658}, > > "event": "RESET", "data": {"guest": true}} > > {"timestamp": {"seconds": 1533884471, "microseconds": 677874}, > > "event": "STOP"} > > qemu-system-x86_64: invalid runstate transition: 'prelaunch' -> 'postmigrate' > > Aborted > > > > The run state transition is RUN_STATE_FINISH_MIGRATE to RUN_STATE_PRELAUNCH, > > then the migration thread aborted when it tries to set RUN_STATE_POSTMIGRATE. > > There is a race between the main loop thread and the migration thread I think. > > In that case I think you shouldn't go to POSTMIGRATE at all, because the > VM has been reset. Migration has the VM stopped; it's not expecting the state to change at that point. > Alternatively, when the watchdog fires in RUN_STATE_FINISH_MIGRATE > state, it might delay the action until after the "cont" command is > invoked on the source, but I'm not sure what's the best way to achieve > that... Jay: Which watchdog were you using? a) Should the watchdog expire when the VM is stopped; I think it shouldn't - hw/acpi/tco.c uses a virtual timer as does i6300esb; so is the bug here that the watchdog being used didn't use a virtual timer? b) If the watchdog expires just before the VM gets stopped, is there a race which could hit this? Possibly. c) Could main_loop_should_exit guard all the 'request's by something that checks whether the VM is stopped? Dave > Paolo -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 14/08/2018 13:52, Dr. David Alan Gilbert wrote: > a) Should the watchdog expire when the VM is stopped; I think it > shouldn't - hw/acpi/tco.c uses a virtual timer as does i6300esb; so > is the bug here that the watchdog being used didn't use a virtual > timer? All watchdogs do. > b) If the watchdog expires just before the VM gets stopped, is there > a race which could hit this? Possibly. Yes, I think it is a race that happens just before vm_stop, but I don't understand why the "qemu_clock_enable" in pause_all_vcpus does not prevent it. It should be possible to write a deterministic testcase with qtest... Paolo
> -----Original Message----- > From: Dr. David Alan Gilbert [mailto:dgilbert@redhat.com] > Sent: Tuesday, August 14, 2018 7:52 PM > To: Paolo Bonzini <pbonzini@redhat.com> > Cc: Zhoujian (jay) <jianjay.zhou@huawei.com>; qemu-devel@nongnu.org; > quintela@redhat.com; wangxin (U) <wangxinxin.wang@huawei.com> > Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires > > * Paolo Bonzini (pbonzini@redhat.com) wrote: > > On 14/08/2018 12:48, Jay Zhou wrote: > > > I got the following error when migrating a VM with watchdog > > > device: > > > > > > {"timestamp": {"seconds": 1533884471, "microseconds": 668099}, > > > "event": "WATCHDOG", "data": {"action": "reset"}} > > > {"timestamp": {"seconds": 1533884471, "microseconds": 677658}, > > > "event": "RESET", "data": {"guest": true}} > > > {"timestamp": {"seconds": 1533884471, "microseconds": 677874}, > > > "event": "STOP"} > > > qemu-system-x86_64: invalid runstate transition: 'prelaunch' -> > 'postmigrate' > > > Aborted > > > > > > The run state transition is RUN_STATE_FINISH_MIGRATE to > > > RUN_STATE_PRELAUNCH, then the migration thread aborted when it tries to > set RUN_STATE_POSTMIGRATE. > > > There is a race between the main loop thread and the migration thread I > think. > > > > In that case I think you shouldn't go to POSTMIGRATE at all, because > > the VM has been reset. > > Migration has the VM stopped; it's not expecting the state to change at that > point. > > > Alternatively, when the watchdog fires in RUN_STATE_FINISH_MIGRATE > > state, it might delay the action until after the "cont" command is > > invoked on the source, but I'm not sure what's the best way to achieve > > that... > > Jay: Which watchdog were you using? Hi Dave, it is i6300esb, which uses QEMU_CLOCK_VIRTUAL. > > a) Should the watchdog expire when the VM is stopped; I think it shouldn't - > hw/acpi/tco.c uses a virtual timer as does i6300esb; so is the bug here that > the watchdog being used didn't use a virtual timer? > > b) If the watchdog expires just before the VM gets stopped, is there a race > which could hit this? Possibly. This is the case I met I think. Regards, Jay Zhou > > c) Could main_loop_should_exit guard all the 'request's by something that > checks whether the VM is stopped? > > Dave > > > > Paolo > -- > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
> -----Original Message----- > From: Paolo Bonzini [mailto:pbonzini@redhat.com] > Sent: Tuesday, August 14, 2018 8:02 PM > To: Dr. David Alan Gilbert <dgilbert@redhat.com> > Cc: Zhoujian (jay) <jianjay.zhou@huawei.com>; qemu-devel@nongnu.org; > quintela@redhat.com; wangxin (U) <wangxinxin.wang@huawei.com> > Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires > > On 14/08/2018 13:52, Dr. David Alan Gilbert wrote: > > a) Should the watchdog expire when the VM is stopped; I think it > > shouldn't - hw/acpi/tco.c uses a virtual timer as does i6300esb; so is > > the bug here that the watchdog being used didn't use a virtual timer? > > All watchdogs do. > > > b) If the watchdog expires just before the VM gets stopped, is there > > a race which could hit this? Possibly. > > Yes, I think it is a race that happens just before vm_stop, but I don't > understand why the "qemu_clock_enable" in pause_all_vcpus does not prevent it. Hi Paolo, The sequence is like this I think | | <----- watchdog expired, which set reset_requested to SHUTDOWN_CAUSE_GUEST_RESET | | <----- migration thread sets to RUN_STATE_FINISH_MIGRATE, it will disable QEMU_CLOCK_VIRTUAL clock, | but it is done after the setting of reset_requested | | <----- main loop thread sets to RUN_STATE_PRELAUNCH since it detected a reset request | | <----- migration thread sets to RUN_STATE_POSTMIGRATE Regards, Jay Zhou > > It should be possible to write a deterministic testcase with qtest... > > Paolo
On 14/08/2018 15:03, Zhoujian (jay) wrote: >> -----Original Message----- >> From: Paolo Bonzini [mailto:pbonzini@redhat.com] >> Sent: Tuesday, August 14, 2018 8:02 PM >> To: Dr. David Alan Gilbert <dgilbert@redhat.com> >> Cc: Zhoujian (jay) <jianjay.zhou@huawei.com>; qemu-devel@nongnu.org; >> quintela@redhat.com; wangxin (U) <wangxinxin.wang@huawei.com> >> Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires >> >> On 14/08/2018 13:52, Dr. David Alan Gilbert wrote: >>> a) Should the watchdog expire when the VM is stopped; I think it >>> shouldn't - hw/acpi/tco.c uses a virtual timer as does i6300esb; so is >>> the bug here that the watchdog being used didn't use a virtual timer? >> >> All watchdogs do. >> >>> b) If the watchdog expires just before the VM gets stopped, is there >>> a race which could hit this? Possibly. >> >> Yes, I think it is a race that happens just before vm_stop, but I don't >> understand why the "qemu_clock_enable" in pause_all_vcpus does not prevent it. > > Hi Paolo, > The sequence is like this I think > > | > | <----- watchdog expired, which set reset_requested to SHUTDOWN_CAUSE_GUEST_RESET > | > | <----- migration thread sets to RUN_STATE_FINISH_MIGRATE, it will disable QEMU_CLOCK_VIRTUAL clock, > | but it is done after the setting of reset_requested So the fix would be to process the reset request here? (In do_vm_stop or pause_all_vcpus). The code is currently in main_loop_should_exit(). Paolo > | <----- main loop thread sets to RUN_STATE_PRELAUNCH since it detected a reset request > | > | <----- migration thread sets to RUN_STATE_POSTMIGRATE > > > Regards, > Jay Zhou > >> >> It should be possible to write a deterministic testcase with qtest... >> >> Paolo
> -----Original Message----- > From: Paolo Bonzini [mailto:pbonzini@redhat.com] > Sent: Tuesday, August 14, 2018 9:07 PM > To: Zhoujian (jay) <jianjay.zhou@huawei.com>; Dr. David Alan Gilbert > <dgilbert@redhat.com> > Cc: qemu-devel@nongnu.org; quintela@redhat.com; wangxin (U) > <wangxinxin.wang@huawei.com> > Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires > > On 14/08/2018 15:03, Zhoujian (jay) wrote: > >> -----Original Message----- > >> From: Paolo Bonzini [mailto:pbonzini@redhat.com] > >> Sent: Tuesday, August 14, 2018 8:02 PM > >> To: Dr. David Alan Gilbert <dgilbert@redhat.com> > >> Cc: Zhoujian (jay) <jianjay.zhou@huawei.com>; qemu-devel@nongnu.org; > >> quintela@redhat.com; wangxin (U) <wangxinxin.wang@huawei.com> > >> Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires > >> > >> On 14/08/2018 13:52, Dr. David Alan Gilbert wrote: > >>> a) Should the watchdog expire when the VM is stopped; I think it > >>> shouldn't - hw/acpi/tco.c uses a virtual timer as does i6300esb; so > >>> is the bug here that the watchdog being used didn't use a virtual timer? > >> > >> All watchdogs do. > >> > >>> b) If the watchdog expires just before the VM gets stopped, is > >>> there a race which could hit this? Possibly. > >> > >> Yes, I think it is a race that happens just before vm_stop, but I > >> don't understand why the "qemu_clock_enable" in pause_all_vcpus does not > prevent it. > > > > Hi Paolo, > > The sequence is like this I think > > > > | > > | <----- watchdog expired, which set reset_requested to > SHUTDOWN_CAUSE_GUEST_RESET > > | > > | <----- migration thread sets to RUN_STATE_FINISH_MIGRATE, it > will disable QEMU_CLOCK_VIRTUAL clock, > > | but it is done after the setting of reset_requested > > So the fix would be to process the reset request here? (In do_vm_stop or > pause_all_vcpus). The code is currently in main_loop_should_exit(). I think this makes sense, process the reset request after the QEMU_CLOCK_VIRTUAL disabled, will have a try. Regards, Jay Zhou > > Paolo > > > | <----- main loop thread sets to RUN_STATE_PRELAUNCH since it > detected a reset request > > | > > | <----- migration thread sets to RUN_STATE_POSTMIGRATE > > > > > > Regards, > > Jay Zhou > > > >> > >> It should be possible to write a deterministic testcase with qtest... > >> > >> Paolo
> -----Original Message----- > From: Paolo Bonzini [mailto:pbonzini@redhat.com] > Sent: Tuesday, August 14, 2018 9:07 PM > To: Zhoujian (jay) <jianjay.zhou@huawei.com>; Dr. David Alan Gilbert > <dgilbert@redhat.com> > Cc: qemu-devel@nongnu.org; quintela@redhat.com; wangxin (U) > <wangxinxin.wang@huawei.com> > Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires > > On 14/08/2018 15:03, Zhoujian (jay) wrote: > >> -----Original Message----- > >> From: Paolo Bonzini [mailto:pbonzini@redhat.com] > >> Sent: Tuesday, August 14, 2018 8:02 PM > >> To: Dr. David Alan Gilbert <dgilbert@redhat.com> > >> Cc: Zhoujian (jay) <jianjay.zhou@huawei.com>; qemu-devel@nongnu.org; > >> quintela@redhat.com; wangxin (U) <wangxinxin.wang@huawei.com> > >> Subject: Re: [RFC PATCH] vl: fix migration when watchdog expires > >> > >> On 14/08/2018 13:52, Dr. David Alan Gilbert wrote: > >>> a) Should the watchdog expire when the VM is stopped; I think it > >>> shouldn't - hw/acpi/tco.c uses a virtual timer as does i6300esb; so > >>> is the bug here that the watchdog being used didn't use a virtual timer? > >> > >> All watchdogs do. > >> > >>> b) If the watchdog expires just before the VM gets stopped, is > >>> there a race which could hit this? Possibly. > >> > >> Yes, I think it is a race that happens just before vm_stop, but I > >> don't understand why the "qemu_clock_enable" in pause_all_vcpus does not > prevent it. > > > > Hi Paolo, > > The sequence is like this I think > > > > | > > | <----- watchdog expired, which set reset_requested to > SHUTDOWN_CAUSE_GUEST_RESET > > | > > | <----- migration thread sets to RUN_STATE_FINISH_MIGRATE, it > will disable QEMU_CLOCK_VIRTUAL clock, > > | but it is done after the setting of reset_requested > > So the fix would be to process the reset request here? (In do_vm_stop or > pause_all_vcpus). The code is currently in main_loop_should_exit(). After a second thought, I think it should keep the reset request process in main_loop_should_exit(), since pause_all_vcpus(or do_vm_stop) is not in a loop, it can't detect all the reset requests immediately. If processing the reset request both in main_loop_should_exit() and do_vm_stop or pause_all_vcpus will lead to a race of referencing to the global variable 'reset_requested'. Could we add the check !runstate_check(RUN_STATE_FINISH_MIGRATE) before setting to RUN_STATE_PRELAUNCH, just like !runstate_check(RUN_STATE_RUNNING) and !runstate_check(RUN_STATE_INMIGRATE) did? But I'm not sure whether this will cause any side effect. Regards, Jay Zhou > > Paolo > > > | <----- main loop thread sets to RUN_STATE_PRELAUNCH since it > detected a reset request > > | > > | <----- migration thread sets to RUN_STATE_POSTMIGRATE > > > > > > Regards, > > Jay Zhou > > > >> > >> It should be possible to write a deterministic testcase with qtest... > >> > >> Paolo
On 16/08/2018 09:22, Zhoujian (jay) wrote: > Could we add the check !runstate_check(RUN_STATE_FINISH_MIGRATE) before setting > to RUN_STATE_PRELAUNCH, just like !runstate_check(RUN_STATE_RUNNING) and > !runstate_check(RUN_STATE_INMIGRATE) did? But I'm not sure whether this will > cause any side effect. I think there is a bigger problem to fix before. global_state_store sees RUNNING, not PRELAUNCH; the VM is migrated with state that is already reset, but the state on the destination is RUNNING and the VM restarts running on the destination. Likewise, if migration fails the VM is resumed on the source because s->vm_was_running is true. David, Juan, do you have any ideas here? Paolo
* Paolo Bonzini (pbonzini@redhat.com) wrote: > On 16/08/2018 09:22, Zhoujian (jay) wrote: > > Could we add the check !runstate_check(RUN_STATE_FINISH_MIGRATE) before setting > > to RUN_STATE_PRELAUNCH, just like !runstate_check(RUN_STATE_RUNNING) and > > !runstate_check(RUN_STATE_INMIGRATE) did? But I'm not sure whether this will > > cause any side effect. > > I think there is a bigger problem to fix before. global_state_store > sees RUNNING, not PRELAUNCH; the VM is migrated with state that is > already reset, but the state on the destination is RUNNING and the VM > restarts running on the destination. Likewise, if migration fails the > VM is resumed on the source because s->vm_was_running is true. > > David, Juan, do you have any ideas here? Would it help if we didn't process the reset-request at all while the guest was paused; so the guest state didn't get reset? I think you'd have to then migrate the 'reset-requested' flag to cause it to happen on the destination. Dave > Paolo -- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
diff --git a/vl.c b/vl.c index 16b913f..298ce91 100644 --- a/vl.c +++ b/vl.c @@ -637,6 +637,7 @@ static const RunStateTransition runstate_transitions_def[] = { { RUN_STATE_PRELAUNCH, RUN_STATE_RUNNING }, { RUN_STATE_PRELAUNCH, RUN_STATE_FINISH_MIGRATE }, { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE }, + { RUN_STATE_PRELAUNCH, RUN_STATE_POSTMIGRATE }, { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING }, { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PAUSED },
I got the following error when migrating a VM with watchdog device: {"timestamp": {"seconds": 1533884471, "microseconds": 668099}, "event": "WATCHDOG", "data": {"action": "reset"}} {"timestamp": {"seconds": 1533884471, "microseconds": 677658}, "event": "RESET", "data": {"guest": true}} {"timestamp": {"seconds": 1533884471, "microseconds": 677874}, "event": "STOP"} qemu-system-x86_64: invalid runstate transition: 'prelaunch' -> 'postmigrate' Aborted The run state transition is RUN_STATE_FINISH_MIGRATE to RUN_STATE_PRELAUNCH, then the migration thread aborted when it tries to set RUN_STATE_POSTMIGRATE. There is a race between the main loop thread and the migration thread I think. Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com> --- vl.c | 1 + 1 file changed, 1 insertion(+)