Message ID | 20250213175927.19642-7-farosas@suse.de (mailing list archive) |
---|---|
State | New |
Series | migration: Fix issues during qmp_migrate_cancel |
On Thu, Feb 13, 2025 at 02:59:24PM -0300, Fabiano Rosas wrote:
> The expected outcome from qmp_migrate_cancel() is that the source
> migration goes to the terminal state
> MIGRATION_STATUS_CANCELLED. Anything different from this is a bug when
> cancelling.
>
> Make sure there is never a state transition from an unspecified state
> into FAILED. Code that sets FAILED, should always either make sure
> that the old state is not CANCELLING or specify the old state.
>
> Note that the destination is allowed to go into FAILED, so there's no
> issue there.
>
> (I don't think this is relevant as a backport because cancelling does
> work, it just doesn't show the right state at the end)
>
> Fixes: 3dde8fdbad ("migration: Merge precopy/postcopy on switchover start")
> Fixes: d0edb8a173 ("migration: Create the postcopy preempt channel asynchronously")
> Fixes: 8518278a6a ("migration: implementation of background snapshot thread")
> Fixes: bf78a046b9 ("migration: refactor migrate_fd_connect failures")
> Signed-off-by: Fabiano Rosas <farosas@suse.de>

Not like migrate_set_state_failure(MigrationState *s)? Not a huge deal,
though..

Reviewed-by: Peter Xu <peterx@redhat.com>
Peter Xu <peterx@redhat.com> writes:

> On Thu, Feb 13, 2025 at 02:59:24PM -0300, Fabiano Rosas wrote:
>> The expected outcome from qmp_migrate_cancel() is that the source
>> migration goes to the terminal state
>> MIGRATION_STATUS_CANCELLED. Anything different from this is a bug when
>> cancelling.
>>
>> Make sure there is never a state transition from an unspecified state
>> into FAILED. Code that sets FAILED, should always either make sure
>> that the old state is not CANCELLING or specify the old state.
>>
>> Note that the destination is allowed to go into FAILED, so there's no
>> issue there.
>>
>> (I don't think this is relevant as a backport because cancelling does
>> work, it just doesn't show the right state at the end)
>>
>> Fixes: 3dde8fdbad ("migration: Merge precopy/postcopy on switchover start")
>> Fixes: d0edb8a173 ("migration: Create the postcopy preempt channel asynchronously")
>> Fixes: 8518278a6a ("migration: implementation of background snapshot thread")
>> Fixes: bf78a046b9 ("migration: refactor migrate_fd_connect failures")
>> Signed-off-by: Fabiano Rosas <farosas@suse.de>
>
> Not like migrate_set_state_failure(MigrationState *s)? Not a huge deal,
> though..

I thought we had agreed over IRC that it was best to hold that until the
other MigrationStatus work happens?

Anyway, looking closer at this, there are places that handle CANCELLING
beforehand (_detect_error) and places that only set FAILED after
specific states (multifd), so a single helper will require more
churn. Let's postpone that please.

>
> Reviewed-by: Peter Xu <peterx@redhat.com>
On Fri, Feb 14, 2025 at 09:25:12AM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > On Thu, Feb 13, 2025 at 02:59:24PM -0300, Fabiano Rosas wrote:
> >> The expected outcome from qmp_migrate_cancel() is that the source
> >> migration goes to the terminal state
> >> MIGRATION_STATUS_CANCELLED. Anything different from this is a bug when
> >> cancelling.
> >>
> >> Make sure there is never a state transition from an unspecified state
> >> into FAILED. Code that sets FAILED, should always either make sure
> >> that the old state is not CANCELLING or specify the old state.
> >>
> >> Note that the destination is allowed to go into FAILED, so there's no
> >> issue there.
> >>
> >> (I don't think this is relevant as a backport because cancelling does
> >> work, it just doesn't show the right state at the end)
> >>
> >> Fixes: 3dde8fdbad ("migration: Merge precopy/postcopy on switchover start")
> >> Fixes: d0edb8a173 ("migration: Create the postcopy preempt channel asynchronously")
> >> Fixes: 8518278a6a ("migration: implementation of background snapshot thread")
> >> Fixes: bf78a046b9 ("migration: refactor migrate_fd_connect failures")
> >> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> >
> > Not like migrate_set_state_failure(MigrationState *s)? Not a huge deal,
> > though..
>
> I thought we had agreed over IRC that it was best to hold that until the
> other MigrationStatus work happens?

If we touched this anyway, IMHO no hurt to add a helper too.
migrate_set_state_failure() can then be renamed to migrate_set_failure(),
take a Error* instead so it might help that effort too.

>
> Anyway, looking closer at this, there are places that handle CANCELLING
> beforehand (_detect_error) and places that only set FAILED after
> specific states (multifd), so a single helper will require more
> churn. Let's postpone that please.

Sure. Let's go ahead with this.

>
> >
> > Reviewed-by: Peter Xu <peterx@redhat.com>
>
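
For readers following the helper discussion, here is a rough, standalone sketch of what a migrate_set_failure()-style helper might look like. It is not part of the series and not QEMU code: `sketch_status`, `sketch_migration` and `sketch_set_failure()` are hypothetical names, a plain string stands in for `Error *`, the "keep the first error" behaviour is an assumption, and atomicity is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; none of these are QEMU definitions. */
enum sketch_status {
    SKETCH_STATUS_ACTIVE,
    SKETCH_STATUS_CANCELLING,
    SKETCH_STATUS_FAILED,
};

struct sketch_migration {
    enum sketch_status state;
    const char *error;          /* stand-in for an Error * */
};

/*
 * Hypothetical helper: record the error, then move to FAILED unless a
 * cancel is in flight. Returns true if the state was changed.
 */
static bool sketch_set_failure(struct sketch_migration *s, const char *err)
{
    assert(s != NULL);

    if (err != NULL && s->error == NULL) {
        s->error = err;         /* assumption: the first error wins */
    }
    if (s->state == SKETCH_STATUS_CANCELLING) {
        return false;           /* leave it to the cancel path */
    }
    s->state = SKETCH_STATUS_FAILED;
    return true;
}
```

A helper of this shape only covers call sites that may fail from any non-CANCELLING state. Call sites that set FAILED only after specific states, as mentioned above for multifd, would still need their own handling, which is the extra churn the thread refers to.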
diff --git a/migration/migration.c b/migration/migration.c
index 48c9ad3c96..c597aa707e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2648,7 +2648,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
     if (migrate_postcopy_preempt()) {
         migration_wait_main_channel(ms);
         if (postcopy_preempt_establish_channel(ms)) {
-            migrate_set_state(&ms->state, ms->state, MIGRATION_STATUS_FAILED);
+            if (ms->state != MIGRATION_STATUS_CANCELLING) {
+                migrate_set_state(&ms->state, ms->state,
+                                  MIGRATION_STATUS_FAILED);
+            }
             error_setg(errp, "%s: Failed to establish preempt channel",
                        __func__);
             return -1;
@@ -2986,7 +2989,9 @@ fail:
         error_free(local_err);
     }
 
-    migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+    if (s->state != MIGRATION_STATUS_CANCELLING) {
+        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+    }
 }
 
 /**
@@ -3009,7 +3014,7 @@ static void bg_migration_completion(MigrationState *s)
         qemu_put_buffer(s->to_dst_file, s->bioc->data, s->bioc->usage);
         qemu_fflush(s->to_dst_file);
     } else if (s->state == MIGRATION_STATUS_CANCELLING) {
-        goto fail;
+        return;
     }
 
     if (qemu_file_get_error(s->to_dst_file)) {
@@ -3953,7 +3958,9 @@ void migration_connect(MigrationState *s, Error *error_in)
 
 fail:
     migrate_set_error(s, local_err);
-    migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+    if (s->state != MIGRATION_STATUS_CANCELLING) {
+        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+    }
     error_report_err(local_err);
     migration_cleanup(s);
 }
The expected outcome from qmp_migrate_cancel() is that the source
migration goes to the terminal state
MIGRATION_STATUS_CANCELLED. Anything different from this is a bug when
cancelling.

Make sure there is never a state transition from an unspecified state
into FAILED. Code that sets FAILED, should always either make sure
that the old state is not CANCELLING or specify the old state.

Note that the destination is allowed to go into FAILED, so there's no
issue there.

(I don't think this is relevant as a backport because cancelling does
work, it just doesn't show the right state at the end)

Fixes: 3dde8fdbad ("migration: Merge precopy/postcopy on switchover start")
Fixes: d0edb8a173 ("migration: Create the postcopy preempt channel asynchronously")
Fixes: 8518278a6a ("migration: implementation of background snapshot thread")
Fixes: bf78a046b9 ("migration: refactor migrate_fd_connect failures")
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)