From patchwork Wed May 20 20:42:23 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lukas Straub
X-Patchwork-Id: 11561567
Date: Wed, 20 May 2020 22:42:23 +0200
From: Lukas Straub
To: qemu-devel
Cc: Hailiang Zhang, "Dr. David Alan Gilbert", Juan Quintela
Subject: [PATCH v2 4/6] migration/colo.c: Relaunch failover even if there was an error
Message-ID: <85df331bfe69661072d9f29b616f065ca261f471.1590007004.git.lukasstraub2@web.de>

If vmstate_loading is true, secondary_vm_do_failover will set failover
status to FAILOVER_STATUS_RELAUNCH and return success without initiating
failover. However, if there is an error during the vmstate_loading
section, failover isn't relaunched. Instead we then wait for
failover on colo_incoming_sem.

Fix this by relaunching failover even if there was an error. Also,
to make this work properly, set vmstate_loading to false when
returning during the vmstate_loading section.

Signed-off-by: Lukas Straub
Reviewed-by: zhanghailiang
---
 migration/colo.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

--
2.20.1

diff --git a/migration/colo.c b/migration/colo.c
index 2947363ae5..a69782efc5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -743,6 +743,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     ret = qemu_load_device_state(fb);
     if (ret < 0) {
         error_setg(errp, "COLO: load device state failed");
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
     }
@@ -751,6 +752,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     replication_get_error_all(&local_err);
     if (local_err) {
         error_propagate(errp, local_err);
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
     }
@@ -759,6 +761,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     replication_do_checkpoint_all(&local_err);
     if (local_err) {
         error_propagate(errp, local_err);
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
     }
@@ -770,6 +773,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
 
     if (local_err) {
         error_propagate(errp, local_err);
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
     }
@@ -780,9 +784,6 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     qemu_mutex_unlock_iothread();
 
     if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
-        failover_set_state(FAILOVER_STATUS_RELAUNCH,
-                           FAILOVER_STATUS_NONE);
-        failover_request_active(NULL);
         return;
     }
 
@@ -881,6 +882,14 @@ void *colo_process_incoming_thread(void *opaque)
             error_report_err(local_err);
             break;
         }
+
+        if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
+            failover_set_state(FAILOVER_STATUS_RELAUNCH,
+                               FAILOVER_STATUS_NONE);
+            failover_request_active(NULL);
+            break;
+        }
+
         if (failover_get_state() != FAILOVER_STATUS_NONE) {
             error_report("failover request");
             break;
@@ -888,8 +897,6 @@ void *colo_process_incoming_thread(void *opaque)
     }
 
 out:
-    vmstate_loading = false;
-
     /*
      * There are only two reasons we can get here, some error happened
      * or the user triggered failover.
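
For readers who want to follow the control flow outside of QEMU, here is a
small standalone model of the deferral logic the commit message describes.
It is only a sketch under stated assumptions: ColoModel, ModelState and the
model_* functions are hypothetical names invented for illustration, not QEMU
APIs, and locking, the semaphore and the real migration state are left out.
The clear_on_error parameter switches between the patched behavior
(vmstate_loading reset before an early return) and the previous one.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the failover states named in the patch. */
typedef enum { MODEL_NONE, MODEL_RELAUNCH, MODEL_ACTIVE } ModelState;

typedef struct {
    ModelState state;        /* models the failover status */
    bool vmstate_loading;    /* models the flag of the same name */
    int failovers_started;   /* counts failovers that actually ran */
} ColoModel;

/* Models a failover request: deferred while vmstate_loading is set. */
static void model_request_failover(ColoModel *m)
{
    assert(m);
    if (m->vmstate_loading) {
        m->state = MODEL_RELAUNCH;   /* remember the request, do not run it */
        return;
    }
    m->state = MODEL_ACTIVE;
    m->failovers_started++;
}

/*
 * Models one checkpoint. Returns false on a simulated load error.
 * clear_on_error == true resets the flag before the early return (as the
 * patch does); false leaves it set (the behavior before the patch).
 */
static bool model_checkpoint(ColoModel *m, bool request_during_load,
                             bool fail, bool clear_on_error)
{
    m->vmstate_loading = true;
    if (request_during_load) {
        model_request_failover(m);
    }
    if (fail) {
        if (clear_on_error) {
            m->vmstate_loading = false;
        }
        return false;
    }
    m->vmstate_loading = false;
    return true;
}

/*
 * Models one iteration of the incoming loop, in the same order as the
 * diff: error check first, then the RELAUNCH check. Returns true when
 * the loop would break.
 */
static bool model_loop_step(ColoModel *m, bool request_during_load,
                            bool fail, bool clear_on_error)
{
    if (!model_checkpoint(m, request_during_load, fail, clear_on_error)) {
        return true;                 /* error: leave the loop */
    }
    if (m->state == MODEL_RELAUNCH) {
        m->state = MODEL_NONE;       /* consume the deferred request */
        model_request_failover(m);   /* flag is clear now, so this runs */
        return true;
    }
    return m->state != MODEL_NONE;
}
```

In this model, a request that arrives after a failed checkpoint runs
immediately when clear_on_error is true, and stays parked in MODEL_RELAUNCH
when it is false, which is the situation the patch is meant to avoid.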