From patchwork Mon Feb 22 02:52:22 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wen Congyang
X-Patchwork-Id: 8370631
From: Wen Congyang
To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Yang Hongyang
Date: Mon, 22 Feb 2016 10:52:22 +0800
Message-ID: <1456109555-28299-19-git-send-email-wency@cn.fujitsu.com>
In-Reply-To: <1456109555-28299-1-git-send-email-wency@cn.fujitsu.com>
References: <1456109555-28299-1-git-send-email-wency@cn.fujitsu.com>
X-Mailer: git-send-email 2.5.0
Subject: [Xen-devel] [PATCH v10 18/31] libxc/restore: support COLO restore
List-Id: Xen developer discussion

Call the resume/checkpoint/suspend callbacks while the secondary VM's
state is consistent with the primary's.

Signed-off-by: Yang Hongyang
Signed-off-by: Wen Congyang
CC: Andrew Cooper
---
 tools/libxc/xc_sr_common.h  |  4 +--
 tools/libxc/xc_sr_restore.c | 60 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h
index 2bfed64..a24a9ad 100644
--- a/tools/libxc/xc_sr_common.h
+++ b/tools/libxc/xc_sr_common.h
@@ -234,13 +234,13 @@ struct xc_sr_context
             uint32_t guest_page_size;
 
             /* Plain VM, or checkpoints over time. */
-            bool checkpointed;
+            int checkpointed;
 
             /* Currently buffering records between a checkpoint */
             bool buffer_all_records;
 
 /*
- * With Remus, we buffer the records sent by the primary at checkpoint,
+ * With Remus/COLO, we buffer the records sent by the primary at checkpoint,
  * in case the primary will fail, we can recover from the last
  * checkpoint state.
  * This should be enough for most of the cases because primary only send
diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c
index aef9bca..2ae8154 100644
--- a/tools/libxc/xc_sr_restore.c
+++ b/tools/libxc/xc_sr_restore.c
@@ -460,6 +460,49 @@ static int handle_checkpoint(struct xc_sr_context *ctx)
     else
         ctx->restore.buffer_all_records = true;
 
+    if ( ctx->restore.checkpointed == MIG_STREAM_COLO )
+    {
+#define HANDLE_CALLBACK_RETURN_VALUE(ret)                   \
+    do {                                                    \
+        if ( ret == 1 )                                     \
+            rc = 0; /* Success */                           \
+        else                                                \
+        {                                                   \
+            if ( ret == 2 )                                 \
+                rc = BROKEN_CHANNEL;                        \
+            else                                            \
+                rc = -1; /* Some unspecified error */       \
+            goto err;                                       \
+        }                                                   \
+    } while (0)
+
+        /* COLO */
+
+        /* We need to resume guest */
+        rc = ctx->restore.ops.stream_complete(ctx);
+        if ( rc )
+            goto err;
+
+        /* TODO: call restore_results */
+
+        /* Resume secondary vm */
+        ret = ctx->restore.callbacks->postcopy(ctx->restore.callbacks->data);
+        HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+        /* Wait for a new checkpoint */
+        ret = ctx->restore.callbacks->wait_checkpoint(
+            ctx->restore.callbacks->data);
+        HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+        /* suspend secondary vm */
+        ret = ctx->restore.callbacks->suspend(ctx->restore.callbacks->data);
+        HANDLE_CALLBACK_RETURN_VALUE(ret);
+
+#undef HANDLE_CALLBACK_RETURN_VALUE
+
+        /* TODO: send dirty pfn list to primary */
+    }
+
  err:
     return rc;
 }
@@ -631,6 +674,15 @@ static int restore(struct xc_sr_context *ctx)
     } while ( rec.type != REC_TYPE_END );
 
  remus_failover:
+
+    if ( ctx->restore.checkpointed == MIG_STREAM_COLO )
+    {
+        /* With COLO, we have already called stream_complete */
+        rc = 0;
+        IPRINTF("COLO Failover");
+        goto done;
+    }
+
     /*
      * With Remus, if we reach here, there must be some error on primary,
      * failover from the last checkpoint state.
@@ -685,6 +737,14 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom,
     if ( checkpointed_stream )
         assert(callbacks->checkpoint);
 
+    if ( ctx.restore.checkpointed == MIG_STREAM_COLO )
+    {
+        /* this is COLO restore */
+        assert(callbacks->suspend &&
+               callbacks->postcopy &&
+               callbacks->wait_checkpoint);
+    }
+
     DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d"
             ", checkpointed_stream %d",
             io_fd, dom, hvm, pae, superpages, checkpointed_stream);
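
Note for readers: the return-value convention used by the COLO branch of handle_checkpoint() above can be sketched in isolation as follows. This is only an illustration under stated assumptions, not the libxc code. Every sketch_* name is hypothetical, the struct is a stand-in for the real restore callbacks, and SKETCH_BROKEN_CHANNEL is given an assumed value because the patch uses BROKEN_CHANNEL without showing its definition. The stream_complete() call that precedes the callbacks in the patch is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value, for illustration only; the patch does not show the
 * definition of BROKEN_CHANNEL. */
#define SKETCH_BROKEN_CHANNEL 2

/* Hypothetical stand-in for the three callbacks the COLO path requires. */
struct sketch_callbacks {
    int (*postcopy)(void *data);        /* resume the secondary VM */
    int (*wait_checkpoint)(void *data); /* block until a new checkpoint */
    int (*suspend)(void *data);         /* suspend the secondary VM */
    void *data;
};

/* Same mapping as HANDLE_CALLBACK_RETURN_VALUE in the patch:
 * 1 means success, 2 means broken channel, anything else is an error. */
static int sketch_map_callback_ret(int ret)
{
    if (ret == 1)
        return 0;
    if (ret == 2)
        return SKETCH_BROKEN_CHANNEL;
    return -1;
}

/* Run the callbacks in the order the patch uses, stopping at the first
 * one that does not report success. */
static int sketch_colo_checkpoint(const struct sketch_callbacks *cb)
{
    int rc;

    /* Mirrors the assert() the patch adds to xc_domain_restore(). */
    assert(cb && cb->postcopy && cb->wait_checkpoint && cb->suspend);

    rc = sketch_map_callback_ret(cb->postcopy(cb->data));
    if (rc)
        return rc;

    rc = sketch_map_callback_ret(cb->wait_checkpoint(cb->data));
    if (rc)
        return rc;

    return sketch_map_callback_ret(cb->suspend(cb->data));
}

/* Trivial callbacks for experimenting with the sketch. */
static int sketch_cb_ok(void *data)     { (void)data; return 1; }
static int sketch_cb_broken(void *data) { (void)data; return 2; }
```

Passing a struct whose three callbacks all return 1 makes sketch_colo_checkpoint() return 0. If wait_checkpoint returns 2, the function returns SKETCH_BROKEN_CHANNEL and suspend is never invoked, which matches the goto err in the macro.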