From patchwork Wed Dec 30 02:37:43 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Congyang X-Patchwork-Id: 7931521 Return-Path: X-Original-To: patchwork-xen-devel@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 49785BEEE5 for ; Wed, 30 Dec 2015 02:40:24 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 5537A201F4 for ; Wed, 30 Dec 2015 02:40:23 +0000 (UTC) Received: from lists.xen.org (lists.xenproject.org [50.57.142.19]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 51F972013D for ; Wed, 30 Dec 2015 02:40:22 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xen.org) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aE6f6-0002wT-Bx; Wed, 30 Dec 2015 02:38:52 +0000 Received: from mail6.bemta14.messagelabs.com ([193.109.254.103]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aE6f3-0002tU-Tm for xen-devel@lists.xen.org; Wed, 30 Dec 2015 02:38:50 +0000 Received: from [193.109.254.147] by server-10.bemta-14.messagelabs.com id 7B/8A-25438-9B343865; Wed, 30 Dec 2015 02:38:49 +0000 X-Env-Sender: wency@cn.fujitsu.com X-Msg-Ref: server-9.tower-27.messagelabs.com!1451443108!13748866!4 X-Originating-IP: [59.151.112.132] X-SpamReason: No, hits=0.0 required=7.0 tests= X-StarScan-Received: X-StarScan-Version: 7.35.1; banners=-,-,- X-VirusChecked: Checked Received: (qmail 30472 invoked from network); 30 Dec 2015 02:38:44 -0000 Received: from cn.fujitsu.com (HELO heian.cn.fujitsu.com) (59.151.112.132) by server-9.tower-27.messagelabs.com with SMTP; 30 Dec 2015 02:38:44 -0000 X-IronPort-AV: E=Sophos;i="5.20,346,1444665600"; d="scan'208";a="2080178" Received: from unknown (HELO cn.fujitsu.com) ([10.167.33.5]) by heian.cn.fujitsu.com with ESMTP; 30 Dec 2015 10:38:29 +0800 Received: from G08CNEXCHPEKD01.g08.fujitsu.local (unknown [10.167.33.80]) by cn.fujitsu.com (Postfix) with ESMTP id 3AD5641887F4; Wed, 30 Dec 2015 10:38:12 +0800 (CST) Received: from G08FNSTD140052.g08.fujitsu.local (10.167.226.52) by G08CNEXCHPEKD01.g08.fujitsu.local (10.167.33.89) with Microsoft SMTP Server (TLS) id 14.3.181.6; Wed, 30 Dec 2015 10:38:12 +0800 From: Wen Congyang To: xen devel , Andrew Cooper , Ian Campbell , Ian Jackson , Wei Liu Date: Wed, 30 Dec 2015 10:37:43 +0800 Message-ID: <1451443075-27428-14-git-send-email-wency@cn.fujitsu.com> X-Mailer: git-send-email 2.5.0 In-Reply-To: <1451443075-27428-1-git-send-email-wency@cn.fujitsu.com> References: <1451443075-27428-1-git-send-email-wency@cn.fujitsu.com> MIME-Version: 1.0 X-Originating-IP: [10.167.226.52] X-yoursite-MailScanner-ID: 3AD5641887F4.A61EF X-yoursite-MailScanner: Found to be clean X-yoursite-MailScanner-From: wency@cn.fujitsu.com X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 Cc: Lars Kurth , Changlong Xie , Wen Congyang , Gui Jianfeng , Jiang Yunhong , Dong Eddie , Shriram Rajagopalan , Yang Hongyang Subject: [Xen-devel] [PATCH v9 13/25] libxc/restore: support COLO restore X-BeenThere: xen-devel@lists.xen.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP call the callbacks resume/checkpoint/suspend while secondary vm status is consistent with primary. Signed-off-by: Yang Hongyang Signed-off-by: Wen Congyang CC: Andrew Cooper --- tools/libxc/xc_sr_common.h | 6 +++-- tools/libxc/xc_sr_restore.c | 60 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 64 insertions(+), 2 deletions(-) diff --git a/tools/libxc/xc_sr_common.h b/tools/libxc/xc_sr_common.h index 53d6129..e768a6d 100644 --- a/tools/libxc/xc_sr_common.h +++ b/tools/libxc/xc_sr_common.h @@ -175,10 +175,12 @@ struct xc_sr_context * migration stream * 0: Plain VM * 1: Remus + * 2: COLO */ enum { MIG_STREAM_NONE, /* plain stream */ MIG_STREAM_REMUS, + MIG_STREAM_COLO, } migration_stream; union /* Common save or restore data. */ @@ -223,13 +225,13 @@ struct xc_sr_context uint32_t guest_page_size; /* Plain VM, or checkpoints over time. */ - bool checkpointed; + int checkpointed; /* Currently buffering records between a checkpoint */ bool buffer_all_records; /* - * With Remus, we buffer the records sent by the primary at checkpoint, + * With Remus/COLO, we buffer the records sent by the primary at checkpoint, * in case the primary will fail, we can recover from the last * checkpoint state. * This should be enough for most of the cases because primary only send diff --git a/tools/libxc/xc_sr_restore.c b/tools/libxc/xc_sr_restore.c index e543be3..f01a081 100644 --- a/tools/libxc/xc_sr_restore.c +++ b/tools/libxc/xc_sr_restore.c @@ -456,6 +456,49 @@ static int handle_checkpoint(struct xc_sr_context *ctx) else ctx->restore.buffer_all_records = true; + if ( ctx->restore.checkpointed == MIG_STREAM_COLO ) + { +#define HANDLE_CALLBACK_RETURN_VALUE(ret) \ + do { \ + if ( ret == 1 ) \ + rc = 0; /* Success */ \ + else \ + { \ + if ( ret == 2 ) \ + rc = BROKEN_CHANNEL; \ + else \ + rc = -1; /* Some unspecified error */ \ + goto err; \ + } \ + } while (0) + + /* COLO */ + + /* We need to resume guest */ + rc = ctx->restore.ops.stream_complete(ctx); + if ( rc ) + goto err; + + /* TODO: call restore_results */ + + /* Resume secondary vm */ + ret = ctx->restore.callbacks->postcopy(ctx->restore.callbacks->data); + HANDLE_CALLBACK_RETURN_VALUE(ret); + + /* Wait for a new checkpoint */ + ret = ctx->restore.callbacks->should_checkpoint( + ctx->restore.callbacks->data); + HANDLE_CALLBACK_RETURN_VALUE(ret); + + /* suspend secondary vm */ + ret = ctx->restore.callbacks->suspend(ctx->restore.callbacks->data); + HANDLE_CALLBACK_RETURN_VALUE(ret); + +#undef HANDLE_CALLBACK_RETURN_VALUE + + /* TODO: send dirty pfn list to primary */ + } + err: return rc; } @@ -627,6 +670,15 @@ static int restore(struct xc_sr_context *ctx) } while ( rec.type != REC_TYPE_END ); remus_failover: + + if ( ctx->restore.checkpointed == MIG_STREAM_COLO ) + { + /* With COLO, we have already called stream_complete */ + rc = 0; + IPRINTF("COLO Failover"); + goto done; + } + /* * With Remus, if we reach here, there must be some error on primary, * failover from the last checkpoint state. @@ -681,6 +733,14 @@ int xc_domain_restore(xc_interface *xch, int io_fd, uint32_t dom, if ( checkpointed_stream ) assert(callbacks->checkpoint); + if ( ctx.restore.checkpointed == MIG_STREAM_COLO ) + { + /* this is COLO restore */ + assert(callbacks->suspend && + callbacks->postcopy && + callbacks->should_checkpoint); + } + DPRINTF("fd %d, dom %u, hvm %u, pae %u, superpages %d" ", checkpointed_stream %d", io_fd, dom, hvm, pae, superpages, checkpointed_stream);