From patchwork Fri Jan 29 05:27:19 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wen Congyang
X-Patchwork-Id: 8159391
From: Wen Congyang
To: xen devel, Konrad Rzeszutek Wilk, Andrew Cooper, Ian Campbell, Ian Jackson, Wei Liu
Cc: Lars Kurth, Changlong Xie, Wen Congyang, Gui Jianfeng, Jiang Yunhong, Dong Eddie, Shriram Rajagopalan, Ian Jackson, Yang Hongyang
Date: Fri, 29 Jan 2016 13:27:19 +0800
Message-ID: <1454045254-3711-4-git-send-email-wency@cn.fujitsu.com>
In-Reply-To: <1454045254-3711-1-git-send-email-wency@cn.fujitsu.com>
References: <1454045254-3711-1-git-send-email-wency@cn.fujitsu.com>
Subject: [Xen-devel] [PATCH v7 03/18] tools/libxl: move save/restore code into libxl_dom_save.c
List-Id: Xen developer discussion

This is purely code motion.

Signed-off-by: Yang Hongyang
Signed-off-by: Wen Congyang
CC: Ian Jackson
CC: Wei Liu
Acked-by: Ian Campbell
Reviewed-by: Konrad Rzeszutek Wilk
Acked-by: Wei Liu
---
 tools/libxl/Makefile         |   2 +-
 tools/libxl/libxl_dom.c      | 514 ----------------------------------------
 tools/libxl/libxl_dom_save.c | 543 +++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 544 insertions(+), 515 deletions(-)
 create mode 100644 tools/libxl/libxl_dom_save.c

diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
index 7d64ecc..263ea0e 100644
--- a/tools/libxl/Makefile
+++ b/tools/libxl/Makefile
@@ -105,7 +105,7 @@ LIBXL_OBJS = flexarray.o libxl.o libxl_create.o libxl_dm.o libxl_pci.o \
 			libxl_stream_read.o libxl_stream_write.o \
 			libxl_save_callout.o _libxl_save_msgs_callout.o \
 			libxl_qmp.o libxl_event.o libxl_fork.o \
-			libxl_dom_suspend.o $(LIBXL_OBJS-y)
+			libxl_dom_suspend.o libxl_dom_save.o $(LIBXL_OBJS-y)
 LIBXL_OBJS += libxl_genid.o
 LIBXL_OBJS += _libxl_types.o libxl_flask.o _libxl_types_internal.o
 
diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
index 81bd464..664adad 100644
--- a/tools/libxl/libxl_dom.c
+++ b/tools/libxl/libxl_dom.c
@@ -24,7 +24,6 @@
 #include
 #include
 #include
-#include
 
 libxl_domain_type libxl__domain_type(libxl__gc *gc, uint32_t domid)
 {
@@ -1107,519 +1106,6 @@ int libxl__qemu_traditional_cmd(libxl__gc *gc, uint32_t domid,
     return libxl__xs_printf(gc, XBT_NULL, path, "%s", cmd);
 }
 
-/*
- * Inspect the buffer between start and end, and return a pointer to the
- * character following the NUL terminator of start, or NULL if start is not
- * terminated before end.
- */
-static const char *next_string(const char *start, const char *end)
-{
-    if (start >= end) return NULL;
-
-    size_t total_len = end - start;
-    size_t len = strnlen(start, total_len);
-
-    if (len == total_len)
-        return NULL;
-    else
-        return start + len + 1;
-}
-
-int libxl__restore_emulator_xenstore_data(libxl__domain_create_state *dcs,
-                                          const char *ptr, uint32_t size)
-{
-    STATE_AO_GC(dcs->ao);
-    const char *next = ptr, *end = ptr + size, *key, *val;
-    int rc;
-
-    const uint32_t domid = dcs->guest_domid;
-    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-    const char *xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
-
-    while (next < end) {
-        key = next;
-        next = next_string(next, end);
-
-        /* Sanitise 'key'. */
-        if (!next) {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Key in xenstore data not NUL terminated");
-            goto out;
-        }
-        if (key[0] == '\0') {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "empty key found in xenstore data");
-            goto out;
-        }
-        if (key[0] == '/') {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Key in xenstore data not relative");
-            goto out;
-        }
-
-        val = next;
-        next = next_string(next, end);
-
-        /* Sanitise 'val'. */
-        if (!next) {
-            rc = ERROR_FAIL;
-            LOG(ERROR, "Val in xenstore data not NUL terminated");
-            goto out;
-        }
-
-        libxl__xs_printf(gc, XBT_NULL,
-                         GCSPRINTF("%s/%s", xs_root, key),
-                         "%s", val);
-    }
-
-    rc = 0;
-
- out:
-    return rc;
-}
-
-/*==================== Domain suspend (save) ====================*/
-
-static void stream_done(libxl__egc *egc,
-                        libxl__stream_write_state *sws, int rc);
-static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc);
-
-/*----- complicated callback, called by xc_domain_save -----*/
-
-/*
- * We implement the other end of protocol for controlling qemu-dm's
- * logdirty. There is no documentation for this protocol, but our
- * counterparty's implementation is in
- * qemu-xen-traditional.git:xenstore.c in the function
- * xenstore_process_logdirty_event
- */
-
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs,
-                                    int rc);
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
-                            const char *watch_path, const char *event_path);
-static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss, int rc);
-
-static void logdirty_init(libxl__logdirty_switch *lds)
-{
-    lds->cmd_path = 0;
-    libxl__ev_xswatch_init(&lds->watch);
-    libxl__ev_time_init(&lds->timeout);
-}
-
-static void domain_suspend_switch_qemu_xen_traditional_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
-{
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
-    int rc;
-    xs_transaction_t t = 0;
-    const char *got;
-
-    if (!lds->cmd_path) {
-        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-        lds->cmd_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                                    "/logdirty/cmd");
-        lds->ret_path = libxl__device_model_xs_path(gc, dm_domid, domid,
-                                                    "/logdirty/ret");
-    }
-    lds->cmd = enable ? "enable" : "disable";
-
-    rc = libxl__ev_xswatch_register(gc, &lds->watch,
-                                switch_logdirty_xswatch, lds->ret_path);
-    if (rc) goto out;
-
-    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
-                                switch_logdirty_timeout, 10*1000);
-    if (rc) goto out;
-
-    for (;;) {
-        rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
-
-        rc = libxl__xs_read_checked(gc, t, lds->cmd_path, &got);
-        if (rc) goto out;
-
-        if (got) {
-            const char *got_ret;
-            rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret);
-            if (rc) goto out;
-
-            if (!got_ret || strcmp(got, got_ret)) {
-                LOG(ERROR,"controlling logdirty: qemu was already sent"
-                    " command `%s' (xenstore path `%s') but result is `%s'",
-                    got, lds->cmd_path, got_ret ? got_ret : "");
-                rc = ERROR_FAIL;
-                goto out;
-            }
-            rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
-            if (rc) goto out;
-        }
-
-        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_write_checked(gc, t, lds->cmd_path, lds->cmd);
-        if (rc) goto out;
-
-        rc = libxl__xs_transaction_commit(gc, &t);
-        if (!rc) break;
-        if (rc<0) goto out;
-    }
-
-    /* OK, wait for some callback */
-    return;
-
- out:
-    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
-    libxl__xs_transaction_abort(gc, &t);
-    switch_logdirty_done(egc,dss,rc);
-}
-
-static void domain_suspend_switch_qemu_xen_logdirty
-                               (int domid, unsigned enable,
-                                libxl__save_helper_state *shs)
-{
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-    int rc;
-
-    rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
-    if (!rc) {
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-    } else {
-        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
-        dss->rc = rc;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
-}
-
-void libxl__domain_suspend_common_switch_qemu_logdirty
-                               (int domid, unsigned enable, void *user)
-{
-    libxl__save_helper_state *shs = user;
-    libxl__egc *egc = shs->egc;
-    libxl__domain_suspend_state *dss = shs->caller_state;
-    STATE_AO_GC(dss->ao);
-
-    switch (libxl__device_model_version_running(gc, domid)) {
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
-        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
-        break;
-    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
-        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
-        break;
-    case LIBXL_DEVICE_MODEL_VERSION_NONE:
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
-        break;
-    default:
-        LOG(ERROR,"logdirty switch failed"
-            ", no valid device model version found, abandoning suspend");
-        dss->rc = ERROR_FAIL;
-        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
-    }
-}
-static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
-                                    const struct timeval *requested_abs,
-                                    int rc)
-{
-    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
-    STATE_AO_GC(dss->ao);
-    LOG(ERROR,"logdirty switch: wait for device model timed out");
-    switch_logdirty_done(egc,dss,ERROR_FAIL);
-}
-
-static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
-                            const char *watch_path, const char *event_path)
-{
-    libxl__domain_suspend_state *dss =
-        CONTAINER_OF(watch, *dss, logdirty.watch);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-    STATE_AO_GC(dss->ao);
-    const char *got;
-    xs_transaction_t t = 0;
-    int rc;
-
-    for (;;) {
-        rc = libxl__xs_transaction_start(gc, &t);
-        if (rc) goto out;
-
-        rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got);
-        if (rc) goto out;
-
-        if (!got) {
-            rc = +1;
-            goto out;
-        }
-
-        if (strcmp(got, lds->cmd)) {
-            LOG(ERROR,"logdirty switch: sent command `%s' but got reply `%s'"
-                " (xenstore paths `%s' / `%s')", lds->cmd, got,
-                lds->cmd_path, lds->ret_path);
-            rc = ERROR_FAIL;
-            goto out;
-        }
-
-        rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
-        if (rc) goto out;
-
-        rc = libxl__xs_transaction_commit(gc, &t);
-        if (!rc) break;
-        if (rc<0) goto out;
-    }
-
- out:
-    /* rc < 0: error
-     * rc == 0: ok, we are done
-     * rc == +1: need to keep waiting
-     */
-    libxl__xs_transaction_abort(gc, &t);
-
-    if (rc <= 0) {
-        if (rc < 0)
-            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
-        switch_logdirty_done(egc,dss,rc);
-    }
-}
-
-static void switch_logdirty_done(libxl__egc *egc,
-                                 libxl__domain_suspend_state *dss,
-                                 int rc)
-{
-    STATE_AO_GC(dss->ao);
-    libxl__logdirty_switch *lds = &dss->logdirty;
-
-    libxl__ev_xswatch_deregister(gc, &lds->watch);
-    libxl__ev_time_deregister(gc, &lds->timeout);
-
-    int broke;
-    if (rc) {
-        broke = -1;
-        dss->rc = rc;
-    } else {
-        broke = 0;
-    }
-    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
-}
-
-/*----- callbacks, called by xc_domain_save -----*/
-
-/*
- * Expand the buffer 'buf' of length 'len', to append 'str' including its NUL
- * terminator.
- */
-static void append_string(libxl__gc *gc, char **buf, uint32_t *len,
-                          const char *str)
-{
-    size_t extralen = strlen(str) + 1;
-    char *new = libxl__realloc(gc, *buf, *len + extralen);
-
-    *buf = new;
-    memcpy(new + *len, str, extralen);
-    *len += extralen;
-}
-
-int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
-                                       char **callee_buf,
-                                       uint32_t *callee_len)
-{
-    STATE_AO_GC(dss->ao);
-    const char *xs_root;
-    char **entries, *buf = NULL;
-    unsigned int nr_entries, i, j, len = 0;
-    int rc;
-
-    const uint32_t domid = dss->domid;
-    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
-
-    xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
-
-    entries = libxl__xs_directory(gc, 0, GCSPRINTF("%s/physmap", xs_root),
-                                  &nr_entries);
-    if (!entries || nr_entries == 0) { rc = 0; goto out; }
-
-    for (i = 0; i < nr_entries; ++i) {
-        static const char *const physmap_subkeys[] = {
-            "start_addr", "size", "name"
-        };
-
-        for (j = 0; j < ARRAY_SIZE(physmap_subkeys); ++j) {
-            const char *key = GCSPRINTF("physmap/%s/%s",
-                                        entries[i], physmap_subkeys[j]);
-
-            const char *val =
-                libxl__xs_read(gc, XBT_NULL,
-                               GCSPRINTF("%s/%s", xs_root, key));
-
-            if (!val) { rc = ERROR_FAIL; goto out; }
-
-            append_string(gc, &buf, &len, key);
-            append_string(gc, &buf, &len, val);
-        }
-    }
-
-    rc = 0;
-
- out:
-    if (!rc) {
-        *callee_buf = buf;
-        *callee_len = len;
-    }
-
-    return rc;
-}
-
-/*----- main code for saving, in order of execution -----*/
-
-void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
-{
-    STATE_AO_GC(dss->ao);
-    int port;
-    int rc, ret;
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-    const libxl_domain_type type = dss->type;
-    const int live = dss->live;
-    const int debug = dss->debug;
-    const libxl_domain_remus_info *const r_info = dss->remus;
-    libxl__srm_save_autogen_callbacks *const callbacks =
-        &dss->sws.shs.callbacks.save.a;
-    unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
-
-    dss->rc = 0;
-    logdirty_init(&dss->logdirty);
-    libxl__xswait_init(&dss->pvcontrol);
-    libxl__ev_evtchn_init(&dss->guest_evtchn);
-    libxl__ev_xswatch_init(&dss->guest_watch);
-    libxl__ev_time_init(&dss->guest_timeout);
-
-    switch (type) {
-    case LIBXL_DOMAIN_TYPE_HVM: {
-        dss->hvm = 1;
-        break;
-    }
-    case LIBXL_DOMAIN_TYPE_PV:
-        dss->hvm = 0;
-        break;
-    default:
-        abort();
-    }
-
-    dss->xcflags = (live ? XCFLAGS_LIVE : 0)
-          | (debug ? XCFLAGS_DEBUG : 0)
-          | (dss->hvm ? XCFLAGS_HVM : 0);
-
-    /* Disallow saving a guest with vNUMA configured because migration
-     * stream does not preserve node information.
-     *
-     * Reject any domain which has vnuma enabled, even if the
-     * configuration is empty. Only domains which have no vnuma
-     * configuration at all are supported.
-     */
-    ret = xc_domain_getvnuma(CTX->xch, domid, &nr_vnodes, &nr_vmemranges,
-                             &nr_vcpus, NULL, NULL, NULL);
-    if (ret != -1 || errno != XEN_EOPNOTSUPP) {
-        LOG(ERROR, "Cannot save a guest with vNUMA configured");
-        rc = ERROR_FAIL;
-        goto out;
-    }
-
-    dss->guest_evtchn.port = -1;
-    dss->guest_evtchn_lockfd = -1;
-    dss->guest_responded = 0;
-    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
-
-    if (r_info != NULL) {
-        dss->interval = r_info->interval;
-        dss->xcflags |= XCFLAGS_CHECKPOINTED;
-        if (libxl_defbool_val(r_info->compression))
-            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
-    }
-
-    port = xs_suspend_evtchn_port(dss->domid);
-
-    if (port >= 0) {
-        rc = libxl__ctx_evtchn_init(gc);
-        if (rc) goto out;
-
-        dss->guest_evtchn.port =
-            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
-                                  dss->domid, port, &dss->guest_evtchn_lockfd);
-
-        if (dss->guest_evtchn.port < 0) {
-            LOG(WARN, "Suspend event channel initialization failed");
-            rc = ERROR_FAIL;
-            goto out;
-        }
-    }
-
-    memset(callbacks, 0, sizeof(*callbacks));
-    if (r_info != NULL) {
-        callbacks->suspend = libxl__remus_domain_suspend_callback;
-        callbacks->postcopy = libxl__remus_domain_resume_callback;
-        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
-    } else
-        callbacks->suspend = libxl__domain_suspend_callback;
-
-    callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
-
-    dss->sws.ao = dss->ao;
-    dss->sws.dss = dss;
-    dss->sws.fd = dss->fd;
-    dss->sws.completion_callback = stream_done;
-
-    libxl__stream_write_start(egc, &dss->sws);
-    return;
-
- out:
-    domain_save_done(egc, dss, rc);
-}
-
-static void stream_done(libxl__egc *egc,
-                        libxl__stream_write_state *sws, int rc)
-{
-    domain_save_done(egc, sws->dss, rc);
-}
-
-static void domain_save_done(libxl__egc *egc,
-                             libxl__domain_suspend_state *dss, int rc)
-{
-    STATE_AO_GC(dss->ao);
-
-    /* Convenience aliases */
-    const uint32_t domid = dss->domid;
-
-    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
-
-    if (dss->guest_evtchn.port > 0)
-        xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
-                        dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
-
-    if (dss->remus) {
-        /*
-         * With Remus, if we reach this point, it means either
-         * backup died or some network error occurred preventing us
-         * from sending checkpoints. Teardown the network buffers and
-         * release netlink resources. This is an async op.
-         */
-        libxl__remus_teardown(egc, dss, rc);
-        return;
-    }
-
-    dss->callback(egc, dss, rc);
-}
-
 /*==================== Miscellaneous ====================*/
 
 char *libxl__uuid2string(libxl__gc *gc, const libxl_uuid uuid)
diff --git a/tools/libxl/libxl_dom_save.c b/tools/libxl/libxl_dom_save.c
new file mode 100644
index 0000000..27fd58b
--- /dev/null
+++ b/tools/libxl/libxl_dom_save.c
@@ -0,0 +1,543 @@
+/*
+ * Copyright (C) 2009 Citrix Ltd.
+ * Author Vincent Hanquez
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU Lesser General Public License as published
+ * by the Free Software Foundation; version 2.1 only. with the special
+ * exception on linking described in file LICENSE.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU Lesser General Public License for more details.
+ */
+
+#include "libxl_osdeps.h" /* must come before any other headers */
+
+#include "libxl_internal.h"
+
+#include
+
+/*========================= Domain save ============================*/
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *sws, int rc);
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc);
+
+/*----- complicated callback, called by xc_domain_save -----*/
+
+/*
+ * We implement the other end of protocol for controlling qemu-dm's
+ * logdirty. There is no documentation for this protocol, but our
+ * counterparty's implementation is in
+ * qemu-xen-traditional.git:xenstore.c in the function
+ * xenstore_process_logdirty_event
+ */
+
+static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
+                                    const struct timeval *requested_abs,
+                                    int rc);
+static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch*,
+                            const char *watch_path, const char *event_path);
+static void switch_logdirty_done(libxl__egc *egc,
+                                 libxl__domain_suspend_state *dss, int rc);
+
+static void logdirty_init(libxl__logdirty_switch *lds)
+{
+    lds->cmd_path = 0;
+    libxl__ev_xswatch_init(&lds->watch);
+    libxl__ev_time_init(&lds->timeout);
+}
+
+static void domain_suspend_switch_qemu_xen_traditional_logdirty
+                               (int domid, unsigned enable,
+                                libxl__save_helper_state *shs)
+{
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(dss->ao);
+    int rc;
+    xs_transaction_t t = 0;
+    const char *got;
+
+    if (!lds->cmd_path) {
+        uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+        lds->cmd_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                                    "/logdirty/cmd");
+        lds->ret_path = libxl__device_model_xs_path(gc, dm_domid, domid,
+                                                    "/logdirty/ret");
+    }
+    lds->cmd = enable ? "enable" : "disable";
+
+    rc = libxl__ev_xswatch_register(gc, &lds->watch,
+                                switch_logdirty_xswatch, lds->ret_path);
+    if (rc) goto out;
+
+    rc = libxl__ev_time_register_rel(ao, &lds->timeout,
+                                switch_logdirty_timeout, 10*1000);
+    if (rc) goto out;
+
+    for (;;) {
+        rc = libxl__xs_transaction_start(gc, &t);
+        if (rc) goto out;
+
+        rc = libxl__xs_read_checked(gc, t, lds->cmd_path, &got);
+        if (rc) goto out;
+
+        if (got) {
+            const char *got_ret;
+            rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got_ret);
+            if (rc) goto out;
+
+            if (!got_ret || strcmp(got, got_ret)) {
+                LOG(ERROR,"controlling logdirty: qemu was already sent"
+                    " command `%s' (xenstore path `%s') but result is `%s'",
+                    got, lds->cmd_path, got_ret ? got_ret : "");
+                rc = ERROR_FAIL;
+                goto out;
+            }
+            rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
+            if (rc) goto out;
+        }
+
+        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_write_checked(gc, t, lds->cmd_path, lds->cmd);
+        if (rc) goto out;
+
+        rc = libxl__xs_transaction_commit(gc, &t);
+        if (!rc) break;
+        if (rc<0) goto out;
+    }
+
+    /* OK, wait for some callback */
+    return;
+
+ out:
+    LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+    libxl__xs_transaction_abort(gc, &t);
+    switch_logdirty_done(egc,dss,rc);
+}
+
+static void domain_suspend_switch_qemu_xen_logdirty
+                               (int domid, unsigned enable,
+                                libxl__save_helper_state *shs)
+{
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+    int rc;
+
+    rc = libxl__qmp_set_global_dirty_log(gc, domid, enable);
+    if (!rc) {
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+    } else {
+        LOG(ERROR,"logdirty switch failed (rc=%d), abandoning suspend",rc);
+        dss->rc = rc;
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+    }
+}
+
+void libxl__domain_suspend_common_switch_qemu_logdirty
+                               (int domid, unsigned enable, void *user)
+{
+    libxl__save_helper_state *shs = user;
+    libxl__egc *egc = shs->egc;
+    libxl__domain_suspend_state *dss = shs->caller_state;
+    STATE_AO_GC(dss->ao);
+
+    switch (libxl__device_model_version_running(gc, domid)) {
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+        domain_suspend_switch_qemu_xen_traditional_logdirty(domid, enable, shs);
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+        domain_suspend_switch_qemu_xen_logdirty(domid, enable, shs);
+        break;
+    case LIBXL_DEVICE_MODEL_VERSION_NONE:
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, 0);
+        break;
+    default:
+        LOG(ERROR,"logdirty switch failed"
+            ", no valid device model version found, abandoning suspend");
+        dss->rc = ERROR_FAIL;
+        libxl__xc_domain_saverestore_async_callback_done(egc, shs, -1);
+    }
+}
+static void switch_logdirty_timeout(libxl__egc *egc, libxl__ev_time *ev,
+                                    const struct timeval *requested_abs,
+                                    int rc)
+{
+    libxl__domain_suspend_state *dss = CONTAINER_OF(ev, *dss, logdirty.timeout);
+    STATE_AO_GC(dss->ao);
+    LOG(ERROR,"logdirty switch: wait for device model timed out");
+    switch_logdirty_done(egc,dss,ERROR_FAIL);
+}
+
+static void switch_logdirty_xswatch(libxl__egc *egc, libxl__ev_xswatch *watch,
+                            const char *watch_path, const char *event_path)
+{
+    libxl__domain_suspend_state *dss =
+        CONTAINER_OF(watch, *dss, logdirty.watch);
+    libxl__logdirty_switch *lds = &dss->logdirty;
+    STATE_AO_GC(dss->ao);
+    const char *got;
+    xs_transaction_t t = 0;
+    int rc;
+
+    for (;;) {
+        rc = libxl__xs_transaction_start(gc, &t);
+        if (rc) goto out;
+
+        rc = libxl__xs_read_checked(gc, t, lds->ret_path, &got);
+        if (rc) goto out;
+
+        if (!got) {
+            rc = +1;
+            goto out;
+        }
+
+        if (strcmp(got, lds->cmd)) {
+            LOG(ERROR,"logdirty switch: sent command `%s' but got reply `%s'"
+                " (xenstore paths `%s' / `%s')", lds->cmd, got,
+                lds->cmd_path, lds->ret_path);
+            rc = ERROR_FAIL;
+            goto out;
+        }
+
+        rc = libxl__xs_rm_checked(gc, t, lds->cmd_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_rm_checked(gc, t, lds->ret_path);
+        if (rc) goto out;
+
+        rc = libxl__xs_transaction_commit(gc, &t);
+        if (!rc) break;
+        if (rc<0) goto out;
+    }
+
+ out:
+    /* rc < 0: error
+     * rc == 0: ok, we are done
+     * rc == +1: need to keep waiting
+     */
+    libxl__xs_transaction_abort(gc, &t);
+
+    if (rc <= 0) {
+        if (rc < 0)
+            LOG(ERROR,"logdirty switch: failed (rc=%d)",rc);
+        switch_logdirty_done(egc,dss,rc);
+    }
+}
+
+static void switch_logdirty_done(libxl__egc *egc,
+                                 libxl__domain_suspend_state *dss,
+                                 int rc)
+{
+    STATE_AO_GC(dss->ao);
+    libxl__logdirty_switch *lds = &dss->logdirty;
+
+    libxl__ev_xswatch_deregister(gc, &lds->watch);
+    libxl__ev_time_deregister(gc, &lds->timeout);
+
+    int broke;
+    if (rc) {
+        broke = -1;
+        dss->rc = rc;
+    } else {
+        broke = 0;
+    }
+    libxl__xc_domain_saverestore_async_callback_done(egc, &dss->sws.shs, broke);
+}
+
+/*----- callbacks, called by xc_domain_save -----*/
+
+/*
+ * Expand the buffer 'buf' of length 'len', to append 'str' including its NUL
+ * terminator.
+ */
+static void append_string(libxl__gc *gc, char **buf, uint32_t *len,
+                          const char *str)
+{
+    size_t extralen = strlen(str) + 1;
+    char *new = libxl__realloc(gc, *buf, *len + extralen);
+
+    *buf = new;
+    memcpy(new + *len, str, extralen);
+    *len += extralen;
+}
+
+int libxl__save_emulator_xenstore_data(libxl__domain_suspend_state *dss,
+                                       char **callee_buf,
+                                       uint32_t *callee_len)
+{
+    STATE_AO_GC(dss->ao);
+    const char *xs_root;
+    char **entries, *buf = NULL;
+    unsigned int nr_entries, i, j, len = 0;
+    int rc;
+
+    const uint32_t domid = dss->domid;
+    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+
+    xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
+
+    entries = libxl__xs_directory(gc, 0, GCSPRINTF("%s/physmap", xs_root),
+                                  &nr_entries);
+    if (!entries || nr_entries == 0) { rc = 0; goto out; }
+
+    for (i = 0; i < nr_entries; ++i) {
+        static const char *const physmap_subkeys[] = {
+            "start_addr", "size", "name"
+        };
+
+        for (j = 0; j < ARRAY_SIZE(physmap_subkeys); ++j) {
+            const char *key = GCSPRINTF("physmap/%s/%s",
+                                        entries[i], physmap_subkeys[j]);
+
+            const char *val =
+                libxl__xs_read(gc, XBT_NULL,
+                               GCSPRINTF("%s/%s", xs_root, key));
+
+            if (!val) { rc = ERROR_FAIL; goto out; }
+
+            append_string(gc, &buf, &len, key);
+            append_string(gc, &buf, &len, val);
+        }
+    }
+
+    rc = 0;
+
+ out:
+    if (!rc) {
+        *callee_buf = buf;
+        *callee_len = len;
+    }
+
+    return rc;
+}
+
+/*----- main code for saving, in order of execution -----*/
+
+void libxl__domain_save(libxl__egc *egc, libxl__domain_suspend_state *dss)
+{
+    STATE_AO_GC(dss->ao);
+    int port;
+    int rc, ret;
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+    const libxl_domain_type type = dss->type;
+    const int live = dss->live;
+    const int debug = dss->debug;
+    const libxl_domain_remus_info *const r_info = dss->remus;
+    libxl__srm_save_autogen_callbacks *const callbacks =
+        &dss->sws.shs.callbacks.save.a;
+    unsigned int nr_vnodes = 0, nr_vmemranges = 0, nr_vcpus = 0;
+
+    dss->rc = 0;
+    logdirty_init(&dss->logdirty);
+    libxl__xswait_init(&dss->pvcontrol);
+    libxl__ev_evtchn_init(&dss->guest_evtchn);
+    libxl__ev_xswatch_init(&dss->guest_watch);
+    libxl__ev_time_init(&dss->guest_timeout);
+
+    switch (type) {
+    case LIBXL_DOMAIN_TYPE_HVM: {
+        dss->hvm = 1;
+        break;
+    }
+    case LIBXL_DOMAIN_TYPE_PV:
+        dss->hvm = 0;
+        break;
+    default:
+        abort();
+    }
+
+    dss->xcflags = (live ? XCFLAGS_LIVE : 0)
+          | (debug ? XCFLAGS_DEBUG : 0)
+          | (dss->hvm ? XCFLAGS_HVM : 0);
+
+    /* Disallow saving a guest with vNUMA configured because migration
+     * stream does not preserve node information.
+     *
+     * Reject any domain which has vnuma enabled, even if the
+     * configuration is empty. Only domains which have no vnuma
+     * configuration at all are supported.
+     */
+    ret = xc_domain_getvnuma(CTX->xch, domid, &nr_vnodes, &nr_vmemranges,
+                             &nr_vcpus, NULL, NULL, NULL);
+    if (ret != -1 || errno != XEN_EOPNOTSUPP) {
+        LOG(ERROR, "Cannot save a guest with vNUMA configured");
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    dss->guest_evtchn.port = -1;
+    dss->guest_evtchn_lockfd = -1;
+    dss->guest_responded = 0;
+    dss->dm_savefile = libxl__device_model_savefile(gc, domid);
+
+    if (r_info != NULL) {
+        dss->interval = r_info->interval;
+        dss->xcflags |= XCFLAGS_CHECKPOINTED;
+        if (libxl_defbool_val(r_info->compression))
+            dss->xcflags |= XCFLAGS_CHECKPOINT_COMPRESS;
+    }
+
+    port = xs_suspend_evtchn_port(dss->domid);
+
+    if (port >= 0) {
+        rc = libxl__ctx_evtchn_init(gc);
+        if (rc) goto out;
+
+        dss->guest_evtchn.port =
+            xc_suspend_evtchn_init_exclusive(CTX->xch, CTX->xce,
+                                  dss->domid, port, &dss->guest_evtchn_lockfd);
+
+        if (dss->guest_evtchn.port < 0) {
+            LOG(WARN, "Suspend event channel initialization failed");
+            rc = ERROR_FAIL;
+            goto out;
+        }
+    }
+
+    memset(callbacks, 0, sizeof(*callbacks));
+    if (r_info != NULL) {
+        callbacks->suspend = libxl__remus_domain_suspend_callback;
+        callbacks->postcopy = libxl__remus_domain_resume_callback;
+        callbacks->checkpoint = libxl__remus_domain_save_checkpoint_callback;
+    } else
+        callbacks->suspend = libxl__domain_suspend_callback;
+
+    callbacks->switch_qemu_logdirty = libxl__domain_suspend_common_switch_qemu_logdirty;
+
+    dss->sws.ao = dss->ao;
+    dss->sws.dss = dss;
+    dss->sws.fd = dss->fd;
+    dss->sws.completion_callback = stream_done;
+
+    libxl__stream_write_start(egc, &dss->sws);
+    return;
+
+ out:
+    domain_save_done(egc, dss, rc);
+}
+
+static void stream_done(libxl__egc *egc,
+                        libxl__stream_write_state *sws, int rc)
+{
+    domain_save_done(egc, sws->dss, rc);
+}
+
+static void domain_save_done(libxl__egc *egc,
+                             libxl__domain_suspend_state *dss, int rc)
+{
+    STATE_AO_GC(dss->ao);
+
+    /* Convenience aliases */
+    const uint32_t domid = dss->domid;
+
+    libxl__ev_evtchn_cancel(gc, &dss->guest_evtchn);
+
+    if (dss->guest_evtchn.port > 0)
+        xc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
+                        dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
+
+    if (dss->remus) {
+        /*
+         * With Remus, if we reach this point, it means either
+         * backup died or some network error occurred preventing us
+         * from sending checkpoints. Teardown the network buffers and
+         * release netlink resources. This is an async op.
+         */
+        libxl__remus_teardown(egc, dss, rc);
+        return;
+    }
+
+    dss->callback(egc, dss, rc);
+}
+
+/*========================= Domain restore ============================*/
+
+/*
+ * Inspect the buffer between start and end, and return a pointer to the
+ * character following the NUL terminator of start, or NULL if start is not
+ * terminated before end.
+ */
+static const char *next_string(const char *start, const char *end)
+{
+    if (start >= end) return NULL;
+
+    size_t total_len = end - start;
+    size_t len = strnlen(start, total_len);
+
+    if (len == total_len)
+        return NULL;
+    else
+        return start + len + 1;
+}
+
+int libxl__restore_emulator_xenstore_data(libxl__domain_create_state *dcs,
+                                          const char *ptr, uint32_t size)
+{
+    STATE_AO_GC(dcs->ao);
+    const char *next = ptr, *end = ptr + size, *key, *val;
+    int rc;
+
+    const uint32_t domid = dcs->guest_domid;
+    const uint32_t dm_domid = libxl_get_stubdom_id(CTX, domid);
+    const char *xs_root = libxl__device_model_xs_path(gc, dm_domid, domid, "");
+
+    while (next < end) {
+        key = next;
+        next = next_string(next, end);
+
+        /* Sanitise 'key'. */
+        if (!next) {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Key in xenstore data not NUL terminated");
+            goto out;
+        }
+        if (key[0] == '\0') {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "empty key found in xenstore data");
+            goto out;
+        }
+        if (key[0] == '/') {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Key in xenstore data not relative");
+            goto out;
+        }
+
+        val = next;
+        next = next_string(next, end);
+
+        /* Sanitise 'val'. */
+        if (!next) {
+            rc = ERROR_FAIL;
+            LOG(ERROR, "Val in xenstore data not NUL terminated");
+            goto out;
+        }
+
+        libxl__xs_printf(gc, XBT_NULL,
+                         GCSPRINTF("%s/%s", xs_root, key),
+                         "%s", val);
+    }
+
+    rc = 0;
+
+ out:
+    return rc;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
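
Note for readers of the moved code: libxl__save_emulator_xenstore_data() and libxl__restore_emulator_xenstore_data() exchange a flat buffer of NUL-terminated strings laid out as key, value, key, value, and so on. The standalone sketch below illustrates that layout and the validation rules applied on restore (every string terminated inside the buffer, no empty key, no key starting with '/'). It is not libxl code: kv_append, kv_next and kv_count_pairs are hypothetical names, plain realloc() stands in for libxl__realloc(), and the xenstore write is replaced by a simple pair count.

```c
#define _POSIX_C_SOURCE 200809L   /* for strnlen() */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for append_string(): grow *buf by strlen(str)+1
 * bytes and copy str, including its NUL terminator, to the end.
 * Returns 0 on success, -1 if realloc() fails (buffer left untouched). */
static int kv_append(char **buf, size_t *len, const char *str)
{
    assert(str != NULL);
    size_t extralen = strlen(str) + 1;
    char *grown = realloc(*buf, *len + extralen);

    if (!grown) return -1;
    memcpy(grown + *len, str, extralen);
    *buf = grown;
    *len += extralen;
    return 0;
}

/* Same contract as next_string() in the patch: return a pointer just past
 * the NUL terminator of 'start', or NULL if no terminator precedes 'end'. */
static const char *kv_next(const char *start, const char *end)
{
    if (start >= end) return NULL;

    size_t total_len = (size_t)(end - start);
    size_t len = strnlen(start, total_len);

    return len == total_len ? NULL : start + len + 1;
}

/* Walk the buffer as the restore loop does, applying the same checks.
 * Returns the number of key/value pairs, or -1 on malformed input. */
static int kv_count_pairs(const char *ptr, size_t size)
{
    const char *next = ptr, *end = ptr + size;
    int pairs = 0;

    while (next < end) {
        const char *key = next;

        next = kv_next(next, end);
        if (!next) return -1;            /* key not NUL terminated */
        if (key[0] == '\0') return -1;   /* empty key */
        if (key[0] == '/') return -1;    /* key must be relative */

        next = kv_next(next, end);
        if (!next) return -1;            /* value missing or unterminated */

        pairs++;
    }
    return pairs;
}
```

A caller would build the buffer with pairs of kv_append() calls (key first, then value) and hand the pointer and length to kv_count_pairs(); truncating the buffer by even one byte makes the final value unterminated, so the walk rejects it.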