diff mbox series

[v10,06/25] tools/xenstore: add live update command to xenstore-control

Message ID 20201215163603.21700-7-jgross@suse.com (mailing list archive)
State Superseded
Headers show
Series tools/xenstore: support live update for xenstored | expand

Commit Message

Jürgen Groß Dec. 15, 2020, 4:35 p.m. UTC
Add the "live-update" command to xenstore-control enabling updating
xenstored to a new version in a running Xen system.

With -c <arg> it is possible to pass a different command line to the
new instance of xenstored. This will replace the command line used
for the invocation of the just running xenstored instance.

The running xenstored (or xenstore-stubdom) needs to support live
updating, of course.

For now just add a small dummy handler to C xenstore denying any
live update action.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
---
V2:
- add 0 byte after kernel chunk
- add comment regrading add_to_buf() semantics (Pawel Wieczorkiewicz)
- use %u for unsigned in format (Pawel Wieczorkiewicz)
- explain buffer size better (Pawel Wieczorkiewicz)
- add loop around "-s" option for client side retry in case of timeout

V3:
- add live-update command to docs/misc/xenstore.txt (Paul Durrant)
- fix indent (Paul Durrant)

V4:
- made several parameters const (Julien Grall)
- added more details to xenstore.txt (Julien Grall)

V5:
- set old_binary to NULL initially (Paul Durrant)

V6:
- use strerror(errno) in error message (Julien Grall)

V10:
- make binary specification mandatory (Andrew Cooper)

Signed-off-by: Juergen Gross <jgross@suse.com>
---
 docs/misc/xenstore.txt             |  21 ++
 tools/xenstore/Makefile            |   3 +-
 tools/xenstore/xenstore_control.c  | 332 +++++++++++++++++++++++++++--
 tools/xenstore/xenstored_control.c |  30 +++
 4 files changed, 369 insertions(+), 17 deletions(-)

Comments

Edwin Török Jan. 6, 2021, 2:42 p.m. UTC | #1
On Tue, 2020-12-15 at 17:35 +0100, Juergen Gross wrote:

> [...] 
> +static int live_update_start(struct xs_handle *xsh, bool force,
> unsigned int to)
> +{
> +    int len = 0;
> +    char *buf = NULL, *ret;
> +    time_t time_start;
> +
> +    if (asprintf(&ret, "%u", to) < 0)
> +        return 1;
> +    len = add_to_buf(&buf, "-s", len);
> +    len = add_to_buf(&buf, "-t", len);
> +    len = add_to_buf(&buf, ret, len);
> +    free(ret);
> +    if (force)
> +        len = add_to_buf(&buf, "-F", len);
> +    if (len < 0)
> +        return 1;
> +
> +    for (time_start = time(NULL); time(NULL) - time_start < to;) {
> +        ret = xs_control_command(xsh, "live-update", buf, len);
> +        if (!ret)
> +            goto err;
> +        if (strcmp(ret, "BUSY"))
> +            break;
> +    }

We've discovered issues with this during testing: when a live-update is
not possible (e.g. guest with active transaction held open on purpose)
xenstored gets flooded with live-update requests until the timeout is
reached.

This is problematic for multiple reasons:
* flooding xenstored reduces its throughput, and makes it use 100% CPU.
Depending on the implementation and configuration it may also flood the
logs (which isn't fatal, the system still stays alive if you ENOSPC on
/var/log, but it makes it difficult to debug issues if a live update
gets you to ENOSPC as you may not see actual failures from after the
live update).
* flooding xenstored causes the evtchn to overflow and AFAICT neither
xenstored or oxenstored is capable of coping with that (oxenstored
infinite loops when that happens). IIUC this needs to be fixed in the
kernel, so it doesn't return EFBIG in the first place. As a workaround
I added a sleep(1) in this loop
* the timeout is hit on both client and server sides, but depending on
rounding almost always happens on the client side first which prevents
us from displaying a more informative error message from the server. As
a workaround I increased the client side timeout by 2s to cope with
rounding and give the server a chance to reply. Oxenstored can reply
with a list of domains preventing the live-update for example.

My workarounds looked like this:
@@ -42,6 +43,9 @@ static int live_update_start(struct xs_handle *xsh,
bool force, unsigned int to)
         len = add_to_buf(&buf, "-F", len);
     if (len < 0)
         return 1;
+    /* +1 for rounding issues
+     * +1 to give oxenstored a chance to timeout and report back first
*/
+    to += 2;
 
     for (time_start = time(NULL); time(NULL) - time_start < to;) {
         ret = xs_control_command(xsh, "live-update", buf, len);
@@ -49,6 +53,12 @@ static int live_update_start(struct xs_handle *xsh,
bool force, unsigned int to)
             goto err;
         if (strcmp(ret, "BUSY"))
             break;
+        /* TODO: use task ID for commands, avoid busy loop polling
here
+         * oxenstored checks BUSY condition internally on every main
loop iteration anyway.
+         * Avoid flooding xenstored with live-update requests.
+         * The flooding can also cause the evtchn to overflow in
xenstored which makes
+         * xenstored enter an infinite loop */
+        sleep(1);
     }

This also required various heuristics in oxenstored to differentiate
between a new live-update command and querying the status of an already
completed live-update command, otherwise I almost always ended up doing
2 live-updates in a row (first one queued up, returned BUSY, completed
after a while, and then another one from all the repeated live-update
requests).

I'd prefer it if there was a more asynchronous protocol available for
live-update:
* the live-update on its own queues up the live update request and
returns a generation ID. xenstored retries on its own during each of
its main loop iterations whether the conditions for live-update are now
met
* when a generation ID is specified with the live-update command it
acts as a query to return the status. A query for generation ID <
current ID return success, and for generation ID = current ID it
returns either BUSY, or a specific error if known (e.g. timeout reached
and list of domains preventing live update)
* the generation ID gets incremented every time a live update succeeds
 
This would eliminiate the need for a client-side timeout, since each of
these commands would be non-blocking.
It'd also avoid the busy-poll flood.

Obviously any change here has to be backwards compatible with the
already deployed xenstored and oxenstored which doesn't know about
generation IDs, but you can tell them apart based on the reply: if you
get back OK/BUSY/some error it is the old version, if you get back a
generation ID it is the new version.

I ran out of time to implement this before the embargo was up, should I
go ahead with prototyping a patch for this now, or do you see an
alternative way to make the live-update command more robust?


Best regards,
--Edwin
Jürgen Groß Jan. 13, 2021, 4:34 p.m. UTC | #2
On 06.01.21 15:42, Edwin Torok wrote:
> On Tue, 2020-12-15 at 17:35 +0100, Juergen Gross wrote:
> 
>> [...]
>> +static int live_update_start(struct xs_handle *xsh, bool force,
>> unsigned int to)
>> +{
>> +    int len = 0;
>> +    char *buf = NULL, *ret;
>> +    time_t time_start;
>> +
>> +    if (asprintf(&ret, "%u", to) < 0)
>> +        return 1;
>> +    len = add_to_buf(&buf, "-s", len);
>> +    len = add_to_buf(&buf, "-t", len);
>> +    len = add_to_buf(&buf, ret, len);
>> +    free(ret);
>> +    if (force)
>> +        len = add_to_buf(&buf, "-F", len);
>> +    if (len < 0)
>> +        return 1;
>> +
>> +    for (time_start = time(NULL); time(NULL) - time_start < to;) {
>> +        ret = xs_control_command(xsh, "live-update", buf, len);
>> +        if (!ret)
>> +            goto err;
>> +        if (strcmp(ret, "BUSY"))
>> +            break;
>> +    }
> 
> We've discovered issues with this during testing: when a live-update is
> not possible (e.g. guest with active transaction held open on purpose)
> xenstored gets flooded with live-update requests until the timeout is
> reached.
> 
> This is problematic for multiple reasons:
> * flooding xenstored reduces its throughput, and makes it use 100% CPU.
> Depending on the implementation and configuration it may also flood the
> logs (which isn't fatal, the system still stays alive if you ENOSPC on
> /var/log, but it makes it difficult to debug issues if a live update
> gets you to ENOSPC as you may not see actual failures from after the
> live update).
> * flooding xenstored causes the evtchn to overflow and AFAICT neither
> xenstored or oxenstored is capable of coping with that (oxenstored
> infinite loops when that happens). IIUC this needs to be fixed in the
> kernel, so it doesn't return EFBIG in the first place. As a workaround
> I added a sleep(1) in this loop
> * the timeout is hit on both client and server sides, but depending on
> rounding almost always happens on the client side first which prevents
> us from displaying a more informative error message from the server. As
> a workaround I increased the client side timeout by 2s to cope with
> rounding and give the server a chance to reply. Oxenstored can reply
> with a list of domains preventing the live-update for example.
> 
> My workarounds looked like this:
> @@ -42,6 +43,9 @@ static int live_update_start(struct xs_handle *xsh,
> bool force, unsigned int to)
>           len = add_to_buf(&buf, "-F", len);
>       if (len < 0)
>           return 1;
> +    /* +1 for rounding issues
> +     * +1 to give oxenstored a chance to timeout and report back first
> */
> +    to += 2;
>   
>       for (time_start = time(NULL); time(NULL) - time_start < to;) {
>           ret = xs_control_command(xsh, "live-update", buf, len);
> @@ -49,6 +53,12 @@ static int live_update_start(struct xs_handle *xsh,
> bool force, unsigned int to)
>               goto err;
>           if (strcmp(ret, "BUSY"))
>               break;
> +        /* TODO: use task ID for commands, avoid busy loop polling
> here
> +         * oxenstored checks BUSY condition internally on every main
> loop iteration anyway.
> +         * Avoid flooding xenstored with live-update requests.
> +         * The flooding can also cause the evtchn to overflow in
> xenstored which makes
> +         * xenstored enter an infinite loop */
> +        sleep(1);
>       }
> 
> This also required various heuristics in oxenstored to differentiate
> between a new live-update command and querying the status of an already
> completed live-update command, otherwise I almost always ended up doing
> 2 live-updates in a row (first one queued up, returned BUSY, completed
> after a while, and then another one from all the repeated live-update
> requests).
> 
> I'd prefer it if there was a more asynchronous protocol available for
> live-update:
> * the live-update on its own queues up the live update request and
> returns a generation ID. xenstored retries on its own during each of
> its main loop iterations whether the conditions for live-update are now
> met
> * when a generation ID is specified with the live-update command it
> acts as a query to return the status. A query for generation ID <
> current ID return success, and for generation ID = current ID it
> returns either BUSY, or a specific error if known (e.g. timeout reached
> and list of domains preventing live update)
> * the generation ID gets incremented every time a live update succeeds
>   
> This would eliminiate the need for a client-side timeout, since each of
> these commands would be non-blocking.
> It'd also avoid the busy-poll flood.
> 
> Obviously any change here has to be backwards compatible with the
> already deployed xenstored and oxenstored which doesn't know about
> generation IDs, but you can tell them apart based on the reply: if you
> get back OK/BUSY/some error it is the old version, if you get back a
> generation ID it is the new version.
> 
> I ran out of time to implement this before the embargo was up, should I
> go ahead with prototyping a patch for this now, or do you see an
> alternative way to make the live-update command more robust?

I think this can be made much easier:

The live-update should be handled completely in the daemon, so
returning only with success or failure. Returning BUSY wouldn't
occur this way, but either "OK" (after the successful LU) or a
failure reason (e.g. list of domains blocking) would be
returned.

I'll have a try with this approach.


Juergen
Edwin Török Jan. 13, 2021, 5:42 p.m. UTC | #3
On Wed, 2021-01-13 at 17:34 +0100, Jürgen Groß wrote:
> On 06.01.21 15:42, Edwin Torok wrote:
> > [...]
> > 
> > I'd prefer it if there was a more asynchronous protocol available
> > for
> > live-update:
> > * the live-update on its own queues up the live update request and
> > returns a generation ID. xenstored retries on its own during each
> > of
> > its main loop iterations whether the conditions for live-update are
> > now
> > met
> > * when a generation ID is specified with the live-update command it
> > acts as a query to return the status. A query for generation ID <
> > current ID return success, and for generation ID = current ID it
> > returns either BUSY, or a specific error if known (e.g. timeout
> > reached
> > and list of domains preventing live update)
> > * the generation ID gets incremented every time a live update
> > succeeds
> >   
> > This would eliminiate the need for a client-side timeout, since
> > each of
> > these commands would be non-blocking.
> > It'd also avoid the busy-poll flood.
> > 
> > Obviously any change here has to be backwards compatible with the
> > already deployed xenstored and oxenstored which doesn't know about
> > generation IDs, but you can tell them apart based on the reply: if
> > you
> > get back OK/BUSY/some error it is the old version, if you get back
> > a
> > generation ID it is the new version.
> > 
> > I ran out of time to implement this before the embargo was up,
> > should I
> > go ahead with prototyping a patch for this now, or do you see an
> > alternative way to make the live-update command more robust?
> 
> I think this can be made much easier:
> 
> The live-update should be handled completely in the daemon, so
> returning only with success or failure. Returning BUSY wouldn't
> occur this way, but either "OK" (after the successful LU) or a
> failure reason (e.g. list of domains blocking) would be
> returned.
> 
> I'll have a try with this approach.
> 
> 

In oxenstored each request wants an immediate reply, so delaying the
reply to after the live-update is not trivial.
Should be possible to do though (e.g. mark the socket such that no
further xenstore protocol is processed on it, and "put back" the live
update request, to be replied by the new xenstored after the live
update completes, which would clear the 'do not process this socket'
flag), I'll be curious to see what your approach will look like.

Best regards,
--Edwin
diff mbox series

Patch

diff --git a/docs/misc/xenstore.txt b/docs/misc/xenstore.txt
index 2081f20f55..1480742330 100644
--- a/docs/misc/xenstore.txt
+++ b/docs/misc/xenstore.txt
@@ -317,6 +317,27 @@  CONTROL			<command>|[<parameters>|]
 	Current commands are:
 	check
 		checks xenstored innards
+	live-update|<params>|+
+		perform a live-update of the Xenstore daemon, only to
+		be used via xenstore-control command.
+		<params> are implementation specific and are used for
+		different steps of the live-update processing. Currently
+		supported <params> are:
+		-f <file>  specify new daemon binary
+		-b <size>  specify size of new stubdom binary
+		-d <chunk-size> <binary-chunk>  transfer chunk of new
+			stubdom binary
+		-c <pars>  specify new command line to use
+		-s [-t <sec>] [-F]  start live update process (-t specifies
+			timeout in seconds to wait for active transactions
+			to finish, default is 60 seconds; -F will force
+			live update to happen even with running transactions
+			after timeout elapsed)
+		-a  abort live update handling
+		All sub-options will return "OK" in case of success or an
+		error string in case of failure. -s can return "BUSY" in case
+		of an active transaction, a retry of -s can be done in that
+		case.
 	log|on
 		turn xenstore logging on
 	log|off
diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile
index 9a0f0d012d..ab89e22d3a 100644
--- a/tools/xenstore/Makefile
+++ b/tools/xenstore/Makefile
@@ -11,6 +11,7 @@  CFLAGS += -include $(XEN_ROOT)/tools/config.h
 CFLAGS += -I./include
 CFLAGS += $(CFLAGS_libxenevtchn)
 CFLAGS += $(CFLAGS_libxenctrl)
+CFLAGS += $(CFLAGS_libxenguest)
 CFLAGS += $(CFLAGS_libxentoolcore)
 CFLAGS += -DXEN_LIB_STORED="\"$(XEN_LIB_STORED)\""
 CFLAGS += -DXEN_RUN_STORED="\"$(XEN_RUN_STORED)\""
@@ -81,7 +82,7 @@  xenstore: xenstore_client.o
 	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
 
 xenstore-control: xenstore_control.o
-	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
+	$(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS)
 
 xs_tdb_dump: xs_tdb_dump.o utils.o tdb.o talloc.o
 	$(CC) $^ $(LDFLAGS) -o $@ $(APPEND_LDFLAGS)
diff --git a/tools/xenstore/xenstore_control.c b/tools/xenstore/xenstore_control.c
index afa04495a7..5ca015a07d 100644
--- a/tools/xenstore/xenstore_control.c
+++ b/tools/xenstore/xenstore_control.c
@@ -1,9 +1,311 @@ 
+#define _GNU_SOURCE
+#include <stdbool.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
+#include <time.h>
+#include <xenctrl.h>
+#include <xenguest.h>
 
 #include "xenstore.h"
 
+/* Add a string plus terminating 0 byte to buf, returning new len. */
+static int add_to_buf(char **buf, const char *val, int len)
+{
+    int vallen = strlen(val) + 1;
+
+    if (len < 0)
+        return -1;
+
+    *buf = realloc(*buf, len + vallen);
+    if (!*buf)
+        return -1;
+
+    strcpy(*buf + len, val);
+
+    return len + vallen;
+}
+
+static int live_update_start(struct xs_handle *xsh, bool force, unsigned int to)
+{
+    int len = 0;
+    char *buf = NULL, *ret;
+    time_t time_start;
+
+    if (asprintf(&ret, "%u", to) < 0)
+        return 1;
+    len = add_to_buf(&buf, "-s", len);
+    len = add_to_buf(&buf, "-t", len);
+    len = add_to_buf(&buf, ret, len);
+    free(ret);
+    if (force)
+        len = add_to_buf(&buf, "-F", len);
+    if (len < 0)
+        return 1;
+
+    for (time_start = time(NULL); time(NULL) - time_start < to;) {
+        ret = xs_control_command(xsh, "live-update", buf, len);
+        if (!ret)
+            goto err;
+        if (strcmp(ret, "BUSY"))
+            break;
+    }
+
+    if (strcmp(ret, "OK"))
+        goto err;
+
+    free(buf);
+    free(ret);
+
+    return 0;
+
+ err:
+    fprintf(stderr, "Starting live update failed:\n%s\n",
+            ret ? : strerror(errno));
+    free(buf);
+    free(ret);
+
+    return 3;
+}
+
+static int live_update_cmdline(struct xs_handle *xsh, const char *cmdline)
+{
+    int len = 0, rc = 0;
+    char *buf = NULL, *ret;
+
+    len = add_to_buf(&buf, "-c", len);
+    len = add_to_buf(&buf, cmdline, len);
+    if (len < 0)
+        return 1;
+
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Setting update binary failed:\n%s\n",
+                ret ? : strerror(errno));
+        rc = 3;
+    }
+    free(ret);
+
+    return rc;
+}
+
+static int send_kernel_blob(struct xs_handle *xsh, const char *binary)
+{
+    int rc = 0, len = 0;
+    xc_interface *xch;
+    struct xc_dom_image *dom;
+    char *ret, *buf = NULL;
+    size_t off, sz;
+#define BLOB_CHUNK_SZ 2048
+
+    xch = xc_interface_open(NULL, NULL, 0);
+    if (!xch) {
+        fprintf(stderr, "xc_interface_open() failed\n");
+        return 1;
+    }
+
+    dom = xc_dom_allocate(xch, NULL, NULL);
+    if (!dom) {
+        rc = 1;
+        goto out_close;
+    }
+
+    rc = xc_dom_kernel_file(dom, binary);
+    if (rc) {
+        rc = 1;
+        goto out_rel;
+    }
+
+    if (asprintf(&ret, "%zu", dom->kernel_size) < 0) {
+        rc = 1;
+        goto out_rel;
+    }
+    len = add_to_buf(&buf, "-b", len);
+    len = add_to_buf(&buf, ret, len);
+    free(ret);
+    if (len < 0) {
+        rc = 1;
+        goto out_rel;
+    }
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Starting live update failed:\n%s\n",
+                ret ? : strerror(errno));
+        rc = 3;
+    }
+    free(ret);
+    if (rc)
+        goto out_rel;
+
+    /* buf capable to hold "-d" <1..2048> BLOB_CHUNK_SZ and a terminating 0. */
+    buf = malloc(3 + 5 + BLOB_CHUNK_SZ + 1);
+    if (!buf) {
+        rc = 1;
+        goto out_rel;
+    }
+
+    strcpy(buf, "-d");
+    sz = BLOB_CHUNK_SZ;
+    for (off = 0; off < dom->kernel_size; off += BLOB_CHUNK_SZ) {
+        if (dom->kernel_size - off < BLOB_CHUNK_SZ)
+            sz = dom->kernel_size - off;
+        sprintf(buf + 3, "%zu", sz);
+        len = 3 + strlen(buf + 3) + 1;
+        memcpy(buf + len, dom->kernel_blob + off, sz);
+        buf[len + sz] = 0;
+        len += sz + 1;
+        ret = xs_control_command(xsh, "live-update", buf, len);
+        if (!ret || strcmp(ret, "OK")) {
+            fprintf(stderr, "Transfer of new binary failed:\n%s\n",
+                    ret ? : strerror(errno));
+            rc = 3;
+            free(ret);
+            break;
+        }
+        free(ret);
+    }
+
+    free(buf);
+
+ out_rel:
+    xc_dom_release(dom);
+
+ out_close:
+    xc_interface_close(xch);
+
+    return rc;
+}
+
+/*
+ * Live update of Xenstore stubdom
+ *
+ * Sequence of actions:
+ * 1. transfer new stubdom binary
+ *    a) specify size
+ *    b) transfer unpacked binary in chunks
+ * 2. transfer new cmdline (optional)
+ * 3. start update (includes flags)
+ */
+static int live_update_stubdom(struct xs_handle *xsh, const char *binary,
+                               const char *cmdline, bool force, unsigned int to)
+{
+    int rc;
+
+    rc = send_kernel_blob(xsh, binary);
+    if (rc)
+        goto abort;
+
+    if (cmdline) {
+        rc = live_update_cmdline(xsh, cmdline);
+        if (rc)
+            goto abort;
+    }
+
+    rc = live_update_start(xsh, force, to);
+    if (rc)
+        goto abort;
+
+    return 0;
+
+ abort:
+    xs_control_command(xsh, "live-update", "-a", 3);
+    return rc;
+}
+
+/*
+ * Live update of Xenstore daemon
+ *
+ * Sequence of actions:
+ * 1. transfer new binary filename
+ * 2. transfer new cmdline (optional)
+ * 3. start update (includes flags)
+ */
+static int live_update_daemon(struct xs_handle *xsh, const char *binary,
+                              const char *cmdline, bool force, unsigned int to)
+{
+    int len = 0, rc;
+    char *buf = NULL, *ret;
+
+    len = add_to_buf(&buf, "-f", len);
+    len = add_to_buf(&buf, binary, len);
+    if (len < 0)
+        return 1;
+    ret = xs_control_command(xsh, "live-update", buf, len);
+    free(buf);
+    if (!ret || strcmp(ret, "OK")) {
+        fprintf(stderr, "Setting update binary failed:\n%s\n",
+                ret ? : strerror(errno));
+        free(ret);
+        return 3;
+    }
+    free(ret);
+
+    if (cmdline) {
+        rc = live_update_cmdline(xsh, cmdline);
+        if (rc)
+            goto abort;
+    }
+
+    rc = live_update_start(xsh, force, to);
+    if (rc)
+        goto abort;
+
+    return 0;
+
+ abort:
+    xs_control_command(xsh, "live-update", "-a", 3);
+    return rc;
+}
+
+static int live_update(struct xs_handle *xsh, int argc, char **argv)
+{
+    int rc = 0;
+    unsigned int i, to = 60;
+    char *binary = NULL, *cmdline = NULL, *val;
+    bool force = false;
+
+    for (i = 0; i < argc; i++) {
+        if (!strcmp(argv[i], "-c")) {
+            i++;
+            if (i == argc) {
+                fprintf(stderr, "Missing command line value\n");
+                rc = 2;
+                goto out;
+            }
+            cmdline = argv[i];
+        } else if (!strcmp(argv[i], "-t")) {
+            i++;
+            if (i == argc) {
+                fprintf(stderr, "Missing timeout value\n");
+                rc = 2;
+                goto out;
+            }
+            to = atoi(argv[i]);
+        } else if (!strcmp(argv[i], "-F"))
+            force = true;
+        else
+            binary = argv[i];
+    }
+
+    if (!binary) {
+        fprintf(stderr, "Missing binary specification\n");
+        rc = 2;
+        goto out;
+    }
+
+    val = xs_read(xsh, XBT_NULL, "/tool/xenstored/domid", &i);
+    if (val)
+        rc = live_update_stubdom(xsh, binary, cmdline, force, to);
+    else
+        rc = live_update_daemon(xsh, binary, cmdline, force, to);
+
+    free(val);
+
+ out:
+    return rc;
+}
 
 int main(int argc, char **argv)
 {
@@ -20,22 +322,6 @@  int main(int argc, char **argv)
         goto out;
     }
 
-    for (p = 2; p < argc; p++)
-        len += strlen(argv[p]) + 1;
-    if (len) {
-        par = malloc(len);
-        if (!par) {
-            fprintf(stderr, "Allocation error.\n");
-            rc = 1;
-            goto out;
-        }
-        len = 0;
-        for (p = 2; p < argc; p++) {
-            memcpy(par + len, argv[p], strlen(argv[p]) + 1);
-            len += strlen(argv[p]) + 1;
-        }
-    }
-
     xsh = xs_open(0);
     if (xsh == NULL) {
         fprintf(stderr, "Failed to contact Xenstored.\n");
@@ -43,6 +329,19 @@  int main(int argc, char **argv)
         goto out;
     }
 
+    if (!strcmp(argv[1], "live-update")) {
+        rc = live_update(xsh, argc - 2, argv + 2);
+        goto out_close;
+    }
+
+    for (p = 2; p < argc; p++)
+        len = add_to_buf(&par, argv[p], len);
+    if (len < 0) {
+        fprintf(stderr, "Allocation error.\n");
+        rc = 1;
+        goto out_close;
+    }
+
     ret = xs_control_command(xsh, argv[1], par, len);
     if (!ret) {
         rc = 3;
@@ -59,6 +358,7 @@  int main(int argc, char **argv)
     } else if (strlen(ret) > 0)
         printf("%s\n", ret);
 
+ out_close:
     xs_close(xsh);
 
  out:
diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c
index 8d29db8270..00fda5acdb 100644
--- a/tools/xenstore/xenstored_control.c
+++ b/tools/xenstore/xenstored_control.c
@@ -149,11 +149,41 @@  static int do_control_print(void *ctx, struct connection *conn,
 	return 0;
 }
 
+static int do_control_lu(void *ctx, struct connection *conn,
+			 char **vec, int num)
+{
+	const char *resp;
+
+	resp = talloc_strdup(ctx, "NYI");
+	send_reply(conn, XS_CONTROL, resp, strlen(resp) + 1);
+	return 0;
+}
+
 static int do_control_help(void *, struct connection *, char **, int);
 
 static struct cmd_s cmds[] = {
 	{ "check", do_control_check, "" },
 	{ "log", do_control_log, "on|off" },
+
+	/*
+	 * The parameters are those of the xenstore-control utility!
+	 * Depending on environment (Mini-OS or daemon) the live-update
+	 * sequence is split into several sub-operations:
+	 * 1. Specification of new binary
+	 *    daemon:  -f <filename>
+	 *    Mini-OS: -b <binary-size>
+	 *             -d <size> <data-bytes> (multiple of those)
+	 * 2. New command-line (optional): -c <cmdline>
+	 * 3. Start of update: -s [-F] [-t <timeout>]
+	 * Any sub-operation needs to respond with the string "OK" in case
+	 * of success, any other response indicates failure.
+	 * A started live-update sequence can be aborted via "-a" (not
+	 * needed in case of failure for the first or last live-update
+	 * sub-operation).
+	 */
+	{ "live-update", do_control_lu,
+		"[-c <cmdline>] [-F] [-t <timeout>] <file>\n"
+		"    Default timeout is 60 seconds.", 4 },
 #ifdef __MINIOS__
 	{ "memreport", do_control_memreport, "" },
 #else