Message ID | 20201215163603.21700-7-jgross@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | tools/xenstore: support live update for xenstored | expand |
On Tue, 2020-12-15 at 17:35 +0100, Juergen Gross wrote: > [...] > +static int live_update_start(struct xs_handle *xsh, bool force, > unsigned int to) > +{ > + int len = 0; > + char *buf = NULL, *ret; > + time_t time_start; > + > + if (asprintf(&ret, "%u", to) < 0) > + return 1; > + len = add_to_buf(&buf, "-s", len); > + len = add_to_buf(&buf, "-t", len); > + len = add_to_buf(&buf, ret, len); > + free(ret); > + if (force) > + len = add_to_buf(&buf, "-F", len); > + if (len < 0) > + return 1; > + > + for (time_start = time(NULL); time(NULL) - time_start < to;) { > + ret = xs_control_command(xsh, "live-update", buf, len); > + if (!ret) > + goto err; > + if (strcmp(ret, "BUSY")) > + break; > + } We've discovered issues with this during testing: when a live-update is not possible (e.g. guest with active transaction held open on purpose) xenstored gets flooded with live-update requests until the timeout is reached. This is problematic for multiple reasons: * flooding xenstored reduces its throughput, and makes it use 100% CPU. Depending on the implementation and configuration it may also flood the logs (which isn't fatal, the system still stays alive if you ENOSPC on /var/log, but it makes it difficult to debug issues if a live update gets you to ENOSPC as you may not see actual failures from after the live update). * flooding xenstored causes the evtchn to overflow and AFAICT neither xenstored or oxenstored is capable of coping with that (oxenstored infinite loops when that happens). IIUC this needs to be fixed in the kernel, so it doesn't return EFBIG in the first place. As a workaround I added a sleep(1) in this loop * the timeout is hit on both client and server sides, but depending on rounding almost always happens on the client side first which prevents us from displaying a more informative error message from the server. As a workaround I increased the client side timeout by 2s to cope with rounding and give the server a chance to reply. Oxenstored can reply with a list of domains preventing the live-update for example. My workarounds looked like this: @@ -42,6 +43,9 @@ static int live_update_start(struct xs_handle *xsh, bool force, unsigned int to) len = add_to_buf(&buf, "-F", len); if (len < 0) return 1; + /* +1 for rounding issues + * +1 to give oxenstored a chance to timeout and report back first */ + to += 2; for (time_start = time(NULL); time(NULL) - time_start < to;) { ret = xs_control_command(xsh, "live-update", buf, len); @@ -49,6 +53,12 @@ static int live_update_start(struct xs_handle *xsh, bool force, unsigned int to) goto err; if (strcmp(ret, "BUSY")) break; + /* TODO: use task ID for commands, avoid busy loop polling here + * oxenstored checks BUSY condition internally on every main loop iteration anyway. + * Avoid flooding xenstored with live-update requests. + * The flooding can also cause the evtchn to overflow in xenstored which makes + * xenstored enter an infinite loop */ + sleep(1); } This also required various heuristics in oxenstored to differentiate between a new live-update command and querying the status of an already completed live-update command, otherwise I almost always ended up doing 2 live-updates in a row (first one queued up, returned BUSY, completed after a while, and then another one from all the repeated live-update requests). I'd prefer it if there was a more asynchronous protocol available for live-update: * the live-update on its own queues up the live update request and returns a generation ID. xenstored retries on its own during each of its main loop iterations whether the conditions for live-update are now met * when a generation ID is specified with the live-update command it acts as a query to return the status. A query for generation ID < current ID return success, and for generation ID = current ID it returns either BUSY, or a specific error if known (e.g. timeout reached and list of domains preventing live update) * the generation ID gets incremented every time a live update succeeds This would eliminiate the need for a client-side timeout, since each of these commands would be non-blocking. It'd also avoid the busy-poll flood. Obviously any change here has to be backwards compatible with the already deployed xenstored and oxenstored which doesn't know about generation IDs, but you can tell them apart based on the reply: if you get back OK/BUSY/some error it is the old version, if you get back a generation ID it is the new version. I ran out of time to implement this before the embargo was up, should I go ahead with prototyping a patch for this now, or do you see an alternative way to make the live-update command more robust? Best regards, --Edwin
On 06.01.21 15:42, Edwin Torok wrote: > On Tue, 2020-12-15 at 17:35 +0100, Juergen Gross wrote: > >> [...] >> +static int live_update_start(struct xs_handle *xsh, bool force, >> unsigned int to) >> +{ >> + int len = 0; >> + char *buf = NULL, *ret; >> + time_t time_start; >> + >> + if (asprintf(&ret, "%u", to) < 0) >> + return 1; >> + len = add_to_buf(&buf, "-s", len); >> + len = add_to_buf(&buf, "-t", len); >> + len = add_to_buf(&buf, ret, len); >> + free(ret); >> + if (force) >> + len = add_to_buf(&buf, "-F", len); >> + if (len < 0) >> + return 1; >> + >> + for (time_start = time(NULL); time(NULL) - time_start < to;) { >> + ret = xs_control_command(xsh, "live-update", buf, len); >> + if (!ret) >> + goto err; >> + if (strcmp(ret, "BUSY")) >> + break; >> + } > > We've discovered issues with this during testing: when a live-update is > not possible (e.g. guest with active transaction held open on purpose) > xenstored gets flooded with live-update requests until the timeout is > reached. > > This is problematic for multiple reasons: > * flooding xenstored reduces its throughput, and makes it use 100% CPU. > Depending on the implementation and configuration it may also flood the > logs (which isn't fatal, the system still stays alive if you ENOSPC on > /var/log, but it makes it difficult to debug issues if a live update > gets you to ENOSPC as you may not see actual failures from after the > live update). > * flooding xenstored causes the evtchn to overflow and AFAICT neither > xenstored or oxenstored is capable of coping with that (oxenstored > infinite loops when that happens). IIUC this needs to be fixed in the > kernel, so it doesn't return EFBIG in the first place. As a workaround > I added a sleep(1) in this loop > * the timeout is hit on both client and server sides, but depending on > rounding almost always happens on the client side first which prevents > us from displaying a more informative error message from the server. As > a workaround I increased the client side timeout by 2s to cope with > rounding and give the server a chance to reply. Oxenstored can reply > with a list of domains preventing the live-update for example. > > My workarounds looked like this: > @@ -42,6 +43,9 @@ static int live_update_start(struct xs_handle *xsh, > bool force, unsigned int to) > len = add_to_buf(&buf, "-F", len); > if (len < 0) > return 1; > + /* +1 for rounding issues > + * +1 to give oxenstored a chance to timeout and report back first > */ > + to += 2; > > for (time_start = time(NULL); time(NULL) - time_start < to;) { > ret = xs_control_command(xsh, "live-update", buf, len); > @@ -49,6 +53,12 @@ static int live_update_start(struct xs_handle *xsh, > bool force, unsigned int to) > goto err; > if (strcmp(ret, "BUSY")) > break; > + /* TODO: use task ID for commands, avoid busy loop polling > here > + * oxenstored checks BUSY condition internally on every main > loop iteration anyway. > + * Avoid flooding xenstored with live-update requests. > + * The flooding can also cause the evtchn to overflow in > xenstored which makes > + * xenstored enter an infinite loop */ > + sleep(1); > } > > This also required various heuristics in oxenstored to differentiate > between a new live-update command and querying the status of an already > completed live-update command, otherwise I almost always ended up doing > 2 live-updates in a row (first one queued up, returned BUSY, completed > after a while, and then another one from all the repeated live-update > requests). > > I'd prefer it if there was a more asynchronous protocol available for > live-update: > * the live-update on its own queues up the live update request and > returns a generation ID. xenstored retries on its own during each of > its main loop iterations whether the conditions for live-update are now > met > * when a generation ID is specified with the live-update command it > acts as a query to return the status. A query for generation ID < > current ID return success, and for generation ID = current ID it > returns either BUSY, or a specific error if known (e.g. timeout reached > and list of domains preventing live update) > * the generation ID gets incremented every time a live update succeeds > > This would eliminiate the need for a client-side timeout, since each of > these commands would be non-blocking. > It'd also avoid the busy-poll flood. > > Obviously any change here has to be backwards compatible with the > already deployed xenstored and oxenstored which doesn't know about > generation IDs, but you can tell them apart based on the reply: if you > get back OK/BUSY/some error it is the old version, if you get back a > generation ID it is the new version. > > I ran out of time to implement this before the embargo was up, should I > go ahead with prototyping a patch for this now, or do you see an > alternative way to make the live-update command more robust? I think this can be made much easier: The live-update should be handled completely in the daemon, so returning only with success or failure. Returning BUSY wouldn't occur this way, but either "OK" (after the successful LU) or a failure reason (e.g. list of domains blocking) would be returned. I'll have a try with this approach. Juergen
On Wed, 2021-01-13 at 17:34 +0100, Jürgen Groß wrote: > On 06.01.21 15:42, Edwin Torok wrote: > > [...] > > > > I'd prefer it if there was a more asynchronous protocol available > > for > > live-update: > > * the live-update on its own queues up the live update request and > > returns a generation ID. xenstored retries on its own during each > > of > > its main loop iterations whether the conditions for live-update are > > now > > met > > * when a generation ID is specified with the live-update command it > > acts as a query to return the status. A query for generation ID < > > current ID return success, and for generation ID = current ID it > > returns either BUSY, or a specific error if known (e.g. timeout > > reached > > and list of domains preventing live update) > > * the generation ID gets incremented every time a live update > > succeeds > > > > This would eliminiate the need for a client-side timeout, since > > each of > > these commands would be non-blocking. > > It'd also avoid the busy-poll flood. > > > > Obviously any change here has to be backwards compatible with the > > already deployed xenstored and oxenstored which doesn't know about > > generation IDs, but you can tell them apart based on the reply: if > > you > > get back OK/BUSY/some error it is the old version, if you get back > > a > > generation ID it is the new version. > > > > I ran out of time to implement this before the embargo was up, > > should I > > go ahead with prototyping a patch for this now, or do you see an > > alternative way to make the live-update command more robust? > > I think this can be made much easier: > > The live-update should be handled completely in the daemon, so > returning only with success or failure. Returning BUSY wouldn't > occur this way, but either "OK" (after the successful LU) or a > failure reason (e.g. list of domains blocking) would be > returned. > > I'll have a try with this approach. > > In oxenstored each request wants an immediate reply, so delaying the reply to after the live-update is not trivial. Should be possible to do though (e.g. mark the socket such that no further xenstore protocol is processed on it, and "put back" the live update request, to be replied by the new xenstored after the live update completes, which would clear the 'do not process this socket' flag), I'll be curious to see what your approach will look like. Best regards, --Edwin
diff --git a/docs/misc/xenstore.txt b/docs/misc/xenstore.txt index 2081f20f55..1480742330 100644 --- a/docs/misc/xenstore.txt +++ b/docs/misc/xenstore.txt @@ -317,6 +317,27 @@ CONTROL <command>|[<parameters>|] Current commands are: check checks xenstored innards + live-update|<params>|+ + perform a live-update of the Xenstore daemon, only to + be used via xenstore-control command. + <params> are implementation specific and are used for + different steps of the live-update processing. Currently + supported <params> are: + -f <file> specify new daemon binary + -b <size> specify size of new stubdom binary + -d <chunk-size> <binary-chunk> transfer chunk of new + stubdom binary + -c <pars> specify new command line to use + -s [-t <sec>] [-F] start live update process (-t specifies + timeout in seconds to wait for active transactions + to finish, default is 60 seconds; -F will force + live update to happen even with running transactions + after timeout elapsed) + -a abort live update handling + All sub-options will return "OK" in case of success or an + error string in case of failure. -s can return "BUSY" in case + of an active transaction, a retry of -s can be done in that + case. log|on turn xenstore logging on log|off diff --git a/tools/xenstore/Makefile b/tools/xenstore/Makefile index 9a0f0d012d..ab89e22d3a 100644 --- a/tools/xenstore/Makefile +++ b/tools/xenstore/Makefile @@ -11,6 +11,7 @@ CFLAGS += -include $(XEN_ROOT)/tools/config.h CFLAGS += -I./include CFLAGS += $(CFLAGS_libxenevtchn) CFLAGS += $(CFLAGS_libxenctrl) +CFLAGS += $(CFLAGS_libxenguest) CFLAGS += $(CFLAGS_libxentoolcore) CFLAGS += -DXEN_LIB_STORED="\"$(XEN_LIB_STORED)\"" CFLAGS += -DXEN_RUN_STORED="\"$(XEN_RUN_STORED)\"" @@ -81,7 +82,7 @@ xenstore: xenstore_client.o $(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS) xenstore-control: xenstore_control.o - $(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS) + $(CC) $< $(LDFLAGS) $(LDLIBS_libxenstore) $(LDLIBS_libxenctrl) $(LDLIBS_libxenguest) $(LDLIBS_libxentoolcore) $(SOCKET_LIBS) -o $@ $(APPEND_LDFLAGS) xs_tdb_dump: xs_tdb_dump.o utils.o tdb.o talloc.o $(CC) $^ $(LDFLAGS) -o $@ $(APPEND_LDFLAGS) diff --git a/tools/xenstore/xenstore_control.c b/tools/xenstore/xenstore_control.c index afa04495a7..5ca015a07d 100644 --- a/tools/xenstore/xenstore_control.c +++ b/tools/xenstore/xenstore_control.c @@ -1,9 +1,311 @@ +#define _GNU_SOURCE +#include <stdbool.h> #include <stdio.h> #include <stdlib.h> #include <string.h> +#include <time.h> +#include <xenctrl.h> +#include <xenguest.h> #include "xenstore.h" +/* Add a string plus terminating 0 byte to buf, returning new len. */ +static int add_to_buf(char **buf, const char *val, int len) +{ + int vallen = strlen(val) + 1; + + if (len < 0) + return -1; + + *buf = realloc(*buf, len + vallen); + if (!*buf) + return -1; + + strcpy(*buf + len, val); + + return len + vallen; +} + +static int live_update_start(struct xs_handle *xsh, bool force, unsigned int to) +{ + int len = 0; + char *buf = NULL, *ret; + time_t time_start; + + if (asprintf(&ret, "%u", to) < 0) + return 1; + len = add_to_buf(&buf, "-s", len); + len = add_to_buf(&buf, "-t", len); + len = add_to_buf(&buf, ret, len); + free(ret); + if (force) + len = add_to_buf(&buf, "-F", len); + if (len < 0) + return 1; + + for (time_start = time(NULL); time(NULL) - time_start < to;) { + ret = xs_control_command(xsh, "live-update", buf, len); + if (!ret) + goto err; + if (strcmp(ret, "BUSY")) + break; + } + + if (strcmp(ret, "OK")) + goto err; + + free(buf); + free(ret); + + return 0; + + err: + fprintf(stderr, "Starting live update failed:\n%s\n", + ret ? : strerror(errno)); + free(buf); + free(ret); + + return 3; +} + +static int live_update_cmdline(struct xs_handle *xsh, const char *cmdline) +{ + int len = 0, rc = 0; + char *buf = NULL, *ret; + + len = add_to_buf(&buf, "-c", len); + len = add_to_buf(&buf, cmdline, len); + if (len < 0) + return 1; + + ret = xs_control_command(xsh, "live-update", buf, len); + free(buf); + if (!ret || strcmp(ret, "OK")) { + fprintf(stderr, "Setting update binary failed:\n%s\n", + ret ? : strerror(errno)); + rc = 3; + } + free(ret); + + return rc; +} + +static int send_kernel_blob(struct xs_handle *xsh, const char *binary) +{ + int rc = 0, len = 0; + xc_interface *xch; + struct xc_dom_image *dom; + char *ret, *buf = NULL; + size_t off, sz; +#define BLOB_CHUNK_SZ 2048 + + xch = xc_interface_open(NULL, NULL, 0); + if (!xch) { + fprintf(stderr, "xc_interface_open() failed\n"); + return 1; + } + + dom = xc_dom_allocate(xch, NULL, NULL); + if (!dom) { + rc = 1; + goto out_close; + } + + rc = xc_dom_kernel_file(dom, binary); + if (rc) { + rc = 1; + goto out_rel; + } + + if (asprintf(&ret, "%zu", dom->kernel_size) < 0) { + rc = 1; + goto out_rel; + } + len = add_to_buf(&buf, "-b", len); + len = add_to_buf(&buf, ret, len); + free(ret); + if (len < 0) { + rc = 1; + goto out_rel; + } + ret = xs_control_command(xsh, "live-update", buf, len); + free(buf); + if (!ret || strcmp(ret, "OK")) { + fprintf(stderr, "Starting live update failed:\n%s\n", + ret ? : strerror(errno)); + rc = 3; + } + free(ret); + if (rc) + goto out_rel; + + /* buf capable to hold "-d" <1..2048> BLOB_CHUNK_SZ and a terminating 0. */ + buf = malloc(3 + 5 + BLOB_CHUNK_SZ + 1); + if (!buf) { + rc = 1; + goto out_rel; + } + + strcpy(buf, "-d"); + sz = BLOB_CHUNK_SZ; + for (off = 0; off < dom->kernel_size; off += BLOB_CHUNK_SZ) { + if (dom->kernel_size - off < BLOB_CHUNK_SZ) + sz = dom->kernel_size - off; + sprintf(buf + 3, "%zu", sz); + len = 3 + strlen(buf + 3) + 1; + memcpy(buf + len, dom->kernel_blob + off, sz); + buf[len + sz] = 0; + len += sz + 1; + ret = xs_control_command(xsh, "live-update", buf, len); + if (!ret || strcmp(ret, "OK")) { + fprintf(stderr, "Transfer of new binary failed:\n%s\n", + ret ? : strerror(errno)); + rc = 3; + free(ret); + break; + } + free(ret); + } + + free(buf); + + out_rel: + xc_dom_release(dom); + + out_close: + xc_interface_close(xch); + + return rc; +} + +/* + * Live update of Xenstore stubdom + * + * Sequence of actions: + * 1. transfer new stubdom binary + * a) specify size + * b) transfer unpacked binary in chunks + * 2. transfer new cmdline (optional) + * 3. start update (includes flags) + */ +static int live_update_stubdom(struct xs_handle *xsh, const char *binary, + const char *cmdline, bool force, unsigned int to) +{ + int rc; + + rc = send_kernel_blob(xsh, binary); + if (rc) + goto abort; + + if (cmdline) { + rc = live_update_cmdline(xsh, cmdline); + if (rc) + goto abort; + } + + rc = live_update_start(xsh, force, to); + if (rc) + goto abort; + + return 0; + + abort: + xs_control_command(xsh, "live-update", "-a", 3); + return rc; +} + +/* + * Live update of Xenstore daemon + * + * Sequence of actions: + * 1. transfer new binary filename + * 2. transfer new cmdline (optional) + * 3. start update (includes flags) + */ +static int live_update_daemon(struct xs_handle *xsh, const char *binary, + const char *cmdline, bool force, unsigned int to) +{ + int len = 0, rc; + char *buf = NULL, *ret; + + len = add_to_buf(&buf, "-f", len); + len = add_to_buf(&buf, binary, len); + if (len < 0) + return 1; + ret = xs_control_command(xsh, "live-update", buf, len); + free(buf); + if (!ret || strcmp(ret, "OK")) { + fprintf(stderr, "Setting update binary failed:\n%s\n", + ret ? : strerror(errno)); + free(ret); + return 3; + } + free(ret); + + if (cmdline) { + rc = live_update_cmdline(xsh, cmdline); + if (rc) + goto abort; + } + + rc = live_update_start(xsh, force, to); + if (rc) + goto abort; + + return 0; + + abort: + xs_control_command(xsh, "live-update", "-a", 3); + return rc; +} + +static int live_update(struct xs_handle *xsh, int argc, char **argv) +{ + int rc = 0; + unsigned int i, to = 60; + char *binary = NULL, *cmdline = NULL, *val; + bool force = false; + + for (i = 0; i < argc; i++) { + if (!strcmp(argv[i], "-c")) { + i++; + if (i == argc) { + fprintf(stderr, "Missing command line value\n"); + rc = 2; + goto out; + } + cmdline = argv[i]; + } else if (!strcmp(argv[i], "-t")) { + i++; + if (i == argc) { + fprintf(stderr, "Missing timeout value\n"); + rc = 2; + goto out; + } + to = atoi(argv[i]); + } else if (!strcmp(argv[i], "-F")) + force = true; + else + binary = argv[i]; + } + + if (!binary) { + fprintf(stderr, "Missing binary specification\n"); + rc = 2; + goto out; + } + + val = xs_read(xsh, XBT_NULL, "/tool/xenstored/domid", &i); + if (val) + rc = live_update_stubdom(xsh, binary, cmdline, force, to); + else + rc = live_update_daemon(xsh, binary, cmdline, force, to); + + free(val); + + out: + return rc; +} int main(int argc, char **argv) { @@ -20,22 +322,6 @@ int main(int argc, char **argv) goto out; } - for (p = 2; p < argc; p++) - len += strlen(argv[p]) + 1; - if (len) { - par = malloc(len); - if (!par) { - fprintf(stderr, "Allocation error.\n"); - rc = 1; - goto out; - } - len = 0; - for (p = 2; p < argc; p++) { - memcpy(par + len, argv[p], strlen(argv[p]) + 1); - len += strlen(argv[p]) + 1; - } - } - xsh = xs_open(0); if (xsh == NULL) { fprintf(stderr, "Failed to contact Xenstored.\n"); @@ -43,6 +329,19 @@ int main(int argc, char **argv) goto out; } + if (!strcmp(argv[1], "live-update")) { + rc = live_update(xsh, argc - 2, argv + 2); + goto out_close; + } + + for (p = 2; p < argc; p++) + len = add_to_buf(&par, argv[p], len); + if (len < 0) { + fprintf(stderr, "Allocation error.\n"); + rc = 1; + goto out_close; + } + ret = xs_control_command(xsh, argv[1], par, len); if (!ret) { rc = 3; @@ -59,6 +358,7 @@ int main(int argc, char **argv) } else if (strlen(ret) > 0) printf("%s\n", ret); + out_close: xs_close(xsh); out: diff --git a/tools/xenstore/xenstored_control.c b/tools/xenstore/xenstored_control.c index 8d29db8270..00fda5acdb 100644 --- a/tools/xenstore/xenstored_control.c +++ b/tools/xenstore/xenstored_control.c @@ -149,11 +149,41 @@ static int do_control_print(void *ctx, struct connection *conn, return 0; } +static int do_control_lu(void *ctx, struct connection *conn, + char **vec, int num) +{ + const char *resp; + + resp = talloc_strdup(ctx, "NYI"); + send_reply(conn, XS_CONTROL, resp, strlen(resp) + 1); + return 0; +} + static int do_control_help(void *, struct connection *, char **, int); static struct cmd_s cmds[] = { { "check", do_control_check, "" }, { "log", do_control_log, "on|off" }, + + /* + * The parameters are those of the xenstore-control utility! + * Depending on environment (Mini-OS or daemon) the live-update + * sequence is split into several sub-operations: + * 1. Specification of new binary + * daemon: -f <filename> + * Mini-OS: -b <binary-size> + * -d <size> <data-bytes> (multiple of those) + * 2. New command-line (optional): -c <cmdline> + * 3. Start of update: -s [-F] [-t <timeout>] + * Any sub-operation needs to respond with the string "OK" in case + * of success, any other response indicates failure. + * A started live-update sequence can be aborted via "-a" (not + * needed in case of failure for the first or last live-update + * sub-operation). + */ + { "live-update", do_control_lu, + "[-c <cmdline>] [-F] [-t <timeout>] <file>\n" + " Default timeout is 60 seconds.", 4 }, #ifdef __MINIOS__ { "memreport", do_control_memreport, "" }, #else