Message ID | 20191001101259.53162-1-roger.pau@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | libxl: wait for the ack when issuing power control requests | expand |
> On 1 Oct 2019, at 11:12, Roger Pau Monne <roger.pau@citrix.com> wrote: > > + libxl_asyncop_how *ao_how = aohow_val(async); > > caml_enter_blocking_section(); > - ret = libxl_domain_shutdown(CTX, c_domid); > + ret = libxl_domain_shutdown(CTX, c_domid, ao_how); > caml_leave_blocking_section(); > > + free(ao_how); > + Does this work when aohow_val returns NULL or does this needs to be checked? Otherwise This looks good to me. — C
On Tue, Oct 01, 2019 at 12:18:41PM +0200, Christian Lindig wrote: > > > > On 1 Oct 2019, at 11:12, Roger Pau Monne <roger.pau@citrix.com> wrote: > > > > + libxl_asyncop_how *ao_how = aohow_val(async); > > > > caml_enter_blocking_section(); > > - ret = libxl_domain_shutdown(CTX, c_domid); > > + ret = libxl_domain_shutdown(CTX, c_domid, ao_how); > > caml_leave_blocking_section(); > > > > + free(ao_how); > > + > > Does this work when aohow_val returns NULL or does this needs to be checked? Otherwise This looks good to me. Both libxl_domain_shutdown and free are perfectly fine to call with ao_how == NULL, but TBH I have no idea of ocaml, so I just copied what's done in stub_libxl_domain_destroy. Thanks, Roger.
On Tue, Oct 01, 2019 at 12:12:59PM +0200, Roger Pau Monne wrote: > Currently only suspend power control requests wait for an ack from the > domain, while power off or reboot requests simply write the command to > xenstore and exit. > > Introduce a 1 minute wait for the domain to acknowledge the request, or > else return an error. The suspend code is slightly modified to use the > new infrastructure added, but shouldn't have any functional change. > > Fix the ocaml bindings and also provide a backwards compatible > interface for the reboot and poweroff libxl API functions. > > Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > I believe applying this patch is not going to cause regressions in > osstest, albeit FreeBSD doesn't acknowledge poweroff/reboot requests > in the currently tested versions, it will shutdown in less than one > minute, and thus the toolstack won't complain because the control node > is going to be removed from xenstore. > --- > tools/libxl/libxl.h | 23 +++++++- > tools/libxl/libxl_dom_suspend.c | 11 ++-- > tools/libxl/libxl_domain.c | 81 ++++++++++++++++++++-------- > tools/libxl/libxl_internal.h | 7 ++- > tools/ocaml/libs/xl/xenlight_stubs.c | 18 ++++--- > tools/xl/xl_vmcontrol.c | 4 +- > 6 files changed, 102 insertions(+), 42 deletions(-) > > diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h > index f711cfc750..03ea5ca740 100644 > --- a/tools/libxl/libxl.h > +++ b/tools/libxl/libxl.h In libxl.h, can you update the comment of LIBXL_HAVE_FN_USING_QMP_ASYNC to add the _shutdown and _reboot? (The define name might not reflect the reality, but I think the comment describe what the define is for. It was introduced by edaa631ddce.) > diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c > index 0dd5b7ffa9..058f3c77ec 100644 > --- a/tools/libxl/libxl_domain.c > +++ b/tools/libxl/libxl_domain.c > @@ -763,49 +763,86 @@ char * libxl__domain_pvcontrol_read(libxl__gc *gc, xs_transaction_t t, > return libxl__xs_read(gc, t, shutdown_path); > } > > -int libxl__domain_pvcontrol_write(libxl__gc *gc, xs_transaction_t t, > - uint32_t domid, const char *cmd) > +int libxl__domain_pvcontrol(libxl__egc *egc, libxl__xswait_state *pvcontrol, > + domid_t domid, const char *cmd) > { > + STATE_AO_GC(pvcontrol->ao); > const char *shutdown_path; > + int rc; > + > + rc = libxl__domain_pvcontrol_available(gc, domid); > + if (rc < 0) > + return rc; > > shutdown_path = libxl__domain_pvcontrol_xspath(gc, domid); > if (!shutdown_path) > return ERROR_FAIL; > > - return libxl__xs_printf(gc, t, shutdown_path, "%s", cmd); > + rc = libxl__xs_printf(gc, XBT_NULL, shutdown_path, "%s", cmd); > + if (rc) > + return rc; > + > + pvcontrol->path = shutdown_path; > + pvcontrol->what = GCSPRINTF("guest acknowledgement of %s request", cmd); > + pvcontrol->timeout_ms = 60 * 1000; > + libxl__xswait_start(gc, pvcontrol); Shouldn't we watch shutdown_path before we write to it? Otherwise, we might never see the acknowledgement by the guest. But I don't know if xswait to a read the first time it is setup. Also, you need to check the return value of libxl__xswait_start. > + > + return 0; > } > > -static int libxl__domain_pvcontrol(libxl__gc *gc, uint32_t domid, > - const char *cmd) > +static bool pvcontrol_acked(const char *state) > { > - int ret; > + if (!state || !strcmp(state,"")) > + return true; > + > + return false; > +} > + > +static void pvcontrol_cb(libxl__egc *egc, libxl__xswait_state *xswa, int rc, > + const char *state) Can you move pvcontrol_cb after libxl_domain_shutdown and libxl_domain_reboot ? In general, the callback of a function should be after the function that set the callback. See "ASYNCHRONOUS/LONG-RUNNING OPERATIONS" for a more detail explanation. > +{ > + STATE_AO_GC(xswa->ao); > + > + if (!rc && !pvcontrol_acked(state)) > + return; > > - ret = libxl__domain_pvcontrol_available(gc, domid); > - if (ret < 0) > - return ret; > + libxl__xswait_stop(gc, xswa); > > - if (!ret) > - return ERROR_NOPARAVIRT; > + if (rc) > + LOG(ERROR, "guest didn't acknowledge control request: %d", rc); > > - return libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, cmd); > + libxl__ao_complete(egc, ao, rc); > } > > -int libxl_domain_shutdown(libxl_ctx *ctx, uint32_t domid) > + > +int libxl_domain_shutdown(libxl_ctx *ctx, uint32_t domid, > + const libxl_asyncop_how *ao_how) > { > - GC_INIT(ctx); > + AO_CREATE(ctx, domid, ao_how); > + libxl__xswait_state *pvcontrol; > int ret; > - ret = libxl__domain_pvcontrol(gc, domid, "poweroff"); > - GC_FREE; > - return ret; > + > + GCNEW(pvcontrol); > + pvcontrol->ao = ao; > + pvcontrol->callback = pvcontrol_cb; > + ret = libxl__domain_pvcontrol(egc, pvcontrol, domid, "poweroff"); libxl__domain_pvcontrol should return a libxl error (I haven't check), so it should be `rc' here, not ret. > + > + return ret ? AO_CREATE_FAIL(ret) : AO_INPROGRESS; > } > > diff --git a/tools/ocaml/libs/xl/xenlight_stubs.c b/tools/ocaml/libs/xl/xenlight_stubs.c > index 37b046df63..ff16b8710b 100644 > --- a/tools/ocaml/libs/xl/xenlight_stubs.c > +++ b/tools/ocaml/libs/xl/xenlight_stubs.c > @@ -551,32 +551,38 @@ value stub_libxl_domain_create_restore(value ctx, value domain_config, value par > CAMLreturn(Val_int(c_domid)); > } > > -value stub_libxl_domain_shutdown(value ctx, value domid) > +value stub_libxl_domain_shutdown(value ctx, value domid, value async, value unit) You also need to change the ocaml prototype ;-). They are declared in the *.ml and *.mli file. Thanks,
> On 1 Oct 2019, at 11:12, Roger Pau Monne <roger.pau@citrix.com> wrote: > > tools/ocaml/libs/xl/xenlight_stubs.c | 18 ++++--- Acked-by: Christian Lindig <christian.lindig@citrix.com>
On Tue, Oct 01, 2019 at 12:12:59PM +0200, Roger Pau Monne wrote: > I believe applying this patch is not going to cause regressions in > osstest, albeit FreeBSD doesn't acknowledge poweroff/reboot requests > in the currently tested versions, it will shutdown in less than one > minute, and thus the toolstack won't complain because the control node > is going to be removed from xenstore. Acknowledgement is described in the documentation of ~/control/shutdown (described in xenstore-paths.pandoc), so I guess it would be a bug in FreeBSD rather than a regression of the toolstack. Isn't it?
Anthony PERARD writes ("Re: [PATCH] libxl: wait for the ack when issuing power control requests"): > On Tue, Oct 01, 2019 at 12:12:59PM +0200, Roger Pau Monne wrote: > > I believe applying this patch is not going to cause regressions in > > osstest, albeit FreeBSD doesn't acknowledge poweroff/reboot requests > > in the currently tested versions, it will shutdown in less than one > > minute, and thus the toolstack won't complain because the control node > > is going to be removed from xenstore. > > Acknowledgement is described in the documentation of ~/control/shutdown > (described in xenstore-paths.pandoc), so I guess it would be a bug in > FreeBSD rather than a regression of the toolstack. Isn't it? Well, yes, but we try to avoid breaking old guests, even if they are buggy. Ian.
On Tue, Oct 01, 2019 at 12:44:38PM +0100, Anthony PERARD wrote: > On Tue, Oct 01, 2019 at 12:12:59PM +0200, Roger Pau Monne wrote: > > - return libxl__xs_printf(gc, t, shutdown_path, "%s", cmd); > > + rc = libxl__xs_printf(gc, XBT_NULL, shutdown_path, "%s", cmd); > > + if (rc) > > + return rc; > > + > > + pvcontrol->path = shutdown_path; > > + pvcontrol->what = GCSPRINTF("guest acknowledgement of %s request", cmd); > > + pvcontrol->timeout_ms = 60 * 1000; > > + libxl__xswait_start(gc, pvcontrol); > > Shouldn't we watch shutdown_path before we write to it? Otherwise, we > might never see the acknowledgement by the guest. > But I don't know if xswait to a read the first time it is setup. AFAIK you always get an event after the watch is registered. Thanks, Roger.
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h index f711cfc750..03ea5ca740 100644 --- a/tools/libxl/libxl.h +++ b/tools/libxl/libxl.h @@ -1628,8 +1628,27 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info, const libxl_asyncop_how *ao_how) LIBXL_EXTERNAL_CALLERS_ONLY; -int libxl_domain_shutdown(libxl_ctx *ctx, uint32_t domid); -int libxl_domain_reboot(libxl_ctx *ctx, uint32_t domid); +int libxl_domain_shutdown(libxl_ctx *ctx, uint32_t domid, + const libxl_asyncop_how *ao_how) + LIBXL_EXTERNAL_CALLERS_ONLY; +int libxl_domain_reboot(libxl_ctx *ctx, uint32_t domid, + const libxl_asyncop_how *ao_how) + LIBXL_EXTERNAL_CALLERS_ONLY; +#if defined(LIBXL_API_VERSION) && LIBXL_API_VERSION < 0x041300 +static inline int libxl_domain_shutdown_0x041200(libxl_ctx *ctx, + uint32_t domid) +{ + return libxl_domain_shutdown(ctx, domid, NULL); +} +#define libxl_domain_shutdown libxl_domain_shutdown_0x041200 +static inline int libxl_domain_reboot_0x041200(libxl_ctx *ctx, + uint32_t domid) +{ + return libxl_domain_reboot(ctx, domid, NULL); +} +#define libxl_domain_reboot libxl_domain_reboot_0x041200 +#endif + int libxl_domain_destroy(libxl_ctx *ctx, uint32_t domid, const libxl_asyncop_how *ao_how) LIBXL_EXTERNAL_CALLERS_ONLY; diff --git a/tools/libxl/libxl_dom_suspend.c b/tools/libxl/libxl_dom_suspend.c index 9bb2d00bec..248dbc33e3 100644 --- a/tools/libxl/libxl_dom_suspend.c +++ b/tools/libxl/libxl_dom_suspend.c @@ -193,16 +193,11 @@ static void domain_suspend_callback_common(libxl__egc *egc, LOGD(DEBUG, domid, "issuing %s suspend request via XenBus control node", dsps->type != LIBXL_DOMAIN_TYPE_PV ? "PVH/HVM" : "PV"); - libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, "suspend"); - - dsps->pvcontrol.path = libxl__domain_pvcontrol_xspath(gc, domid); - if (!dsps->pvcontrol.path) { rc = ERROR_FAIL; goto err; } - dsps->pvcontrol.ao = ao; - dsps->pvcontrol.what = "guest acknowledgement of suspend request"; - dsps->pvcontrol.timeout_ms = 60 * 1000; dsps->pvcontrol.callback = domain_suspend_common_pvcontrol_suspending; - libxl__xswait_start(gc, &dsps->pvcontrol); + rc = libxl__domain_pvcontrol(egc, &dsps->pvcontrol, domid, "suspend"); + if (rc) goto err; + return; err: diff --git a/tools/libxl/libxl_domain.c b/tools/libxl/libxl_domain.c index 0dd5b7ffa9..058f3c77ec 100644 --- a/tools/libxl/libxl_domain.c +++ b/tools/libxl/libxl_domain.c @@ -763,49 +763,86 @@ char * libxl__domain_pvcontrol_read(libxl__gc *gc, xs_transaction_t t, return libxl__xs_read(gc, t, shutdown_path); } -int libxl__domain_pvcontrol_write(libxl__gc *gc, xs_transaction_t t, - uint32_t domid, const char *cmd) +int libxl__domain_pvcontrol(libxl__egc *egc, libxl__xswait_state *pvcontrol, + domid_t domid, const char *cmd) { + STATE_AO_GC(pvcontrol->ao); const char *shutdown_path; + int rc; + + rc = libxl__domain_pvcontrol_available(gc, domid); + if (rc < 0) + return rc; shutdown_path = libxl__domain_pvcontrol_xspath(gc, domid); if (!shutdown_path) return ERROR_FAIL; - return libxl__xs_printf(gc, t, shutdown_path, "%s", cmd); + rc = libxl__xs_printf(gc, XBT_NULL, shutdown_path, "%s", cmd); + if (rc) + return rc; + + pvcontrol->path = shutdown_path; + pvcontrol->what = GCSPRINTF("guest acknowledgement of %s request", cmd); + pvcontrol->timeout_ms = 60 * 1000; + libxl__xswait_start(gc, pvcontrol); + + return 0; } -static int libxl__domain_pvcontrol(libxl__gc *gc, uint32_t domid, - const char *cmd) +static bool pvcontrol_acked(const char *state) { - int ret; + if (!state || !strcmp(state,"")) + return true; + + return false; +} + +static void pvcontrol_cb(libxl__egc *egc, libxl__xswait_state *xswa, int rc, + const char *state) +{ + STATE_AO_GC(xswa->ao); + + if (!rc && !pvcontrol_acked(state)) + return; - ret = libxl__domain_pvcontrol_available(gc, domid); - if (ret < 0) - return ret; + libxl__xswait_stop(gc, xswa); - if (!ret) - return ERROR_NOPARAVIRT; + if (rc) + LOG(ERROR, "guest didn't acknowledge control request: %d", rc); - return libxl__domain_pvcontrol_write(gc, XBT_NULL, domid, cmd); + libxl__ao_complete(egc, ao, rc); } -int libxl_domain_shutdown(libxl_ctx *ctx, uint32_t domid) + +int libxl_domain_shutdown(libxl_ctx *ctx, uint32_t domid, + const libxl_asyncop_how *ao_how) { - GC_INIT(ctx); + AO_CREATE(ctx, domid, ao_how); + libxl__xswait_state *pvcontrol; int ret; - ret = libxl__domain_pvcontrol(gc, domid, "poweroff"); - GC_FREE; - return ret; + + GCNEW(pvcontrol); + pvcontrol->ao = ao; + pvcontrol->callback = pvcontrol_cb; + ret = libxl__domain_pvcontrol(egc, pvcontrol, domid, "poweroff"); + + return ret ? AO_CREATE_FAIL(ret) : AO_INPROGRESS; } -int libxl_domain_reboot(libxl_ctx *ctx, uint32_t domid) +int libxl_domain_reboot(libxl_ctx *ctx, uint32_t domid, + const libxl_asyncop_how *ao_how) { - GC_INIT(ctx); + AO_CREATE(ctx, domid, ao_how); + libxl__xswait_state *pvcontrol; int ret; - ret = libxl__domain_pvcontrol(gc, domid, "reboot"); - GC_FREE; - return ret; + + GCNEW(pvcontrol); + pvcontrol->ao = ao; + pvcontrol->callback = pvcontrol_cb; + ret = libxl__domain_pvcontrol(egc, pvcontrol, domid, "reboot"); + + return ret ? AO_CREATE_FAIL(ret) : AO_INPROGRESS; } static void domain_death_occurred(libxl__egc *egc, diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h index bfeb38e0ed..d2d5af746b 100644 --- a/tools/libxl/libxl_internal.h +++ b/tools/libxl/libxl_internal.h @@ -1378,8 +1378,6 @@ _hidden int libxl__domain_pvcontrol_available(libxl__gc *gc, uint32_t domid); _hidden const char *libxl__domain_pvcontrol_xspath(libxl__gc*, uint32_t domid); _hidden char * libxl__domain_pvcontrol_read(libxl__gc *gc, xs_transaction_t t, uint32_t domid); -_hidden int libxl__domain_pvcontrol_write(libxl__gc *gc, xs_transaction_t t, - uint32_t domid, const char *cmd); /* from xl_device */ _hidden char *libxl__device_disk_string_of_backend(libxl_disk_backend backend); @@ -4719,6 +4717,11 @@ _hidden void libxl__ev_devlock_init(libxl__ev_devlock *); _hidden void libxl__ev_devlock_lock(libxl__egc *, libxl__ev_devlock *); _hidden void libxl__ev_devlock_unlock(libxl__gc *, libxl__ev_devlock *); +/* Send control commands over xenstore and wait for an Ack. */ +_hidden int libxl__domain_pvcontrol(libxl__egc *egc, + libxl__xswait_state *pvcontrol, + domid_t domid, const char *cmd); + #endif /* diff --git a/tools/ocaml/libs/xl/xenlight_stubs.c b/tools/ocaml/libs/xl/xenlight_stubs.c index 37b046df63..ff16b8710b 100644 --- a/tools/ocaml/libs/xl/xenlight_stubs.c +++ b/tools/ocaml/libs/xl/xenlight_stubs.c @@ -551,32 +551,38 @@ value stub_libxl_domain_create_restore(value ctx, value domain_config, value par CAMLreturn(Val_int(c_domid)); } -value stub_libxl_domain_shutdown(value ctx, value domid) +value stub_libxl_domain_shutdown(value ctx, value domid, value async, value unit) { - CAMLparam2(ctx, domid); + CAMLparam4(ctx, domid, async, unit); int ret; uint32_t c_domid = Int_val(domid); + libxl_asyncop_how *ao_how = aohow_val(async); caml_enter_blocking_section(); - ret = libxl_domain_shutdown(CTX, c_domid); + ret = libxl_domain_shutdown(CTX, c_domid, ao_how); caml_leave_blocking_section(); + free(ao_how); + if (ret != 0) failwith_xl(ret, "domain_shutdown"); CAMLreturn(Val_unit); } -value stub_libxl_domain_reboot(value ctx, value domid) +value stub_libxl_domain_reboot(value ctx, value domid, value async, value unit) { - CAMLparam2(ctx, domid); + CAMLparam4(ctx, domid, async, unit); int ret; uint32_t c_domid = Int_val(domid); + libxl_asyncop_how *ao_how = aohow_val(async); caml_enter_blocking_section(); - ret = libxl_domain_reboot(CTX, c_domid); + ret = libxl_domain_reboot(CTX, c_domid, ao_how); caml_leave_blocking_section(); + free(ao_how); + if (ret != 0) failwith_xl(ret, "domain_reboot"); diff --git a/tools/xl/xl_vmcontrol.c b/tools/xl/xl_vmcontrol.c index eb6779a561..b20582e15b 100644 --- a/tools/xl/xl_vmcontrol.c +++ b/tools/xl/xl_vmcontrol.c @@ -103,7 +103,7 @@ static void reboot_domain(uint32_t domid, libxl_evgen_domain_death **deathw, int rc; fprintf(stderr, "Rebooting domain %u\n", domid); - rc=libxl_domain_reboot(ctx, domid); + rc = libxl_domain_reboot(ctx, domid, NULL); if (rc == ERROR_NOPARAVIRT) { if (fallback_trigger) { fprintf(stderr, "PV control interface not available:" @@ -136,7 +136,7 @@ static void shutdown_domain(uint32_t domid, int rc; fprintf(stderr, "Shutting down domain %u\n", domid); - rc=libxl_domain_shutdown(ctx, domid); + rc = libxl_domain_shutdown(ctx, domid, NULL); if (rc == ERROR_NOPARAVIRT) { if (fallback_trigger) { fprintf(stderr, "PV control interface not available:"
Currently only suspend power control requests wait for an ack from the domain, while power off or reboot requests simply write the command to xenstore and exit. Introduce a 1 minute wait for the domain to acknowledge the request, or else return an error. The suspend code is slightly modified to use the new infrastructure added, but shouldn't have any functional change. Fix the ocaml bindings and also provide a backwards compatible interface for the reboot and poweroff libxl API functions. Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- I believe applying this patch is not going to cause regressions in osstest, albeit FreeBSD doesn't acknowledge poweroff/reboot requests in the currently tested versions, it will shutdown in less than one minute, and thus the toolstack won't complain because the control node is going to be removed from xenstore. --- tools/libxl/libxl.h | 23 +++++++- tools/libxl/libxl_dom_suspend.c | 11 ++-- tools/libxl/libxl_domain.c | 81 ++++++++++++++++++++-------- tools/libxl/libxl_internal.h | 7 ++- tools/ocaml/libs/xl/xenlight_stubs.c | 18 ++++--- tools/xl/xl_vmcontrol.c | 4 +- 6 files changed, 102 insertions(+), 42 deletions(-)