Message ID | 20200113170843.21332-6-ian.jackson@eu.citrix.com (mailing list archive) |
---|---|
State | Superseded |
Series | libxl: event: Fix hang for some applications |
On 1/13/20 5:08 PM, Ian Jackson wrote:
> We are going to want to call this in the following situation:
>
> * We have just set up an ao, which is to call back - so a
>   non-synchronous one.  It ought not to call the application
>   back right away, so no egc.
>
> * There is a libxl thread blocking somewhere but it is using
>   an out of date fd or timeout set, which does not take into
>   account the ao we have just started.
>
> * We try to wake that thread up, but libxl__poller_wakeup fails.

So the idea before was that these two functions take an egc, not so much
because it actually uses the egc, but to make sure it's only called in a
restricted set of conditions; and now we're relaxing those conditions?

 -George
George Dunlap writes ("Re: [PATCH v2 05/10] libxl: event: Make libxl__poller_wakeup take a gc, not an egc"):
> On 1/13/20 5:08 PM, Ian Jackson wrote:
> > We are going to want to call this in the following situation:
> >
> > * We have just set up an ao, which is to call back - so a
> >   non-synchronous one.  It ought not to call the application
> >   back right away, so no egc.
> >
> > * There is a libxl thread blocking somewhere but it is using
> >   an out of date fd or timeout set, which does not take into
> >   account the ao we have just started.
> >
> > * We try to wake that thread up, but libxl__poller_wakeup fails.
>
> So the idea before was that these two functions take an egc, not so much
> because it actually uses the egc, but to make sure it's only called in a
> restricted set of conditions; and now we're relaxing those conditions?

Yes.  Specifically, we need to make one exception, relating to ao's.

In the situation described above, there is no egc, but we need to call
libxl__poller_wakeup.  Introducing an egc is wrong because that would
imply that this situation might result in application callbacks, but
it shouldn't (and not having an egc prevents that).

libxl__poller_wakeup and LIBXL__EVENT_DISASTER only take an egc for
form's sake; they don't use any part of it other than the gc.  The
"form's sake" is to stop them being called from libxl entrypoints that
are not involved in event generation.

Before this patch this is enforced by the types: you can't call it in
the wrong place because it wants an egc which you don't have.

After this patch this is no longer enforced.  But the mistake
(principally, calling _DISASTER) seems unlikely.  The type enforcement
I mention above was done because it was possible and easy, not because
it was important.

Does more of this want to be in the commit message ?

Thanks,
Ian.

(much text of this mail first written on irc)
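The type-enforcement point above can be illustrated with a minimal sketch. It uses hypothetical stand-in types (sketch_gc, sketch_egc) and functions, not libxl's real libxl__gc, libxl__egc or libxl__poller_wakeup; it only assumes, as stated above, that the egc contains a gc and that the callee touches nothing but the gc.

```c
#include <assert.h>

/* Hypothetical stand-ins, NOT the real libxl definitions. */
typedef struct { int disasters; } sketch_gc;      /* plain context */
typedef struct {
    sketch_gc gc;            /* an egc wraps a gc ... */
    int callbacks_pending;   /* ... plus event/callback state */
} sketch_egc;

/* "Before": demands an egc only so that callers holding no egc
 * cannot compile a call; just the gc part is actually used. */
void sketch_wakeup_before(sketch_egc *egc, int fail)
{
    sketch_gc *gc = &egc->gc;
    if (fail) gc->disasters++;
}

/* "After": takes the gc directly, so a caller that has no egc
 * (for example one that has just set up an ao) can call it too. */
void sketch_wakeup_after(sketch_gc *gc, int fail)
{
    if (fail) gc->disasters++;
}

/* Exercises both forms; returns the total number of recorded failures. */
int sketch_demo(void)
{
    sketch_egc egc = { { 0 }, 0 };
    sketch_gc plain = { 0 };

    sketch_wakeup_before(&egc, 1);    /* event-generating caller */
    sketch_wakeup_after(&egc.gc, 1);  /* egc holders can still call it */
    sketch_wakeup_after(&plain, 1);   /* now possible without an egc */

    /* Neither path queued an application callback. */
    assert(egc.callbacks_pending == 0);
    return egc.gc.disasters + plain.disasters;
}
```

With the "before" signature, the third call would not compile because there is no sketch_egc to pass; that compile-time restriction is what the patch gives up.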
On 1/17/20 1:46 PM, Ian Jackson wrote:
> George Dunlap writes ("Re: [PATCH v2 05/10] libxl: event: Make libxl__poller_wakeup take a gc, not an egc"):
>> On 1/13/20 5:08 PM, Ian Jackson wrote:
>>> We are going to want to call this in the following situation:
>>>
>>> * We have just set up an ao, which is to call back - so a
>>>   non-synchronous one.  It ought not to call the application
>>>   back right away, so no egc.
>>>
>>> * There is a libxl thread blocking somewhere but it is using
>>>   an out of date fd or timeout set, which does not take into
>>>   account the ao we have just started.
>>>
>>> * We try to wake that thread up, but libxl__poller_wakeup fails.
>>
>> So the idea before was that these two functions take an egc, not so much
>> because it actually uses the egc, but to make sure it's only called in a
>> restricted set of conditions; and now we're relaxing those conditions?
>
> Yes.  Specifically, we need to make one exception, relating to ao's.
>
> In the situation described above, there is no egc, but we need to call
> libxl__poller_wakeup.  Introducing an egc is wrong because that would
> imply that this situation might result in application callbacks, but
> it shouldn't (and not having an egc prevents that).
>
> libxl__poller_wakeup and LIBXL__EVENT_DISASTER only take an egc for
> form's sake; they don't use any part of it other than the gc.  The
> "form's sake" is to stop them being called from libxl entrypoints that
> are not involved in event generation.
>
> Before this patch this is enforced by the types: you can't call it in
> the wrong place because it wants an egc which you don't have.
>
> After this patch this is no longer enforced.  But the mistake
> (principally, calling _DISASTER) seems unlikely.  The type enforcement
> I mention above was done because it was possible and easy, not because
> it was important.

That makes sense; just trying partly to make sure I have it right,
partly to have things in the public record.
In which case, re the code:

Reviewed-by: George Dunlap <george.dunlap@citrix.com>

> Does more of this want to be in the commit message ?

I was going to say I'm not sure we need another round-trip.  I'd be OK
with checking it in as-is; or you could edit the commit message on
check-in if you wanted.
diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 16e6786889..268a5da120 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -1477,7 +1477,7 @@ void libxl__event_occurred(libxl__egc *egc, libxl_event *event)
         libxl__poller *poller;
         LIBXL_TAILQ_INSERT_TAIL(&CTX->occurred, event, link);
         LIBXL_LIST_FOREACH(poller, &CTX->pollers_event, entry)
-            libxl__poller_wakeup(egc, poller);
+            libxl__poller_wakeup(gc, poller);
     }
 }
 
@@ -1668,9 +1668,8 @@ void libxl__poller_put(libxl_ctx *ctx, libxl__poller *p)
     LIBXL_LIST_INSERT_HEAD(&ctx->pollers_idle, p, entry);
 }
 
-void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p)
+void libxl__poller_wakeup(libxl__gc *gc, libxl__poller *p)
 {
-    EGC_GC;
     int e = libxl__self_pipe_wakeup(p->wakeup_pipe[1]);
     if (e) LIBXL__EVENT_DISASTER(gc, "cannot poke watch pipe", e, 0);
 }
@@ -1924,7 +1923,7 @@ void libxl__ao_complete_check_progress_reports(libxl__egc *egc, libxl__ao *ao)
         assert(ao->in_initiator);
         if (!ao->constructing)
             /* don't bother with this if we're not in the event loop */
-            libxl__poller_wakeup(egc, ao->poller);
+            libxl__poller_wakeup(gc, ao->poller);
     } else if (ao->how.callback) {
         LOG(DEBUG, "ao %p: complete for callback", ao);
         LIBXL_TAILQ_INSERT_TAIL(&egc->aos_for_callback, ao, entry_for_callback);
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index 328ecf3e1e..b68ab218b6 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -1311,7 +1311,7 @@ _hidden void libxl__poller_put(libxl_ctx*, libxl__poller *p /* may be NULL */);
 
 /* Notifies whoever is polling using p that they should wake up.
  * ctx must be locked. */
-_hidden void libxl__poller_wakeup(libxl__egc *egc, libxl__poller *p);
+_hidden void libxl__poller_wakeup(libxl__gc *egc, libxl__poller *p);
 
 /* Internal to fork and child reaping machinery */
 extern const libxl_childproc_hooks libxl__childproc_default_hooks;
We are going to want to call this in the following situation:

* We have just set up an ao, which is to call back - so a
  non-synchronous one.  It ought not to call the application
  back right away, so no egc.

* There is a libxl thread blocking somewhere but it is using
  an out of date fd or timeout set, which does not take into
  account the ao we have just started.

* We try to wake that thread up, but libxl__poller_wakeup fails.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2: New patch
---
 tools/libxl/libxl_event.c    | 7 +++----
 tools/libxl/libxl_internal.h | 2 +-
 2 files changed, 4 insertions(+), 5 deletions(-)
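For readers unfamiliar with the wakeup mechanism, here is a minimal sketch of the generic self-pipe technique for waking a thread blocked in poll(). It is an illustration only: sketch_pipe_wakeup and sketch_demo_wakeup are hypothetical names, and this is not the body of libxl__self_pipe_wakeup. The nonzero errno return corresponds loosely to the failure case in the last bullet, where the caller has only a gc with which to report the problem.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: write one byte to the pipe's write end so that a
 * thread polling the read end wakes up.  Returns 0 on success, else errno. */
int sketch_pipe_wakeup(int wfd)
{
    static const char buf[1] = { 0 };
    for (;;) {
        ssize_t r = write(wfd, buf, 1);
        if (r == 1) return 0;
        assert(r == -1);
        if (errno == EINTR) continue;
        /* Pipe full: a wakeup is already pending, so nothing is lost. */
        if (errno == EAGAIN || errno == EWOULDBLOCK) return 0;
        return errno;
    }
}

/* Returns 1 if poll() sees the read end readable after a wakeup,
 * 0 if not, and -1 if setup failed. */
int sketch_demo_wakeup(void)
{
    int fds[2];
    if (pipe(fds)) return -1;

    /* Non-blocking write end, so a full pipe cannot stall the waker. */
    int fl = fcntl(fds[1], F_GETFL);
    if (fl == -1 || fcntl(fds[1], F_SETFL, fl | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    int e = sketch_pipe_wakeup(fds[1]);

    /* A blocked poller would include fds[0] in its fd set; here we just
     * check, with a zero timeout, that the wakeup made it readable. */
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
    int n = (e == 0) ? poll(&pfd, 1, 0) : -1;
    int readable = (n == 1) && (pfd.revents & POLLIN);

    close(fds[0]);
    close(fds[1]);
    return readable;
}
```

In a real event loop the poller would also drain the pipe after waking and then recompute its fd and timeout set, which is what picks up the newly started ao.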