@@ -37,6 +37,140 @@ static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao);
/*
+ * osevent update baton handling
+ *
+ * We need the following property (the "unstale liveness property"):
+ *
+ * Whenever any thread is blocking in the libxl event loop[1], at
+ * least one thread must be using an up to date osevent set. It is OK
+ * for all but one threads to have stale event sets, because so long
+ * as one waiting thread has the right event set, any actually
+ * interesting event will, if nothing else, wake that "right" thread
+ * up. It will then make some progress and/or, if it exits, ensure
+ * that some other thread becomes the "right" thread.
+ *
+ * [1] TODO: Right now we are considering only the libxl event loop.
+ * We need to consider application event loop outside libxl too.
+ *
+ * Argument that our approach is sound:
+ *
+ * The issue we are concerned about is libxl sleeping on an out of
+ * date fd set, or too long a timeout, so that it doesn't make
+ * progress. If the property above is satisfied, then if any thread
+ * is waiting in libxl at least one such thread will be waiting on a
+ * sufficient osevent set, so any relevant osevent will wake up a
+ * libxl thread which will either handle the event, or arrange that at
+ * least one other libxl thread has the right set.
+ *
+ * There are two calls to poll in libxl: one is the fd recheck, which
+ * is not blocking. There is only the one blocking call, in
+ * eventloop_iteration. poll runs with the ctx unlocked, so osevents
+ * might be added after it unlocks the ctx - that is what we are
+ * worried about.
+ *
+ * To demonstrate that the unstale liveness property is satisfied:
+ *
+ * We define a baton holder as follows: a libxl thread is a baton
+ * holder if
+ * (a) it has an egc or an ao and holds the ctx lock, or
+ * (b) it has an active non-app poller and no osevents have been
+ * added since it released the lock, or
+ * (c) it has an active non-app poller which has been woken
+ * (by writing to its pipe), so it will not sleep
+ * We will maintain the invariant (the "baton invariant") that
+ * whenever there is any active poller, there is at least
+ * one baton holder. ("non-app" means simply "not poller_app".)
+ *
+ * No thread outside libxl can have an active non-app poller: pollers
+ * are put on the active list by poller_get which is called in three
+ * places: libxl_event_wait, which puts it before returning;
+ * libxl__ao_create but only in the synchronous case, in which case
+ * the poller is put before returning; and the poller_app, during
+ * initialisation.
+ *
+ * So any time when all libxl threads are blocking (and therefore do
+ * not have the ctx lock), the non-app active pollers belong to those
+ * threads. If at least one is a baton holder (the invariant), that
+ * thread has a good enough event set.
+ *
+ * Now we will demonstrate that the "baton invariant" is maintained:
+ *
+ * The rule is that any thread which might be the baton holder is
+ * responsible for checking that there continues to be a baton holder
+ * as needed.
+ *
+ * Firstly, consider the case when the baton holders (b) cease to be
+ * baton holders because osevents are added.
+ *
+ * There are only two kinds of osevents: timeouts and fds. Every
+ * other internal event source reduces to one of these eventually.
+ * Both of these cases are handled (in the case of fd events, add and
+ * modify, separately), calling pollers_note_osevent_added.
+ *
+ * This walks the poller_active list, marking the active pollers
+ * osevents_added=1. Such a poller cannot be the baton holder. But
+ * pollers_note_osevent_added is called only from ev_* functions,
+ * which are only called from event-chain libxl code: ie, code with an
+ * ao or an egc. So at this point we are a baton holder, and there is
+ * still a baton holder.
+ *
+ * Secondly, consider the case where baton holders (a) cease to be
+ * batton holders because they dispose of their egc or ao. We call
+ * libxl__egc_ao_cleanup_1_baton on every exit path. We arrange that
+ * everything that disposes of an egc or an ao checks that there is a
+ * new baton holder by calling libxl__egc_ao_cleanup_1_baton.
+ *
+ * This function handles the invariant explicitly: if we have any
+ * non-app active pollers it looks for one which is up to date (baton
+ * holder category (b)), and failing that it picks a victim to turn
+ * into the baton holder category (c) by waking it up. (Correctness
+ * depends on this function not spotting its own thread as the
+ * baton-holder, since it is on its way to not being the baton-holder,
+ * so it must be called after the poller has been put back.)
+ *
+ * Thirdly, we must consider the case (c). A thread in category (c)
+ * will reenter libxl when it gains the lock and necessarily then
+ * becomes a baton holder in category (a).
+ *
+ * So the "baton invariant" is maintained. QED.
+ */
+static void pollers_note_osevent_added(libxl_ctx *ctx) {
+ libxl__poller *poller;
+ LIBXL_LIST_FOREACH(poller, &ctx->pollers_active, active_entry)
+ poller->osevents_added = 1;
+}
+
+void libxl__egc_ao_cleanup_1_baton(libxl__gc *gc)
+ /* Any poller we had must have been `put' already. */
+{
+ libxl__poller *search, *wake=0;
+
+ LIBXL_LIST_FOREACH(search, &CTX->pollers_active, active_entry) {
+ if (search == CTX->poller_app)
+ /* This one is special. We can't give it the baton. */
+ continue;
+ if (!search->osevents_added)
+ /* This poller is up to date and will wake up as needed. */
+ return;
+ if (!wake)
+ wake = search;
+ }
+
+ if (!wake)
+ /* no-one in libxl waiting for any events */
+ return;
+
+ libxl__poller_wakeup(gc, wake);
+
+ wake->osevents_added = 0;
+ /* This serves to make _1_baton idempotent. It is OK even though
+ * that poller may currently be sleeping on only old osevents,
+ * because it is going to wake up because we've just prodded it,
+ * and it pick up new osevents on its next iteration (or pass
+ * on the baton). */
+}
+
+/*
* The counter osevent_in_hook is used to ensure that the application
* honours the reentrancy restriction documented in libxl_event.h.
*
@@ -194,6 +328,7 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev,
ev->func = func;
LIBXL_LIST_INSERT_HEAD(&CTX->efds, ev, entry);
+ pollers_note_osevent_added(CTX);
rc = 0;
@@ -214,6 +349,8 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events)
rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events);
if (rc) goto out;
+ if ((events & ~ev->events))
+ pollers_note_osevent_added(CTX);
ev->events = events;
rc = 0;
@@ -315,6 +452,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev,
LIBXL_TAILQ_INSERT_SORTED(&CTX->etimes, entry, ev, evsearch, /*empty*/,
timercmp(&ev->abs, &evsearch->abs, >));
+ pollers_note_osevent_added(CTX);
return 0;
}
@@ -1121,6 +1259,7 @@ static int beforepoll_internal(libxl__gc *gc, libxl__poller *poller,
*nfds_io = used;
poller->fds_deregistered = 0;
+ poller->osevents_added = 0;
libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
if (etime) {
@@ -1442,7 +1581,7 @@ static void egc_run_callbacks(libxl__egc *egc)
}
}
-void libxl__egc_cleanup(libxl__egc *egc)
+void libxl__egc_cleanup_2_ul_cb_gc(libxl__egc *egc)
{
EGC_GC;
egc_run_callbacks(egc);
@@ -1752,13 +1891,15 @@ int libxl_event_wait(libxl_ctx *ctx, libxl_event **event_r,
rc = eventloop_iteration(egc, poller);
if (rc) goto out;
- /* we unlock and cleanup the egc each time we go through this loop,
- * so that (a) we don't accumulate garbage and (b) any events
- * which are to be dispatched by callback are actually delivered
- * in a timely fashion.
+ /* we unlock and cleanup the egc each time we go through this
+ * loop, so that (a) we don't accumulate garbage and (b) any
+ * events which are to be dispatched by callback are actually
+ * delivered in a timely fashion. _1_baton will be
+ * called to pass the baton iff we actually leave; otherwise
+ * we are still carrying it.
*/
CTX_UNLOCK;
- libxl__egc_cleanup(egc);
+ libxl__egc_cleanup_2_ul_cb_gc(egc);
CTX_LOCK;
}
@@ -2031,14 +2172,24 @@ int libxl__ao_inprogress(libxl__ao *ao,
* synchronous cancellation ability. */
}
+ /* The call to egc..1_baton is below, only if we are leaving. */
CTX_UNLOCK;
- libxl__egc_cleanup(&egc);
+ libxl__egc_cleanup_2_ul_cb_gc(&egc);
CTX_LOCK;
}
+
+ /* Dispose of this early so libxl__egc_ao_cleanup_1_baton
+ * doesn't mistake us for a baton-holder. No-one much is
+ * going to look at this ao now so setting this to 0 is fine.
+ * We can't call _baton below _leave because _leave destroys
+ * our gc, which _baton needs. */
+ libxl__poller_put(CTX, ao->poller);
+ ao->poller = 0;
} else {
rc = 0;
}
+ libxl__egc_ao_cleanup_1_baton(gc);
ao->in_initiator = 0;
ao__manip_leave(CTX, ao);
@@ -2051,6 +2202,9 @@ int libxl__ao_inprogress(libxl__ao *ao,
static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
/* Temporarily unlocks ctx, which must be locked exactly once on entry. */
{
+ libxl__egc egc;
+ LIBXL_INIT_EGC(egc,ctx);
+
int rc;
ao__manip_enter(parent);
@@ -2071,9 +2225,6 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
/* We keep calling abort hooks until there are none left */
while (!LIBXL_LIST_EMPTY(&parent->abortables)) {
- libxl__egc egc;
- LIBXL_INIT_EGC(egc,ctx);
-
assert(!parent->complete);
libxl__ao_abortable *abrt = LIBXL_LIST_FIRST(&parent->abortables);
@@ -2086,15 +2237,20 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
"ao %p: abrt=%p: aborting", parent, abrt->ao);
abrt->callback(&egc, abrt, ERROR_ABORTED);
+ /* The call to egc..1_baton is in the out block below. */
libxl__ctx_unlock(ctx);
- libxl__egc_cleanup(&egc);
+ libxl__egc_cleanup_2_ul_cb_gc(&egc);
libxl__ctx_lock(ctx);
}
rc = 0;
out:
+ libxl__egc_ao_cleanup_1_baton(&egc.gc);
ao__manip_leave(ctx, parent);
+ /* The call to egc..2_ul_cb_gc is above. This is sufficient
+ * because only code inside the loop adds anything to the egc, and
+ * we ensures that the egc is clean when we leave the loop. */
return rc;
}
@@ -634,9 +634,23 @@ struct libxl__poller {
* event is deregistered, we set the fds_deregistered of all non-idle
* pollers. So afterpoll can tell whether any POLLNVAL is
* plausibly due to an fd being closed and reopened.
+ *
+ * Additionally, we record whether any fd or time event sources
+ * have been registered. This is necessary because sometimes we
+ * need to wake up the only libxl thread stuck in
+ * eventloop_iteration so that it will pick up new fds or earlier
+ * timeouts. osevents_added is cleared by beforepoll, and set by
+ * fd or timeout event registration. When we are about to leave
+ * libxl (strictly, when we are about to give up an egc), we check
+ * whether there are any pollers. If there are, then at least one
+ * of them must have osevents_added clear. If not, we wake up the
+ * first one on the list. Any entry on pollers_active constitutes
+ * a promise to also make this check, so the baton will never be
+ * dropped.
*/
LIBXL_LIST_ENTRY(libxl__poller) active_entry;
bool fds_deregistered;
+ bool osevents_added;
};
struct libxl__gc {
@@ -2350,7 +2364,10 @@ _hidden libxl_device_model_version libxl__default_device_model(libxl__gc *gc);
LIBXL_STAILQ_INIT(&(egc).ev_immediates); \
} while(0)
-_hidden void libxl__egc_cleanup(libxl__egc *egc);
+_hidden void libxl__egc_ao_cleanup_1_baton(libxl__gc *gc);
+ /* Passes the baton for added osevents. See comment for
+ * osevents_added in struct libxl__poller. */
+_hidden void libxl__egc_cleanup_2_ul_cb_gc(libxl__egc *egc);
/* Frees memory allocated within this egc's gc, and and report all
* occurred events via callback, if applicable. May reenter the
* application; see restrictions above. The ctx must be UNLOCKED. */
@@ -2361,9 +2378,11 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
libxl__egc egc[1]; LIBXL_INIT_EGC(egc[0],ctx); \
EGC_GC
-#define EGC_FREE libxl__egc_cleanup(egc)
-
-#define CTX_UNLOCK_EGC_FREE do{ CTX_UNLOCK; EGC_FREE; }while(0)
+#define CTX_UNLOCK_EGC_FREE do{ \
+ libxl__egc_ao_cleanup_1_baton(&egc->gc); \
+ CTX_UNLOCK; \
+ libxl__egc_cleanup_2_ul_cb_gc(egc); \
+ }while(0)
/*
@@ -2468,8 +2487,9 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
#define AO_INPROGRESS ({ \
libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc); \
+ /* __ao_inprogress will do egc..1_baton if needed */ \
CTX_UNLOCK; \
- EGC_FREE; \
+ libxl__egc_cleanup_2_ul_cb_gc(egc); \
CTX_LOCK; \
int ao__rc = libxl__ao_inprogress(ao, \
__FILE__, __LINE__, __func__); \
@@ -2481,8 +2501,9 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc); \
assert(rc); \
libxl__ao_create_fail(ao); \
+ libxl__egc_ao_cleanup_1_baton(&egc->gc); \
libxl__ctx_unlock(ao__ctx); /* gc is now invalid */ \
- EGC_FREE; \
+ libxl__egc_cleanup_2_ul_cb_gc(egc); \
(rc); \
})
If the application calls libxl with ao_how==0 and also makes calls like _occurred, libxl will sometimes get stuck. The bug happens as follows (for example): Thread A libxl_do_thing(,ao_how==0) libxl_do_thing starts, sets up some callbacks libxl_do_thing exit path calls AO_INPROGRESS libxl__ao_inprogress goes into event loop eventloop_iteration sleeps on: - do_thing's current fd set - sigchld pipe if applicable - its poller Thread B libxl_something_occurred the something is to do with do_thing, above do_thing_next_callback does some more work do_thing_next_callback becomes interested in fd N thread B returns to application Note that nothing wakes up thread A. A is not listening on fd N. So do_thing_* will not spot when fd N signals. do_thing will not make further timely progress. If there is no timeout thread A will never wake up. The problem here occurs because thread A is waiting on an out of date osevent set. There is also the possibility that a thread might block waiting for libxl osevents but outside libxl, eg if the application used libxl_osevent_beforepoll. We will deal with that in a moment. See the big comment in libxl_event.c for a fairly formal correctness argument. This depends on libxl__egc_ao_cleanup_1_baton being called everywhere an egc or ao is disposed of. Firstly egcs: in this patch we rename libxl__egc_cleanup, which means we catch all the disposal sites. Secondly aos: these are disposed of by (i) AO_CREATE_FAIL (ii) ao__inprogress and (iii) an event which completes the ao later. (i) and (ii) we handle by adding the call to _baton. In the case of (iii) any such function must be an event-generating function so it has an egc too, so it will pass on the baton when the egc is disposed. Reported-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> --- v2: Call libxl__egc_ao_cleanup_1_baton (renamed from __egc_cleanup) on all exits from ao_inprogress, even requests for async processing. Fixes a remaining instance of this bug (!) This involves disposing of ao->poller somewhat earlier. v2: New correctness arguments in libxl_event.c comment and in commit message. --- tools/libxl/libxl_event.c | 178 ++++++++++++++++++++++++++++++++++++++++--- tools/libxl/libxl_internal.h | 33 ++++++-- 2 files changed, 194 insertions(+), 17 deletions(-)