From patchwork Mon Jan 13 17:08:39 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ian Jackson
X-Patchwork-Id: 11330643
From: Ian Jackson
Date: Mon, 13 Jan 2020 17:08:39 +0000
Message-ID: <20200113170843.21332-7-ian.jackson@eu.citrix.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20200113170843.21332-1-ian.jackson@eu.citrix.com>
References: <20200113170843.21332-1-ian.jackson@eu.citrix.com>
Subject: [Xen-devel] [PATCH v2 06/10] libxl: event: Fix hang when mixing
 blocking and eventy calls
List-Id: Xen developer discussion
Cc: Ian Jackson, George Dunlap

If the application calls libxl with ao_how==0 and also makes calls
like _occurred, libxl will sometimes get stuck.

The bug happens as follows (for example):

  Thread A
   libxl_do_thing(,ao_how==0)
   libxl_do_thing starts, sets up some callbacks
   libxl_do_thing exit path calls AO_INPROGRESS
   libxl__ao_inprogress goes into event loop
   eventloop_iteration sleeps on:
      - do_thing's current fd set
      - sigchld pipe if applicable
      - its poller

  Thread B
   libxl_something_occurred
   the something is to do with do_thing, above
   do_thing_next_callback does some more work
   do_thing_next_callback becomes interested in fd N
   thread B returns to application

Note that nothing wakes up thread A.  A is not listening on fd N.  So
do_thing_* will not spot when fd N signals.  do_thing will not make
further timely progress.  If there is no timeout thread A will never
wake up.

The problem here occurs because thread A is waiting on an out of date
osevent set.

There is also the possibility that a thread might block waiting for
libxl osevents but outside libxl, eg if the application used
libxl_osevent_beforepoll.
We will deal with that in a moment.

See the big comment in libxl_event.c for a fairly formal correctness
argument.

This depends on libxl__egc_ao_cleanup_1_baton being called everywhere
an egc or ao is disposed of.  Firstly egcs: in this patch we rename
libxl__egc_cleanup, which means we catch all the disposal sites.
Secondly aos: these are disposed of by (i) AO_CREATE_FAIL
(ii) ao__inprogress and (iii) an event which completes the ao later.
(i) and (ii) we handle by adding the call to _baton.  In the case of
(iii) any such function must be an event-generating function so it has
an egc too, so it will pass on the baton when the egc is disposed.

Reported-by: George Dunlap
Signed-off-by: Ian Jackson
Reviewed-by: George Dunlap
---
v2: Call libxl__egc_ao_cleanup_1_baton (renamed from __egc_cleanup)
    on all exits from ao_inprogress, even requests for async processing.
    Fixes a remaining instance of this bug (!)
    This involves disposing of ao->poller somewhat earlier.

v2: New correctness arguments in libxl_event.c comment and
    in commit message.
---
 tools/libxl/libxl_event.c    | 178 ++++++++++++++++++++++++++++++++++++++++---
 tools/libxl/libxl_internal.h |  33 ++++++--
 2 files changed, 194 insertions(+), 17 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 268a5da120..b50d4e5074 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -37,6 +37,140 @@ static void ao__check_destroy(libxl_ctx *ctx, libxl__ao *ao);


 /*
+ * osevent update baton handling
+ *
+ * We need the following property (the "unstale liveness property"):
+ *
+ * Whenever any thread is blocking in the libxl event loop[1], at
+ * least one thread must be using an up to date osevent set.  It is OK
+ * for all but one threads to have stale event sets, because so long
+ * as one waiting thread has the right event set, any actually
+ * interesting event will, if nothing else, wake that "right" thread
+ * up.  It will then make some progress and/or, if it exits, ensure
+ * that some other thread becomes the "right" thread.
+ *
+ * [1] TODO: Right now we are considering only the libxl event loop.
+ *     We need to consider application event loop outside libxl too.
+ *
+ * Argument that our approach is sound:
+ *
+ * The issue we are concerned about is libxl sleeping on an out of
+ * date fd set, or too long a timeout, so that it doesn't make
+ * progress.  If the property above is satisfied, then if any thread
+ * is waiting in libxl at least one such thread will be waiting on a
+ * sufficient osevent set, so any relevant osevent will wake up a
+ * libxl thread which will either handle the event, or arrange that at
+ * least one other libxl thread has the right set.
+ *
+ * There are two calls to poll in libxl: one is the fd recheck, which
+ * is not blocking.  There is only the one blocking call, in
+ * eventloop_iteration.  poll runs with the ctx unlocked, so osevents
+ * might be added after it unlocks the ctx - that is what we are
+ * worried about.
+ *
+ * To demonstrate that the unstale liveness property is satisfied:
+ *
+ * We define a baton holder as follows: a libxl thread is a baton
+ * holder if
+ *   (a) it has an egc or an ao and holds the ctx lock, or
+ *   (b) it has an active non-app poller and no osevents have been
+ *       added since it released the lock, or
+ *   (c) it has an active non-app poller which has been woken
+ *       (by writing to its pipe), so it will not sleep
+ * We will maintain the invariant (the "baton invariant") that
+ * whenever there is any active poller, there is at least
+ * one baton holder.  ("non-app" means simply "not poller_app".)
+ *
+ * No thread outside libxl can have an active non-app poller: pollers
+ * are put on the active list by poller_get which is called in three
+ * places: libxl_event_wait, which puts it before returning;
+ * libxl__ao_create but only in the synchronous case, in which case
+ * the poller is put before returning; and the poller_app, during
+ * initialisation.
+ *
+ * So any time when all libxl threads are blocking (and therefore do
+ * not have the ctx lock), the non-app active pollers belong to those
+ * threads.  If at least one is a baton holder (the invariant), that
+ * thread has a good enough event set.
+ *
+ * Now we will demonstrate that the "baton invariant" is maintained:
+ *
+ * The rule is that any thread which might be the baton holder is
+ * responsible for checking that there continues to be a baton holder
+ * as needed.
+ *
+ * Firstly, consider the case when the baton holders (b) cease to be
+ * baton holders because osevents are added.
+ *
+ * There are only two kinds of osevents: timeouts and fds.  Every
+ * other internal event source reduces to one of these eventually.
+ * Both of these cases are handled (in the case of fd events, add and
+ * modify, separately), calling pollers_note_osevent_added.
+ *
+ * This walks the poller_active list, marking the active pollers
+ * osevents_added=1.  Such a poller cannot be the baton holder.  But
+ * pollers_note_osevent_added is called only from ev_* functions,
+ * which are only called from event-chain libxl code: ie, code with an
+ * ao or an egc.  So at this point we are a baton holder, and there is
+ * still a baton holder.
+ *
+ * Secondly, consider the case where baton holders (a) cease to be
+ * baton holders because they dispose of their egc or ao.  We call
+ * libxl__egc_ao_cleanup_1_baton on every exit path.  We arrange that
+ * everything that disposes of an egc or an ao checks that there is a
+ * new baton holder by calling libxl__egc_ao_cleanup_1_baton.
+ *
+ * This function handles the invariant explicitly: if we have any
+ * non-app active pollers it looks for one which is up to date (baton
+ * holder category (b)), and failing that it picks a victim to turn
+ * into the baton holder category (c) by waking it up.  (Correctness
+ * depends on this function not spotting its own thread as the
+ * baton-holder, since it is on its way to not being the baton-holder,
+ * so it must be called after the poller has been put back.)
+ *
+ * Thirdly, we must consider the case (c).  A thread in category (c)
+ * will reenter libxl when it gains the lock and necessarily then
+ * becomes a baton holder in category (a).
+ *
+ * So the "baton invariant" is maintained.  QED.
+ */
+static void pollers_note_osevent_added(libxl_ctx *ctx) {
+    libxl__poller *poller;
+    LIBXL_LIST_FOREACH(poller, &ctx->pollers_active, active_entry)
+        poller->osevents_added = 1;
+}
+
+void libxl__egc_ao_cleanup_1_baton(libxl__gc *gc)
+    /* Any poller we had must have been `put' already. */
+{
+    libxl__poller *search, *wake=0;
+
+    LIBXL_LIST_FOREACH(search, &CTX->pollers_active, active_entry) {
+        if (search == CTX->poller_app)
+            /* This one is special.  We can't give it the baton. */
+            continue;
+        if (!search->osevents_added)
+            /* This poller is up to date and will wake up as needed. */
+            return;
+        if (!wake)
+            wake = search;
+    }
+
+    if (!wake)
+        /* no-one in libxl waiting for any events */
+        return;
+
+    libxl__poller_wakeup(gc, wake);
+
+    wake->osevents_added = 0;
+    /* This serves to make _1_baton idempotent.  It is OK even though
+     * that poller may currently be sleeping on only old osevents,
+     * because it is going to wake up because we've just prodded it,
+     * and it will pick up new osevents on its next iteration (or pass
+     * on the baton). */
+}
+
+/*
  * The counter osevent_in_hook is used to ensure that the application
  * honours the reentrancy restriction documented in libxl_event.h.
  *
@@ -194,6 +328,7 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev,
     ev->func = func;

     LIBXL_LIST_INSERT_HEAD(&CTX->efds, ev, entry);
+    pollers_note_osevent_added(CTX);

     rc = 0;

@@ -214,6 +349,8 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, short events)
     rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, events);
     if (rc) goto out;

+    if ((events & ~ev->events))
+        pollers_note_osevent_added(CTX);
     ev->events = events;

     rc = 0;
@@ -315,6 +452,7 @@ static int time_register_finite(libxl__gc *gc, libxl__ev_time *ev,
     LIBXL_TAILQ_INSERT_SORTED(&CTX->etimes, entry, ev, evsearch, /*empty*/,
                               timercmp(&ev->abs, &evsearch->abs, >));

+    pollers_note_osevent_added(CTX);
     return 0;
 }

@@ -1121,6 +1259,7 @@ static int beforepoll_internal(libxl__gc *gc, libxl__poller *poller,
     *nfds_io = used;

     poller->fds_deregistered = 0;
+    poller->osevents_added = 0;

     libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
     if (etime) {
@@ -1442,7 +1581,7 @@ static void egc_run_callbacks(libxl__egc *egc)
     }
 }

-void libxl__egc_cleanup(libxl__egc *egc)
+void libxl__egc_cleanup_2_ul_cb_gc(libxl__egc *egc)
 {
     EGC_GC;
     egc_run_callbacks(egc);
@@ -1752,13 +1891,15 @@ int libxl_event_wait(libxl_ctx *ctx, libxl_event **event_r,
         rc = eventloop_iteration(egc, poller);
         if (rc) goto out;

-        /* we unlock and cleanup the egc each time we go through this loop,
-         * so that (a) we don't accumulate garbage and (b) any events
-         * which are to be dispatched by callback are actually delivered
-         * in a timely fashion.
+        /* we unlock and cleanup the egc each time we go through this
+         * loop, so that (a) we don't accumulate garbage and (b) any
+         * events which are to be dispatched by callback are actually
+         * delivered in a timely fashion.  _1_baton will be
+         * called to pass the baton iff we actually leave; otherwise
+         * we are still carrying it.
          */
         CTX_UNLOCK;
-        libxl__egc_cleanup(egc);
+        libxl__egc_cleanup_2_ul_cb_gc(egc);
         CTX_LOCK;
     }

@@ -2031,14 +2172,24 @@ int libxl__ao_inprogress(libxl__ao *ao,
                  * synchronous cancellation ability. */
             }

+            /* The call to egc..1_baton is below, only if we are leaving. */
             CTX_UNLOCK;
-            libxl__egc_cleanup(&egc);
+            libxl__egc_cleanup_2_ul_cb_gc(&egc);
             CTX_LOCK;
         }
+
+        /* Dispose of this early so libxl__egc_ao_cleanup_1_baton
+         * doesn't mistake us for a baton-holder.  No-one much is
+         * going to look at this ao now so setting this to 0 is fine.
+         * We can't call _baton below _leave because _leave destroys
+         * our gc, which _baton needs. */
+        libxl__poller_put(CTX, ao->poller);
+        ao->poller = 0;
     } else {
         rc = 0;
     }

+    libxl__egc_ao_cleanup_1_baton(gc);
     ao->in_initiator = 0;
     ao__manip_leave(CTX, ao);

@@ -2051,6 +2202,9 @@ int libxl__ao_inprogress(libxl__ao *ao,
 static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
 /* Temporarily unlocks ctx, which must be locked exactly once on entry. */
 {
+    libxl__egc egc;
+    LIBXL_INIT_EGC(egc,ctx);
+
     int rc;
     ao__manip_enter(parent);

@@ -2071,9 +2225,6 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)

     /* We keep calling abort hooks until there are none left */
     while (!LIBXL_LIST_EMPTY(&parent->abortables)) {
-        libxl__egc egc;
-        LIBXL_INIT_EGC(egc,ctx);
-
         assert(!parent->complete);

         libxl__ao_abortable *abrt = LIBXL_LIST_FIRST(&parent->abortables);
@@ -2086,15 +2237,20 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
                    "ao %p: abrt=%p: aborting", parent, abrt->ao);
         abrt->callback(&egc, abrt, ERROR_ABORTED);

+        /* The call to egc..1_baton is in the out block below. */
         libxl__ctx_unlock(ctx);
-        libxl__egc_cleanup(&egc);
+        libxl__egc_cleanup_2_ul_cb_gc(&egc);
         libxl__ctx_lock(ctx);
     }

     rc = 0;

  out:
+    libxl__egc_ao_cleanup_1_baton(&egc.gc);
     ao__manip_leave(ctx, parent);
+    /* The call to egc..2_ul_cb_gc is above.  This is sufficient
+     * because only code inside the loop adds anything to the egc, and
+     * we ensure that the egc is clean when we leave the loop. */
     return rc;
 }

diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b68ab218b6..eec4bf767d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -634,9 +634,23 @@ struct libxl__poller {
      * event is deregistered, we set the fds_deregistered of all non-idle
      * pollers.  So afterpoll can tell whether any POLLNVAL is
      * plausibly due to an fd being closed and reopened.
+     *
+     * Additionally, we record whether any fd or time event sources
+     * have been registered.  This is necessary because sometimes we
+     * need to wake up the only libxl thread stuck in
+     * eventloop_iteration so that it will pick up new fds or earlier
+     * timeouts.  osevents_added is cleared by beforepoll, and set by
+     * fd or timeout event registration.  When we are about to leave
+     * libxl (strictly, when we are about to give up an egc), we check
+     * whether there are any pollers.  If there are, then at least one
+     * of them must have osevents_added clear.  If not, we wake up the
+     * first one on the list.  Any entry on pollers_active constitutes
+     * a promise to also make this check, so the baton will never be
+     * dropped.
      */
     LIBXL_LIST_ENTRY(libxl__poller) active_entry;
     bool fds_deregistered;
+    bool osevents_added;
 };

 struct libxl__gc {
@@ -2350,7 +2364,10 @@ _hidden libxl_device_model_version libxl__default_device_model(libxl__gc *gc);
         LIBXL_STAILQ_INIT(&(egc).ev_immediates);                \
     } while(0)

-_hidden void libxl__egc_cleanup(libxl__egc *egc);
+_hidden void libxl__egc_ao_cleanup_1_baton(libxl__gc *gc);
+  /* Passes the baton for added osevents.  See comment for
+   * osevents_added in struct libxl__poller. */
+_hidden void libxl__egc_cleanup_2_ul_cb_gc(libxl__egc *egc);
   /* Frees memory allocated within this egc's gc, and and report all
    * occurred events via callback, if applicable.  May reenter the
    * application; see restrictions above.  The ctx must be UNLOCKED. */
@@ -2361,9 +2378,11 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
     libxl__egc egc[1]; LIBXL_INIT_EGC(egc[0],ctx);      \
     EGC_GC

-#define EGC_FREE           libxl__egc_cleanup(egc)
-
-#define CTX_UNLOCK_EGC_FREE  do{ CTX_UNLOCK; EGC_FREE; }while(0)
+#define CTX_UNLOCK_EGC_FREE  do{                        \
+        libxl__egc_ao_cleanup_1_baton(&egc->gc);        \
+        CTX_UNLOCK;                                     \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);             \
+    }while(0)


 /*
@@ -2468,8 +2487,9 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);

 #define AO_INPROGRESS ({                                        \
         libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
+        /* __ao_inprogress will do egc..1_baton if needed */    \
         CTX_UNLOCK;                                             \
-        EGC_FREE;                                               \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);                     \
         CTX_LOCK;                                               \
         int ao__rc = libxl__ao_inprogress(ao,                   \
                                __FILE__, __LINE__, __func__);   \
@@ -2481,8 +2501,9 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
         libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
         assert(rc);                                             \
         libxl__ao_create_fail(ao);                              \
+        libxl__egc_ao_cleanup_1_baton(&egc->gc);                \
         libxl__ctx_unlock(ao__ctx); /* gc is now invalid */     \
-        EGC_FREE;                                               \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);                     \
         (rc);                                                   \
     })

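
Reviewer's aid, not part of the patch: the hand-off done by
libxl__egc_ao_cleanup_1_baton can be modelled in isolation.  The sketch
below is only an illustration under stated assumptions.  struct poller,
note_osevent_added and pass_baton are hypothetical names, the active
list is a plain singly linked list rather than LIBXL_LIST, and the
woken flag stands in for libxl__poller_wakeup writing to the poller's
pipe.

```c
/* Hypothetical, simplified model of the baton hand-off.  None of these
 * names are libxl's; they exist only for this illustration. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct poller {
    bool is_app;          /* stands in for "this is ctx->poller_app" */
    bool osevents_added;  /* set when an fd or timeout is registered */
    bool woken;           /* stands in for a write to the wakeup pipe */
    struct poller *next;  /* next entry on the active list */
};

/* Mark every active poller stale: a thread sleeping on it has not seen
 * the newly registered osevent. */
static void note_osevent_added(struct poller *active)
{
    for (struct poller *p = active; p; p = p->next)
        p->osevents_added = true;
}

/* Model of giving up an egc or ao.  Returns the poller that was woken,
 * or NULL if some non-app poller was already up to date or if no
 * non-app poller is active. */
static struct poller *pass_baton(struct poller *active)
{
    struct poller *wake = NULL;

    for (struct poller *p = active; p; p = p->next) {
        if (p->is_app)
            continue;             /* the app poller cannot take the baton */
        if (!p->osevents_added)
            return NULL;          /* an up to date poller already exists */
        if (!wake)
            wake = p;             /* remember the first stale candidate */
    }

    if (!wake)
        return NULL;              /* nobody is waiting inside the library */

    assert(!wake->is_app);
    wake->woken = true;           /* it will rebuild its event set */
    wake->osevents_added = false; /* makes a repeated call a no-op */
    return wake;
}
```

In this model, calling pass_baton twice in a row wakes at most one
poller, because the first call clears osevents_added on the poller it
picked.  That mirrors the idempotency remark in the patch's comment.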