diff mbox series

[4/9] libxl_internal: Create new lock for devices hotplug via QMP

Message ID 20190409164542.30274-5-anthony.perard@citrix.com (mailing list archive)
State Superseded
Headers show
Series libxl: New slow lock + fix libxl_cdrom_insert with QEMU depriv | expand

Commit Message

Anthony PERARD April 9, 2019, 4:45 p.m. UTC
The current lock `domain_userdata_lock' can't be used when modification
to a guest is done by sending command to QEMU, this is a slow process
and requires to call CTX_UNLOCK, which is not possible while holding
the `domain_userdata_lock'.

To resolve this issue, we create a new lock which can take over part of
the job of the json_lock.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>

---

Quote from Ian:
> The invariant that we want to maintain is:
>
>   * Nothing may exist in the primary config without
>     a corresponding entry in libxl-json.
[...]
> How about the following scheme.  We split the libxl-json lock into
> two.  I'm going to call them the fast lock and the slow lock.
>
>   * The fast lock is the existing libxl-json lock.
>
>   * The slow lock is outside the libxl-json lock in the lock
>     hierarchy.  It is also outside the libxl_ctx lock.  It is
>     to be acquired by an ao event callback.
>
>   * No-one may read or edit the libxl-json without holding the fast
>     lock across their read operation, or their read/modify/write
>     cycle.
>
>   * However, there are special rules for thing removal/addition, for
>     things added/removed via qmp.  Call these `qmp things'.  It is
>     permissible to add or remove a qmp thing across two separate
>     acquisitions of the fast lock, one to read the old state of the
>     thing, and one to read/modify/write to update (only) the new state
>     of the thing.  This is subject to the thing add/removal rule, from
>     before, which becomes:
>
>   * You may not cause a thing to be added to the primary config
>     unless you have held the relevant thing lock continuously
>     since ensuring that the libxl-json config describes it.
>
>   * Conversely you may not cause a thing to be removed from the
>     libxl-json unless you have held the relevant thing lock
>     continuously since ensuring the thing is absent from the primary
>     config.
>
>   * The `relevant thing lock' is the slow lock for qmp things, and the
>     fast lock for other things.
>
>   * Acquiring the fast lock fails for a destroyed domain, as at
>     present.
---
 tools/libxl/libxl_internal.c | 27 +++++++++++++++++++++++++
 tools/libxl/libxl_internal.h | 39 ++++++++++++++++++++++++++++++++----
 2 files changed, 62 insertions(+), 4 deletions(-)

Comments

Ian Jackson June 4, 2019, 5:11 p.m. UTC | #1
Anthony PERARD writes ("[PATCH 4/9] libxl_internal: Create new lock for devices hotplug via QMP"):
> The current lock `domain_userdata_lock' can't be used when modification
> to a guest is done by sending command to QEMU, this is a slow process
> and requires to call CTX_UNLOCK, which is not possible while holding
> the `domain_userdata_lock'.
...
> +struct libxl__domain_qmp_lock {
> +    libxl__generic_lock lock;
> +};
> +
> +libxl__domain_qmp_lock *libxl__lock_domain_qmp(libxl__gc *gc,
> +                                               libxl_domid domid)
> +{
> +    libxl__domain_qmp_lock *lock = NULL;
> +    int rc;
> +
> +    lock = libxl__zalloc(NOGC, sizeof(*lock));
> +    rc = libxl__lock_generic(gc, domid, &lock->lock,
> +                             "libxl-device-changes-lock");
> +    if (rc) {
> +        libxl__unlock_domain_qmp(lock);
> +        return NULL;
> +    }
> +
> +    return lock;

This is almost identical to the libxl__lock_domain_userdata which
appeared in the previous patch.  That suggests that the factorisation
here is at the wrong layer.

> +void libxl__unlock_domain_qmp(libxl__domain_qmp_lock *lock)
> +{
> +    libxl__unlock_generic(&lock->lock);
> +    free(lock);
> +}

This is completely identical to libxl__unlock_domain_userdata.  The
only reason we have to two of these is so that the two locks are
distinguished by the type of the lock struct.  That means that you
have to call
   libxl__unlock_domain_qmp(foo->qmp_lock)
but
   libxl__unlock_domain_userdate(foo->userdata_lock)
or some such, and the compiler will spot if you write
   libxl__unlock_domain_userdata(foo->qmp_lock)
or something.  But is it really useful to have to write the `qmp' vs
`userdata' thing twice ?

> + * It is to be acquired by an ao event callback.

I think there is a worse problem here, though.  This lock is probably
*inside* some lock from the caller.  So usual libxl rules apply and
you may not block to acquaire it.

Ie libxl__lock_domain_qmp must be one of these things that takes a
little ev state struct and makes a callback.

At the syscall level, acquiring an fcntl lock cannot be selected on.
The options are to poll, or to spawn a thread, or to fork.

Currently libxl does not call pthread_create, although maybe it could
do.  To safely create a thread you have to do a dance with
sigprocmask, to avoid signal delivery onto your thread.

Maybe it would be better to try once with LOCK_NB, and to fork if this
is not successful.  But it would be simpler to always fork...

> + * Or in case QEMU is the primary config, this pattern can be use:
> + *   lock qmp (qmp_lock)
> + *      lock json config (json_lock)
> + *          read json config
> + *          update in-memory json config with new entry, replacing
> + *             any stale entry
> + *      unlock json config
> + *      apply new config to primary config
> + *      lock json config (json_lock)
> + *          read json config
> + *          update in-memory json config with new entry, replacing
> + *             any stale entry
> + *          write in-memory json config to disk
> + *      unlock json config
> + *   unlock qmp

This doesn't show the ctx locks but I think that is fine.  So I think
this description is correct.

Ian.
Anthony PERARD June 5, 2019, 2:10 p.m. UTC | #2
On Tue, Jun 04, 2019 at 06:11:23PM +0100, Ian Jackson wrote:
> Anthony PERARD writes ("[PATCH 4/9] libxl_internal: Create new lock for devices hotplug via QMP"):
> > The current lock `domain_userdata_lock' can't be used when modification
> > to a guest is done by sending command to QEMU, this is a slow process
> > and requires to call CTX_UNLOCK, which is not possible while holding
> > the `domain_userdata_lock'.
> ...
> > +struct libxl__domain_qmp_lock {
> > +    libxl__generic_lock lock;
> > +};
> > +
> > +libxl__domain_qmp_lock *libxl__lock_domain_qmp(libxl__gc *gc,
> > +                                               libxl_domid domid)
> > +{
> > +    libxl__domain_qmp_lock *lock = NULL;
> > +    int rc;
> > +
> > +    lock = libxl__zalloc(NOGC, sizeof(*lock));
> > +    rc = libxl__lock_generic(gc, domid, &lock->lock,
> > +                             "libxl-device-changes-lock");
> > +    if (rc) {
> > +        libxl__unlock_domain_qmp(lock);
> > +        return NULL;
> > +    }
> > +
> > +    return lock;
> 
> This is almost identical to the libxl__lock_domain_userdata which
> appeared in the previous patch.  That suggests that the factorisation
> here is at the wrong layer.
> 
> > +void libxl__unlock_domain_qmp(libxl__domain_qmp_lock *lock)
> > +{
> > +    libxl__unlock_generic(&lock->lock);
> > +    free(lock);
> > +}
> 
> This is completely identical to libxl__unlock_domain_userdata.  The
> only reason we have to two of these is so that the two locks are
> distinguished by the type of the lock struct.  That means that you
> have to call
>    libxl__unlock_domain_qmp(foo->qmp_lock)
> but
>    libxl__unlock_domain_userdate(foo->userdata_lock)
> or some such, and the compiler will spot if you write
>    libxl__unlock_domain_userdata(foo->qmp_lock)
> or something.  But is it really useful to have to write the `qmp' vs
> `userdata' thing twice ?

So, instead of that interface, how about a different one that uses the
same C type for both kind of lock?

    libxl__lock *libxl__lock_domain_userdata(libxl__gc *, libxl_domid);
    libxl__lock *libxl__lock_domain_qmp(libxl__gc *, libxl_domid);
    void libxl__unlock(libxl__lock *);

Or maybe avoid having two functions for locking and use a #define/enum
instead:
    libxl__lock_domain(gc, domid, LOCK_USERDATA);
    libxl__lock_domain(gc, domid, LOCK_QMP);

I'm not sur what should the last parameter be. Either a string or a
enum. An enum would be better because the lock filename would be all in
a single location in the source code.

What do you think? Would the first proposal be enough to avoid having to
write `userdata' or `qmp' twice on unlock?

> > + * It is to be acquired by an ao event callback.
> 
> I think there is a worse problem here, though.  This lock is probably
> *inside* some lock from the caller.  So usual libxl rules apply and
> you may not block to acquaire it.
> 
> Ie libxl__lock_domain_qmp must be one of these things that takes a
> little ev state struct and makes a callback.
> 
> At the syscall level, acquiring an fcntl lock cannot be selected on.
> The options are to poll, or to spawn a thread, or to fork.
> 
> Currently libxl does not call pthread_create, although maybe it could
> do.  To safely create a thread you have to do a dance with
> sigprocmask, to avoid signal delivery onto your thread.
> 
> Maybe it would be better to try once with LOCK_NB, and to fork if this
> is not successful.  But it would be simpler to always fork...

After our talk IRL, I'll go the fork route.
Also, I'm thinking to always fork when libxl is built with "debug=y",
and allow the optimisation of trying first with LOCK_NB when built with
"debug=n", so the forked code will actually be tested regulary.

Thanks,
Ian Jackson June 5, 2019, 2:32 p.m. UTC | #3
Anthony PERARD writes ("Re: [PATCH 4/9] libxl_internal: Create new lock for devices hotplug via QMP"):
> So, instead of that interface, how about a different one that uses the
> same C type for both kind of lock?
> 
>     libxl__lock *libxl__lock_domain_userdata(libxl__gc *, libxl_domid);
>     libxl__lock *libxl__lock_domain_qmp(libxl__gc *, libxl_domid);
>     void libxl__unlock(libxl__lock *);

I think that would be fine.

> Or maybe avoid having two functions for locking and use a #define/enum
> instead:
>     libxl__lock_domain(gc, domid, LOCK_USERDATA);
>     libxl__lock_domain(gc, domid, LOCK_QMP);

Or this.

But I think maybe this conversation will be superseded by the need to
redo the implementation which will result in a totally different API
for the slow lock, and probably a different state struct.

> What do you think? Would the first proposal be enough to avoid having to
> write `userdata' or `qmp' twice on unlock?

I don't *mind* the writing `userdata' or `qmp' twice.  I just
discounting it as a *benefit*.  I mind the duplicated implementation
code.

> > Maybe it would be better to try once with LOCK_NB, and to fork if this
> > is not successful.  But it would be simpler to always fork...
> 
> After our talk IRL, I'll go the fork route.
> Also, I'm thinking to always fork when libxl is built with "debug=y",
> and allow the optimisation of trying first with LOCK_NB when built with
> "debug=n", so the forked code will actually be tested regulary.

Good plan.

Ian.
diff mbox series

Patch

diff --git a/tools/libxl/libxl_internal.c b/tools/libxl/libxl_internal.c
index fa0bbc3960..db281ac259 100644
--- a/tools/libxl/libxl_internal.c
+++ b/tools/libxl/libxl_internal.c
@@ -607,6 +607,33 @@  void libxl__update_domain_configuration(libxl__gc *gc,
     dst->b_info.video_memkb = src->b_info.video_memkb;
 }
 
+struct libxl__domain_qmp_lock {
+    libxl__generic_lock lock;
+};
+
+libxl__domain_qmp_lock *libxl__lock_domain_qmp(libxl__gc *gc,
+                                               libxl_domid domid)
+{
+    libxl__domain_qmp_lock *lock = NULL;
+    int rc;
+
+    lock = libxl__zalloc(NOGC, sizeof(*lock));
+    rc = libxl__lock_generic(gc, domid, &lock->lock,
+                             "libxl-device-changes-lock");
+    if (rc) {
+        libxl__unlock_domain_qmp(lock);
+        return NULL;
+    }
+
+    return lock;
+}
+
+void libxl__unlock_domain_qmp(libxl__domain_qmp_lock *lock)
+{
+    libxl__unlock_generic(&lock->lock);
+    free(lock);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index f1aefaf98a..43b44f2c30 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -2561,6 +2561,21 @@  typedef struct libxl__ao_device libxl__ao_device;
 typedef struct libxl__multidev libxl__multidev;
 typedef void libxl__device_callback(libxl__egc*, libxl__ao_device*);
 
+/*
+ * Lock for device hotplug, qmp_lock.
+ *
+ * This lock is outside the json_lock lock in lock hierarchy.
+ * It is also outside the libxl_ctx lock.
+ * It is to be acquired by an ao event callback.
+ *
+ * It is to be acquired when adding/removing devices or making changes
+ * to them via QMP, when this is a slow operation.
+ */
+typedef struct libxl__domain_qmp_lock libxl__domain_qmp_lock;
+_hidden libxl__domain_qmp_lock *libxl__lock_domain_qmp(libxl__gc *gc,
+                                                           libxl_domid domid);
+_hidden void libxl__unlock_domain_qmp(libxl__domain_qmp_lock *lock);
+
 /* This functions sets the necessary libxl__ao_device struct values to use
  * safely inside functions. It marks the operation as "active"
  * since we need to be sure that all device status structs are set
@@ -2749,11 +2764,11 @@  struct libxl__multidev {
  * device information, in JSON files, so that we can use this JSON
  * file as a template to reconstruct domain configuration.
  *
- * In essense there are now two views of device state, one is xenstore,
- * the other is JSON file. We use xenstore as primary reference.
+ * In essense there are now two views of device state, one is the
+ * primary config (xenstore or QEMU), the other is JSON file.
  *
- * Here we maintain one invariant: every device in xenstore must have
- * an entry in JSON file.
+ * Here we maintain one invariant: every device in the primary config
+ * must have an entry in JSON file.
  *
  * All device hotplug routines should comply to following pattern:
  *   lock json config (json_lock)
@@ -2768,6 +2783,22 @@  struct libxl__multidev {
  *       end for loop
  *   unlock json config
  *
+ * Or in case QEMU is the primary config, this pattern can be use:
+ *   lock qmp (qmp_lock)
+ *      lock json config (json_lock)
+ *          read json config
+ *          update in-memory json config with new entry, replacing
+ *             any stale entry
+ *      unlock json config
+ *      apply new config to primary config
+ *      lock json config (json_lock)
+ *          read json config
+ *          update in-memory json config with new entry, replacing
+ *             any stale entry
+ *          write in-memory json config to disk
+ *      unlock json config
+ *   unlock qmp
+ *
  * Device removal routines are not touched.
  *
  * Here is the proof that we always maintain that invariant and we