diff mbox

[v3,4/5] libxl: update vcpus bitmap in retrieved guest config

Message ID 1465983102-19308-5-git-send-email-wei.liu2@citrix.com (mailing list archive)
State New, archived
Headers show

Commit Message

Wei Liu June 15, 2016, 9:31 a.m. UTC
... because the available vcpu bitmap can change during domain life time
due to cpu hotplug and unplug.

For QEMU upstream, we interrogate QEMU for the number of vcpus. For
others, we look directly into xenstore for information.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Anthony PERARD <anthony.perard@citrix.com>

v3:
1. Fix indentation of abort.
2. Use strcmp instead of strncmp.
---
 tools/libxl/libxl.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 96 insertions(+)

Comments

Anthony PERARD June 15, 2016, 5:20 p.m. UTC | #1
On Wed, Jun 15, 2016 at 10:31:41AM +0100, Wei Liu wrote:
> ... because the available vcpu bitmap can change during domain life time
> due to cpu hotplug and unplug.
> 
> For QEMU upstream, we interrogate QEMU for the number of vcpus. For
> others, we look directly into xenstore for information.
> 
> Reported-by: Jan Beulich <jbeulich@suse.com>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>

Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Ian Jackson July 8, 2016, 5:44 p.m. UTC | #2
Wei Liu writes ("[PATCH v3 4/5] libxl: update vcpus bitmap in retrieved guest config"):
> ... because the available vcpu bitmap can change during domain life time
> due to cpu hotplug and unplug.
> 
> For QEMU upstream, we interrogate QEMU for the number of vcpus. For
> others, we look directly into xenstore for information.
...
> +static int libxl__update_avail_vcpus_qmp(libxl__gc *gc, uint32_t domid,
> +                                         unsigned int max_vcpus,
> +                                         libxl_bitmap *map)
> +{
> +    int rc;
> +
> +    /* For QEMU upstream we always need to return the number
> +     * of cpus present to QEMU whether they are online or not;
> +     * otherwise QEMU won't accept the saved state.

This comment is confusing to me.  Perhaps `return' should read
`provide' ?  But the connection between the body of this function and
the comment is still obscure.

> +static int libxl__update_avail_vcpus_xenstore(libxl__gc *gc, uint32_t domid,
> +                                              unsigned int max_vcpus,
> +                                              libxl_bitmap *map)
> +{
> +    int rc;
> +    unsigned int i;
> +    const char *dompath;
...
> +    for (i = 0; i < max_vcpus; i++) {
> +        const char *path = GCSPRINTF("%s/cpu/%u/availability", dompath, i);
> +        const char *content = libxl__xs_read(gc, XBT_NULL, path);
> +        if (!strcmp(content, "online"))
> +            libxl_bitmap_set(map, i);

Firstly, this should be libxl__xs_read_checked.  Secondly if "content"
is NULL, this will crash.

> @@ -7294,6 +7341,55 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
>          libxl_dominfo_dispose(&info);
>      }
>  
> +    /* VCPUs */
> +    {
> +        libxl_bitmap *map = &d_config->b_info.avail_vcpus;
> +        unsigned int max_vcpus = d_config->b_info.max_vcpus;
> +        libxl_device_model_version version;
> +
> +        libxl_bitmap_dispose(map);
> +        libxl_bitmap_init(map);
> +        libxl_bitmap_alloc(CTX, map, max_vcpus);
> +        libxl_bitmap_set_none(map);

I see that this function already trashes the domain config when it
fails (leaving it up to the caller to free the partial config) but
that this is not documented :-/.

> +        /* If the user did not explicitly specify a device model
> +         * version, we need to use the same logic used during domain
> +         * creation to derive the device model version.
> +         */
> +        version = d_config->b_info.device_model_version;
> +        if (version == LIBXL_DEVICE_MODEL_VERSION_UNKNOWN)
> +            version = libxl__get_device_model_version(gc, &d_config->b_info);

I think this is rather unfortunate.  Won't changing the default device
model while the domain is running will cause this function to give
wrong answers ?

Thanks,
Ian.
Wei Liu July 11, 2016, 11:31 a.m. UTC | #3
On Fri, Jul 08, 2016 at 06:44:28PM +0100, Ian Jackson wrote:
> Wei Liu writes ("[PATCH v3 4/5] libxl: update vcpus bitmap in retrieved guest config"):
> > ... because the available vcpu bitmap can change during domain life time
> > due to cpu hotplug and unplug.
> > 
> > For QEMU upstream, we interrogate QEMU for the number of vcpus. For
> > others, we look directly into xenstore for information.
> ...
> > +static int libxl__update_avail_vcpus_qmp(libxl__gc *gc, uint32_t domid,
> > +                                         unsigned int max_vcpus,
> > +                                         libxl_bitmap *map)
> > +{
> > +    int rc;
> > +
> > +    /* For QEMU upstream we always need to return the number
> > +     * of cpus present to QEMU whether they are online or not;
> > +     * otherwise QEMU won't accept the saved state.
> 
> This comment is confusing to me.  Perhaps `return' should read
> `provide' ?  But the connection between the body of this function and
> the comment is still obscure.
> 

"provide" is ok.

To avoid confusion the comment can be moved a few line above to describe
the function, not the code snippet. Does that work better?

> > +static int libxl__update_avail_vcpus_xenstore(libxl__gc *gc, uint32_t domid,
> > +                                              unsigned int max_vcpus,
> > +                                              libxl_bitmap *map)
> > +{
> > +    int rc;
> > +    unsigned int i;
> > +    const char *dompath;
> ...
> > +    for (i = 0; i < max_vcpus; i++) {
> > +        const char *path = GCSPRINTF("%s/cpu/%u/availability", dompath, i);
> > +        const char *content = libxl__xs_read(gc, XBT_NULL, path);
> > +        if (!strcmp(content, "online"))
> > +            libxl_bitmap_set(map, i);
> 
> Firstly, this should be libxl__xs_read_checked.  Secondly if "content"
> is NULL, this will crash.
> 

Will fix.

> > @@ -7294,6 +7341,55 @@ int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
> >          libxl_dominfo_dispose(&info);
> >      }
> >  
> > +    /* VCPUs */
> > +    {
> > +        libxl_bitmap *map = &d_config->b_info.avail_vcpus;
> > +        unsigned int max_vcpus = d_config->b_info.max_vcpus;
> > +        libxl_device_model_version version;
> > +
> > +        libxl_bitmap_dispose(map);
> > +        libxl_bitmap_init(map);
> > +        libxl_bitmap_alloc(CTX, map, max_vcpus);
> > +        libxl_bitmap_set_none(map);
> 
> I see that this function already trashes the domain config when it
> fails (leaving it up to the caller to free the partial config) but
> that this is not documented :-/.

d_config is an out parameter. User can't expect the returned d_config to
be valid if this API returns error state. And user is still responsible
for calling _init before calling this API and _dispose afterwards. So
this still fits into how we expect libxl types to be used. Nothing new
needs to be documented.

All other fields are already handled in this way.

> 
> > +        /* If the user did not explicitly specify a device model
> > +         * version, we need to use the same logic used during domain
> > +         * creation to derive the device model version.
> > +         */
> > +        version = d_config->b_info.device_model_version;
> > +        if (version == LIBXL_DEVICE_MODEL_VERSION_UNKNOWN)
> > +            version = libxl__get_device_model_version(gc, &d_config->b_info);
> 
> I think this is rather unfortunate.  Won't changing the default device
> model while the domain is running will cause this function to give
> wrong answers ?

I think saving the device model into the template should work. No need
to derive it during runtime.

Wei.

> 
> Thanks,
> Ian.
Wei Liu July 11, 2016, 2:59 p.m. UTC | #4
On Mon, Jul 11, 2016 at 12:31:43PM +0100, Wei Liu wrote:
[...]
> 
> > 
> > > +        /* If the user did not explicitly specify a device model
> > > +         * version, we need to use the same logic used during domain
> > > +         * creation to derive the device model version.
> > > +         */
> > > +        version = d_config->b_info.device_model_version;
> > > +        if (version == LIBXL_DEVICE_MODEL_VERSION_UNKNOWN)
> > > +            version = libxl__get_device_model_version(gc, &d_config->b_info);
> > 
> > I think this is rather unfortunate.  Won't changing the default device
> > model while the domain is running will cause this function to give
> > wrong answers ?
> 
> I think saving the device model into the template should work. No need
> to derive it during runtime.
> 

Actually this information is already available via
libxl__device_model_version_running.

Wei.
diff mbox

Patch

diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 65af9ee..b748bf1 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -7246,6 +7246,53 @@  void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, libxl_mac *src)
         (*dst)[i] = (*src)[i];
 }
 
+static int libxl__update_avail_vcpus_qmp(libxl__gc *gc, uint32_t domid,
+                                         unsigned int max_vcpus,
+                                         libxl_bitmap *map)
+{
+    int rc;
+
+    /* For QEMU upstream we always need to return the number
+     * of cpus present to QEMU whether they are online or not;
+     * otherwise QEMU won't accept the saved state.
+     */
+    rc = libxl__qmp_query_cpus(gc, domid, map);
+    if (rc) {
+        LOG(ERROR, "fail to get number of cpus for domain %d", domid);
+        goto out;
+    }
+
+    rc = 0;
+out:
+    return rc;
+}
+
+static int libxl__update_avail_vcpus_xenstore(libxl__gc *gc, uint32_t domid,
+                                              unsigned int max_vcpus,
+                                              libxl_bitmap *map)
+{
+    int rc;
+    unsigned int i;
+    const char *dompath;
+
+    dompath = libxl__xs_get_dompath(gc, domid);
+    if (!dompath) {
+        rc = ERROR_FAIL;
+        goto out;
+    }
+
+    for (i = 0; i < max_vcpus; i++) {
+        const char *path = GCSPRINTF("%s/cpu/%u/availability", dompath, i);
+        const char *content = libxl__xs_read(gc, XBT_NULL, path);
+        if (!strcmp(content, "online"))
+            libxl_bitmap_set(map, i);
+    }
+
+    rc = 0;
+out:
+    return rc;
+}
+
 int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
                                         libxl_domain_config *d_config)
 {
@@ -7294,6 +7341,55 @@  int libxl_retrieve_domain_configuration(libxl_ctx *ctx, uint32_t domid,
         libxl_dominfo_dispose(&info);
     }
 
+    /* VCPUs */
+    {
+        libxl_bitmap *map = &d_config->b_info.avail_vcpus;
+        unsigned int max_vcpus = d_config->b_info.max_vcpus;
+        libxl_device_model_version version;
+
+        libxl_bitmap_dispose(map);
+        libxl_bitmap_init(map);
+        libxl_bitmap_alloc(CTX, map, max_vcpus);
+        libxl_bitmap_set_none(map);
+
+        /* If the user did not explicitly specify a device model
+         * version, we need to use the same logic used during domain
+         * creation to derive the device model version.
+         */
+        version = d_config->b_info.device_model_version;
+        if (version == LIBXL_DEVICE_MODEL_VERSION_UNKNOWN)
+            version = libxl__get_device_model_version(gc, &d_config->b_info);
+
+        switch (d_config->b_info.type) {
+        case LIBXL_DOMAIN_TYPE_HVM:
+            switch (version) {
+            case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN:
+                rc = libxl__update_avail_vcpus_qmp(gc, domid,
+                                                   max_vcpus, map);
+                break;
+            case LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL:
+            case LIBXL_DEVICE_MODEL_VERSION_NONE:
+                rc = libxl__update_avail_vcpus_xenstore(gc, domid,
+                                                        max_vcpus, map);
+                break;
+            default:
+                abort();
+            }
+            break;
+        case LIBXL_DOMAIN_TYPE_PV:
+            rc = libxl__update_avail_vcpus_xenstore(gc, domid,
+                                                    max_vcpus, map);
+            break;
+        default:
+            abort();
+        }
+
+        if (rc) {
+            LOG(ERROR, "fail to update available cpu map for domain %d", domid);
+            goto out;
+        }
+    }
+
     /* Memory limits:
      *
      * Currently there are three memory limits: