Message ID | 20240407143633.24108-1-jandryuk@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [v2] libxl: Enable stubdom cdrom changing | expand |
On Sun, Apr 7, 2024 at 10:36 AM Jason Andryuk <jandryuk@gmail.com> wrote: > > To change the cd-rom medium, libxl will: > - QMP eject the medium from QEMU > - block-detach the old PV disk > - block-attach the new PV disk > - QMP change the medium to the new PV disk by fdset-id > > The QMP code is reused, and remove and attach are implemented here. > > The stubdom must internally handle adding /dev/xvdc to the appropriate > fdset. libxl in dom0 doesn't see the result of adding to the fdset as > that is internal to the stubdom, but the fdset's opaque fields will be > set to stub-devid:$devid, so libxl can identify it. $devid is common > between the stubdom and libxl, so it can be identified on both side. > The stubdom will name the device xvdY regardless of the guest name hdY, > sdY, or xvdY, but the stubdom will be assigned the same devid > facilitating lookup. Because the stubdom add-fd call is asynchronous, > libxl needs to poll query-fdsets to identify when add-fd has completed. > > For cd-eject, we still need to attach the empty vbd. This is necessary > since xenstore is used to determine that hdc exists. Otherwise after > eject, hdc would be gone and the cd-insert would fail to find the drive > to insert new media. > > Signed-off-by: Jason Andryuk <jandryuk@gmail.com> > Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> > --- Also, these are the stubdom side changes. The stubdom QEMU command line should gain a qmp socket for the script: -chardev socket,server=on,wait=off,path=/tmp/qemu-cdrom.qmp,id=m-cdrom -mon chardev=m-cdrom,mode=control mdev.conf: # Only add for addition xvd[a-d] 0:0 660 @qemu-xvdc-add-fd.sh And the attached qemu-xvdc-add-fd.sh script to call add-fd with "opaque" set as "stub-devid:$devid". DEVPATH is only set on hot plug and not cold plug. Using it when available avoids IO - busybox `readlink` will need to be installed in the stubdom. If cold plugging is not performed, libxl skips issuing the remove-fd call when it can't find a matching fdset. Regards, Jason
On Sun, Apr 07, 2024 at 10:36:33AM -0400, Jason Andryuk wrote: > diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c > index fa7856f28c..819d34933b 100644 > --- a/tools/libs/light/libxl_disk.c > +++ b/tools/libs/light/libxl_disk.c > @@ -829,21 +829,122 @@ int libxl_device_disk_getinfo(libxl_ctx *ctx, uint32_t domid, > return rc; > } > > +/* > + * Search through the query-fdsets JSON looking for a matching devid. > + * > + * If found, return the fdset-id integer (>=0). > + * > + * If not found, return ERROR_NOTFOUND. > + * > + * On error, return libxl ERROR_*. > + */ > +static int query_fdsets_find_fdset(libxl__gc *gc, > + const libxl__json_object *response, > + int devid) > +{ > + const libxl__json_object *fdset; > + const char *needle = GCSPRINTF("stub-devid:%d", devid); > + int i, j, rc; > + > + /* query-fdsets returns: > + * [ > + * { "fds": [ > + * { "fd": 30, > + * "opaque": "stub-devid:2080" > + * } > + * ], > + * "fdset-id": 1 > + * } > + * ] > + */ > + for (i = 0; (fdset = libxl__json_array_get(response, i)); i++) { > + const libxl__json_object *fdset_id; > + const libxl__json_object *fds; > + const libxl__json_object *fd; > + > + fdset_id = libxl__json_map_get("fdset-id", fdset, JSON_INTEGER); > + if (!fdset_id) { > + rc = ERROR_QEMU_API; > + goto out; > + } > + LOG(DEBUG, "fdset-id=%lld", libxl__json_object_get_integer(fdset_id)); This feels like we are going to have a log of logging about information that isn't going to be used by libxl. Also, an fdset-id is also logged by the caller of this function. (But even there, it might not be useful). When debugging, it might be more helpful to run `query-fdset` by hand than having libxl listing every possible fdset. > + fds = libxl__json_map_get("fds", fdset, JSON_ARRAY); > + if (!fds) { > + rc = ERROR_QEMU_API; > + goto out; > + } > + for (j = 0; (fd = libxl__json_array_get(fds, j)); j++) { > + const libxl__json_object *fd_num; > + const libxl__json_object *opaque; > + const char *opaque_str; > + > + fd_num = libxl__json_map_get("fd", fd, JSON_INTEGER); > + if (!fd_num) { > + rc = ERROR_QEMU_API; > + goto out; > + } > + opaque = libxl__json_map_get("opaque", fd, JSON_STRING); > + if (!opaque) { > + continue; > + } > + > + opaque_str = libxl__json_object_get_string(opaque); > + LOG(DEBUG, "fd=%lld opaque='%s'", > + libxl__json_object_get_integer(fd_num), opaque_str); This logging is also probably too verbose. First, libxl never care about which fd QEMU is using, second, if the opaque doesn't match, it is probably not the one we want, and the needed one is just missing. By the way, there's a big hammer that can pottentiolly be used when debuging QMP interaction, rebuild libxl with -DDEBUG_QMP_CLIENT, this will log every QMP command sent and received. > + if (strcmp(opaque_str, needle) == 0) { > + return libxl__json_object_get_integer(fdset_id); > + } > + } > + } > + rc = ERROR_NOTFOUND; > + > + out: > + return rc; > +} > + > typedef struct { > libxl__ao *ao; > + libxl__ao_device aodev; > libxl_domid domid; > + libxl_domid disk_domid; > libxl_device_disk *disk; > libxl_device_disk disk_saved; > libxl__ev_slowlock qmp_lock; > int dm_ver; > libxl__ev_time time; > + libxl__ev_time timeout_retry; `timeout_retry` sounds to me that we are adding a timeout for a retry, but it is used the opposite way, as a timer to wait until we retry. How about naming this filed "retry_timer" instead? Ah, I see you've added a similar field named "timeout_retries" in libxl_pci.c in the past, but I did introduce a "retry_timer" field in that same file before. So either is fine even if I've got a preference. > libxl__ev_qmp qmp; > + int retries; > + int stubdom_fdset; > } libxl__cdrom_insert_state; > > static void cdrom_insert_lock_acquired(libxl__egc *, libxl__ev_slowlock *, > int rc); > static void cdrom_insert_qmp_connected(libxl__egc *, libxl__ev_qmp *, > const libxl__json_object *, int rc); > +static void cdrom_insert_stubdom_query_fdset_rm(libxl__egc *egc, > + libxl__ev_qmp *qmp, > + const libxl__json_object *resp, > + int rc); > +static void cdrom_insert_stubdom_parse_fdset_rm(libxl__egc *egc, > + libxl__ev_qmp *qmp, > + const libxl__json_object *resp, > + int rc); > +static void cdrom_insert_stubdom_ejected(libxl__egc *egc, libxl__ev_qmp *, > + const libxl__json_object *, int rc); > +static void cdrom_insert_stubdom_disk_remove_cb(libxl__egc *egc, > + libxl__ao_device *aodev); > +static void cdrom_insert_stubdom_disk_add_cb(libxl__egc *egc, > + libxl__ao_device *aodev); > +static void cdrom_insert_stubdom_query_fdset(libxl__egc *egc, > + libxl__ev_time *ev, > + const struct timeval *abs, > + int rc); > +static void cdrom_insert_stubdom_parse_fdset(libxl__egc *egc, > + libxl__ev_qmp *qmp, > + const libxl__json_object *response, > + int rc); > static void cdrom_insert_ejected(libxl__egc *egc, libxl__ev_qmp *, > const libxl__json_object *, int rc); > static void cdrom_insert_addfd_cb(libxl__egc *egc, libxl__ev_qmp *, > @@ -865,6 +966,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk, > libxl_device_disk *disks = NULL; > int rc; > libxl__cdrom_insert_state *cis; > + libxl_domid stubdomid; > > GCNEW(cis); > cis->ao = ao; > @@ -876,6 +978,8 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk, > cis->qmp_lock.ao = ao; > cis->qmp_lock.domid = domid; > libxl__ev_time_init(&cis->time); > + libxl__ev_time_init(&cis->timeout_retry); > + cis->retries = 0; Could you initialise `stubdom_fdset` as well? It's looks like it's possible to have an fdset==0, so initialising it to -1 would be helpful. (And technically, retries is already init to 0, because GCNEW() fill do that for us, by using zallloc.) > libxl__ev_qmp_init(&cis->qmp); > cis->qmp.ao = ao; > cis->qmp.domid = domid; > @@ -1002,6 +1120,224 @@ out: > cdrom_insert_done(egc, cis, rc); /* must be last */ > } > > +static void cdrom_insert_stubdom_query_fdset_rm(libxl__egc *egc, > + libxl__ev_qmp *qmp, > + const libxl__json_object *resp, > + int rc) > +{ > + libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp); > + STATE_AO_GC(cis->ao); It doesn't looks like we need to call STATE_AO_GC() here, nothing uses an `ao` or a `gc`, and nothing needs an allocation that needs to exist after this function returns. > + if (rc) goto out; > + > + /* Only called for qemu-xen/linux stubdom. */ > + assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN); > + > + cis->qmp.callback = cdrom_insert_stubdom_parse_fdset_rm; > + > + rc = libxl__ev_qmp_send(egc, &cis->qmp, "query-fdsets", NULL); > + if (rc) goto out; > + > + return; > + > + out: > + cdrom_insert_done(egc, cis, rc); /* must be last */ > +} > + > +static void cdrom_insert_stubdom_parse_fdset_rm(libxl__egc *egc, > + libxl__ev_qmp *qmp, > + const libxl__json_object *resp, > + int rc) > +{ > + EGC_GC; > + libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp); > + int devid; > + int fdset; > + > + if (rc) goto out; > + > + /* Only called for qemu-xen/linux stubdom. */ > + assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN); > + > + devid = libxl__device_disk_dev_number(cis->disk->vdev, NULL, NULL); > + fdset = query_fdsets_find_fdset(gc, resp, devid); > + if (fdset >= 0) goto found; Could you try to reorder the function to avoid this goto? It's fine to have goto for the out/error path because you can write one exit path for the function, but otherwise, I think it's rare that goto make the code easier to understand. > + if (fdset != ERROR_NOTFOUND) { > + rc = fdset; > + goto out; > + } > + > + LOGD(DEBUG, cis->domid, "No fdset found - skipping remove-fd"); > + cdrom_insert_stubdom_ejected(egc, qmp, resp, 0); > + > + return; > + > + found: > + cis->stubdom_fdset = fdset; Is this value in `cis->stubdom_fdset` used anywhere? We are calling "remove-fd" on it so I don't think we are going to use it. So better not to keep it. > + LOGD(DEBUG, cis->domid, "Found fdset %d", cis->stubdom_fdset); > + > + libxl__json_object *args = NULL; > + > + libxl__qmp_param_add_integer(gc, &args, "fdset-id", cis->stubdom_fdset); > + > + cis->qmp.callback = cdrom_insert_stubdom_ejected; > + > + rc = libxl__ev_qmp_send(egc, &cis->qmp, "remove-fd", args); > + if (rc) goto out; > + > + return; > + > + out: > + cdrom_insert_done(egc, cis, rc); /* must be last */ > +} > + > +static void cdrom_insert_stubdom_disk_add_cb(libxl__egc *egc, > + libxl__ao_device *aodev) > +{ > + EGC_GC; > + libxl__cdrom_insert_state *cis = CONTAINER_OF(aodev, *cis, aodev); > + > + if (aodev->rc) { > + LOGD(ERROR, aodev->dev->domid, "Unable to insert stubdom PV disk id %u", > + aodev->dev->devid); > + goto out; > + } > + > + cdrom_insert_stubdom_query_fdset(egc, &cis->timeout_retry, NULL, aodev->rc); Last parameter here could just be ERROR_TIMEDOUT, as this would be the expected rc by the function anyway. And `aodev->rc` should be 0 here anyway. > + return; > + > + out: > + cdrom_insert_done(egc, cis, aodev->rc); > +} > + > +static void cdrom_insert_stubdom_query_fdset(libxl__egc *egc, > + libxl__ev_time *ev, > + const struct timeval *abs, > + int rc) > +{ > + EGC_GC; > + libxl__cdrom_insert_state *cis = CONTAINER_OF(ev, *cis, timeout_retry); > + > + /* When called as an ev_time callback, rc will be ERROR_TIMEDOUT.*/ This comment isn't really useful. Also, we could just rewrite the error check to expect ERROR_TIMEDOUT, or it's an error. Looking at libxl__ev_time_callback typedef in libxl_internal.c, the only value `rc` should have are ERROR_TIMEDOUT or ERROR_ABORTED. > + if (rc && rc != ERROR_TIMEDOUT) goto out; > + > + /* Only called for qemu-xen/linux stubdom. */ > + assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN); > + > + cis->qmp.callback = cdrom_insert_stubdom_parse_fdset; > + > + rc = libxl__ev_qmp_send(egc, &cis->qmp, "query-fdsets", NULL); > + if (rc) goto out; > + > + return; > + > + out: > + cdrom_insert_done(egc, cis, rc); /* must be last */ > +} > + > +static void cdrom_insert_stubdom_parse_fdset(libxl__egc *egc, > + libxl__ev_qmp *qmp, > + const libxl__json_object *response, > + int rc) > +{ > + EGC_GC; > + libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp); > + int devid; > + int fdset; > + > + if (rc) goto out; > + > + /* Only called for qemu-xen/linux stubdom. */ > + assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN); > + > + devid = libxl__device_disk_dev_number(cis->disk->vdev, NULL, NULL); > + fdset = query_fdsets_find_fdset(gc, response, devid); > + if (fdset >= 0) goto found; Like before, would it be possible to rewrite the function without this `goto found`? > + if (fdset != ERROR_NOTFOUND) { > + rc = fdset; > + goto out; > + } > + > + if (cis->retries++ > 10) { Looking at this number, and the retry timer set to 200ms, it looks like libxl will wait for only about 2 second, is that enough? We could just rely on the general cdrom insert timeout `cis->time`, or is that too long? > + LOGD(DEBUG, cis->domid, "Out of query-fdsets retries"); It this useful? Also, an error message might be more useful, and an error message that say something about stubdom not doing its part of the job and libxl gave up waiting. > + rc = ERROR_TIMEDOUT; > + goto out; > + } > + > + LOGD(DEBUG, cis->domid, "Scheduling query-fdsets retry %d", cis->retries); That just going to spam the console, I don't think it's useful. We already have logging about a "query-fdsets" command been ran. > + rc = libxl__ev_time_register_rel(cis->ao, &cis->timeout_retry, > + cdrom_insert_stubdom_query_fdset, > + 200); > + if (rc) goto out; > + > + return; > + > + found: > + cis->stubdom_fdset = fdset; > + > + LOGD(DEBUG, cis->domid, "Found fdset %d", cis->stubdom_fdset); > + cdrom_insert_ejected(egc, &cis->qmp, NULL, rc); > + return; > + > + out: > + cdrom_insert_done(egc, cis, rc); /* must be last */ > +} > + > static void cdrom_insert_ejected(libxl__egc *egc, > libxl__ev_qmp *qmp, > const libxl__json_object *response, > @@ -1081,10 +1417,13 @@ static void cdrom_insert_ejected(libxl__egc *egc, > rc = libxl__dm_check_start(gc, &d_config, domid); > if (rc) goto out; > > + /* A linux stubdom will perform add-fd with calculated stubdom_fdset. */ Left over comment? > if (cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN && > + libxl_get_stubdom_id(CTX, cis->domid) == 0 && > disk->format != LIBXL_DISK_FORMAT_EMPTY) { > libxl__json_object *args = NULL; > > + LOGD(DEBUG, cis->domid, "Doing QMP add-fd path"); I don't think this logging is useful. > assert(qmp->payload_fd == -1); > qmp->payload_fd = open(disk->pdev_path, O_RDONLY); > if (qmp->payload_fd < 0) { > @@ -1094,7 +1433,11 @@ static void cdrom_insert_ejected(libxl__egc *egc, > goto out; > } > > - /* This free form parameter is not use by QEMU or libxl. */ > + /* > + * This free form parameter is not used by QEMU or non-stubdom libxl. > + * For a linux stubdom, opaque is set to "stub-devid:$devid" to > + * identify the fdset. > + */ Nothing is going to use this particular "opaque" value, right? The comment say that for stubdom, the value is set to a particular value, but it's not done so here. A comment about "stub-devid:$devid" might be useful in query_fdsets_find_fdset(), but not here. > QMP_PARAMETERS_SPRINTF(&args, "opaque", "%s:%s", > libxl_disk_format_to_string(disk->format), > disk->pdev_path); > @@ -1103,6 +1446,7 @@ static void cdrom_insert_ejected(libxl__egc *egc, > if (rc) goto out; > has_callback = true; > } else { > + LOGD(DEBUG, cis->domid, "Skipping QMP add-fd path"); I think we can found out about that because "add-fd" isn't called. It doesn't feel like a useful logging. > has_callback = false; > } > > @@ -1115,8 +1459,16 @@ out: > if (rc) { > cdrom_insert_done(egc, cis, rc); /* must be last */ > } else if (!has_callback) { > - /* Only called if no asynchronous callback are set. */ This comment should still apply, even if we run for a stubdom, otherwise, `has_callback` is badly named. > - cdrom_insert_inserted(egc, qmp, NULL, 0); /* must be last */ > + if (libxl_get_stubdom_id(CTX, cis->domid) && > + disk->format != LIBXL_DISK_FORMAT_EMPTY) { > + LOGD(DEBUG, cis->domid, > + "stubdom %d needs to perform add-fd internally", > + libxl_get_stubdom_id(CTX, cis->domid)); > + cdrom_insert_addfd_cb(egc, qmp, NULL, 0); /* must be last */ > + } else { > + /* Only called if no asynchronous callback are set. */ > + cdrom_insert_inserted(egc, qmp, NULL, 0); /* must be last */ > + } > } > } > > @@ -1135,17 +1487,22 @@ static void cdrom_insert_addfd_cb(libxl__egc *egc, > /* convenience aliases */ > libxl_device_disk *disk = cis->disk; > > - close(qmp->payload_fd); > - qmp->payload_fd = -1; I think I'd rather keep this here, before the error check. Just call close() conditionally on the value of payload_fd? The thing is, if payload_fd is still set, libxl__ev_qmp* is going to try to submit the fd again, and closing the fd earlier is better. > if (rc) goto out; > > - o = libxl__json_map_get("fdset-id", response, JSON_INTEGER); > - if (!o) { > - rc = ERROR_FAIL; > - goto out; > + /* response non-NULL only for non-stubdom */ > + if (response) { I'm not comfortable with checking the value of `response` to find out if we are on the linux stubdom path. It could be missing for other reason (probably on programming error). There's `cis->stubdom_fdset` that should have a correct value in the linux stubdom case, can we rely on that instead? (Which also refer to why I asked early to init this field) > + close(qmp->payload_fd); > + qmp->payload_fd = -1; > + > + o = libxl__json_map_get("fdset-id", response, JSON_INTEGER); > + if (!o) { > + rc = ERROR_FAIL; > + goto out; > + } > + fdset = libxl__json_object_get_integer(o); > + } else { > + fdset = cis->stubdom_fdset; > } > - fdset = libxl__json_object_get_integer(o); > > devid = libxl__device_disk_dev_number(disk->vdev, NULL, NULL); > qmp->callback = cdrom_insert_inserted; Thanks,
diff --git a/docs/misc/stubdom.txt b/docs/misc/stubdom.txt index c717a95d17..64c220db20 100644 --- a/docs/misc/stubdom.txt +++ b/docs/misc/stubdom.txt @@ -127,6 +127,16 @@ Limitations: - at most 26 emulated disks are supported (more are still available as PV disks) - graphics output (VNC/SDL/Spice) not supported +CD-ROM changing: + +To change the CD-ROM medium, libxl will: + - QMP eject the medium from QEMU + - block-detach the old PV disk + - block-attach the new PV disk + - QMP change the medium to the new PV disk by fdset-id + +The stubdom must internally add /dev/xvdc to an fdset in QEMU with opaque set +to "stub-devid:$devid". libxl will lookup the fdset with that string. PV-GRUB ======= diff --git a/tools/libs/light/libxl_disk.c b/tools/libs/light/libxl_disk.c index fa7856f28c..819d34933b 100644 --- a/tools/libs/light/libxl_disk.c +++ b/tools/libs/light/libxl_disk.c @@ -829,21 +829,122 @@ int libxl_device_disk_getinfo(libxl_ctx *ctx, uint32_t domid, return rc; } +/* + * Search through the query-fdsets JSON looking for a matching devid. + * + * If found, return the fdset-id integer (>=0). + * + * If not found, return ERROR_NOTFOUND. + * + * On error, return libxl ERROR_*. + */ +static int query_fdsets_find_fdset(libxl__gc *gc, + const libxl__json_object *response, + int devid) +{ + const libxl__json_object *fdset; + const char *needle = GCSPRINTF("stub-devid:%d", devid); + int i, j, rc; + + /* query-fdsets returns: + * [ + * { "fds": [ + * { "fd": 30, + * "opaque": "stub-devid:2080" + * } + * ], + * "fdset-id": 1 + * } + * ] + */ + for (i = 0; (fdset = libxl__json_array_get(response, i)); i++) { + const libxl__json_object *fdset_id; + const libxl__json_object *fds; + const libxl__json_object *fd; + + fdset_id = libxl__json_map_get("fdset-id", fdset, JSON_INTEGER); + if (!fdset_id) { + rc = ERROR_QEMU_API; + goto out; + } + LOG(DEBUG, "fdset-id=%lld", libxl__json_object_get_integer(fdset_id)); + + fds = libxl__json_map_get("fds", fdset, JSON_ARRAY); + if (!fds) { + rc = ERROR_QEMU_API; + goto out; + } + for (j = 0; (fd = libxl__json_array_get(fds, j)); j++) { + const libxl__json_object *fd_num; + const libxl__json_object *opaque; + const char *opaque_str; + + fd_num = libxl__json_map_get("fd", fd, JSON_INTEGER); + if (!fd_num) { + rc = ERROR_QEMU_API; + goto out; + } + opaque = libxl__json_map_get("opaque", fd, JSON_STRING); + if (!opaque) { + continue; + } + + opaque_str = libxl__json_object_get_string(opaque); + LOG(DEBUG, "fd=%lld opaque='%s'", + libxl__json_object_get_integer(fd_num), opaque_str); + if (strcmp(opaque_str, needle) == 0) { + return libxl__json_object_get_integer(fdset_id); + } + } + } + rc = ERROR_NOTFOUND; + + out: + return rc; +} + typedef struct { libxl__ao *ao; + libxl__ao_device aodev; libxl_domid domid; + libxl_domid disk_domid; libxl_device_disk *disk; libxl_device_disk disk_saved; libxl__ev_slowlock qmp_lock; int dm_ver; libxl__ev_time time; + libxl__ev_time timeout_retry; libxl__ev_qmp qmp; + int retries; + int stubdom_fdset; } libxl__cdrom_insert_state; static void cdrom_insert_lock_acquired(libxl__egc *, libxl__ev_slowlock *, int rc); static void cdrom_insert_qmp_connected(libxl__egc *, libxl__ev_qmp *, const libxl__json_object *, int rc); +static void cdrom_insert_stubdom_query_fdset_rm(libxl__egc *egc, + libxl__ev_qmp *qmp, + const libxl__json_object *resp, + int rc); +static void cdrom_insert_stubdom_parse_fdset_rm(libxl__egc *egc, + libxl__ev_qmp *qmp, + const libxl__json_object *resp, + int rc); +static void cdrom_insert_stubdom_ejected(libxl__egc *egc, libxl__ev_qmp *, + const libxl__json_object *, int rc); +static void cdrom_insert_stubdom_disk_remove_cb(libxl__egc *egc, + libxl__ao_device *aodev); +static void cdrom_insert_stubdom_disk_add_cb(libxl__egc *egc, + libxl__ao_device *aodev); +static void cdrom_insert_stubdom_query_fdset(libxl__egc *egc, + libxl__ev_time *ev, + const struct timeval *abs, + int rc); +static void cdrom_insert_stubdom_parse_fdset(libxl__egc *egc, + libxl__ev_qmp *qmp, + const libxl__json_object *response, + int rc); static void cdrom_insert_ejected(libxl__egc *egc, libxl__ev_qmp *, const libxl__json_object *, int rc); static void cdrom_insert_addfd_cb(libxl__egc *egc, libxl__ev_qmp *, @@ -865,6 +966,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk, libxl_device_disk *disks = NULL; int rc; libxl__cdrom_insert_state *cis; + libxl_domid stubdomid; GCNEW(cis); cis->ao = ao; @@ -876,6 +978,8 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk, cis->qmp_lock.ao = ao; cis->qmp_lock.domid = domid; libxl__ev_time_init(&cis->time); + libxl__ev_time_init(&cis->timeout_retry); + cis->retries = 0; libxl__ev_qmp_init(&cis->qmp); cis->qmp.ao = ao; cis->qmp.domid = domid; @@ -892,12 +996,6 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk, goto out; } - if (libxl_get_stubdom_id(ctx, domid) != 0) { - LOGD(ERROR, domid, "cdrom-insert doesn't work for stub domains"); - rc = ERROR_INVAL; - goto out; - } - cis->dm_ver = libxl__device_model_version_running(gc, domid); if (cis->dm_ver == -1) { LOGD(ERROR, domid, "Cannot determine device model version"); @@ -905,7 +1003,22 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk, goto out; } - disks = libxl__device_list(gc, &libxl__disk_devtype, domid, &num); + stubdomid = libxl_get_stubdom_id(CTX, cis->domid); + if (stubdomid == 0) { + cis->disk_domid = domid; + } else { + cis->disk_domid = stubdomid; + disk->backend = LIBXL_DISK_BACKEND_PHY; + } + + if (cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN_TRADITIONAL && + stubdomid) { + LOGD(ERROR, domid, "cdrom-insert doesn't work for Mini-OS stubdoms"); + rc = ERROR_INVAL; + goto out; + } + + disks = libxl__device_list(gc, &libxl__disk_devtype, cis->disk_domid, &num); for (i = 0; i < num; i++) { if (disks[i].is_cdrom && !strcmp(disk->vdev, disks[i].vdev)) { @@ -920,7 +1033,7 @@ int libxl_cdrom_insert(libxl_ctx *ctx, uint32_t domid, libxl_device_disk *disk, goto out; } - rc = libxl__device_disk_setdefault(gc, domid, disk, false); + rc = libxl__device_disk_setdefault(gc, cis->disk_domid, disk, false); if (rc) goto out; if (!disk->pdev_path) { @@ -994,7 +1107,12 @@ static void cdrom_insert_qmp_connected(libxl__egc *egc, libxl__ev_qmp *qmp, QMP_PARAMETERS_SPRINTF(&args, "id", "ide-%i", devid); else QMP_PARAMETERS_SPRINTF(&args, "device", "ide-%i", devid); - qmp->callback = cdrom_insert_ejected; + + if (libxl_get_stubdom_id(CTX, cis->domid)) + qmp->callback = cdrom_insert_stubdom_query_fdset_rm; + else + qmp->callback = cdrom_insert_ejected; + rc = libxl__ev_qmp_send(egc, qmp, "eject", args); if (rc) goto out; return; @@ -1002,6 +1120,224 @@ out: cdrom_insert_done(egc, cis, rc); /* must be last */ } +static void cdrom_insert_stubdom_query_fdset_rm(libxl__egc *egc, + libxl__ev_qmp *qmp, + const libxl__json_object *resp, + int rc) +{ + libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp); + STATE_AO_GC(cis->ao); + + if (rc) goto out; + + /* Only called for qemu-xen/linux stubdom. */ + assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN); + + cis->qmp.callback = cdrom_insert_stubdom_parse_fdset_rm; + + rc = libxl__ev_qmp_send(egc, &cis->qmp, "query-fdsets", NULL); + if (rc) goto out; + + return; + + out: + cdrom_insert_done(egc, cis, rc); /* must be last */ +} + +static void cdrom_insert_stubdom_parse_fdset_rm(libxl__egc *egc, + libxl__ev_qmp *qmp, + const libxl__json_object *resp, + int rc) +{ + EGC_GC; + libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp); + int devid; + int fdset; + + if (rc) goto out; + + /* Only called for qemu-xen/linux stubdom. */ + assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN); + + devid = libxl__device_disk_dev_number(cis->disk->vdev, NULL, NULL); + fdset = query_fdsets_find_fdset(gc, resp, devid); + if (fdset >= 0) goto found; + if (fdset != ERROR_NOTFOUND) { + rc = fdset; + goto out; + } + + LOGD(DEBUG, cis->domid, "No fdset found - skipping remove-fd"); + cdrom_insert_stubdom_ejected(egc, qmp, resp, 0); + + return; + + found: + cis->stubdom_fdset = fdset; + + LOGD(DEBUG, cis->domid, "Found fdset %d", cis->stubdom_fdset); + + libxl__json_object *args = NULL; + + libxl__qmp_param_add_integer(gc, &args, "fdset-id", cis->stubdom_fdset); + + cis->qmp.callback = cdrom_insert_stubdom_ejected; + + rc = libxl__ev_qmp_send(egc, &cis->qmp, "remove-fd", args); + if (rc) goto out; + + return; + + out: + cdrom_insert_done(egc, cis, rc); /* must be last */ +} + + +static void cdrom_insert_stubdom_ejected(libxl__egc *egc, libxl__ev_qmp *qmp, + const libxl__json_object *response, + int rc) +{ + libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp); + libxl__device *device; + STATE_AO_GC(cis->ao); + domid_t stubdomid = cis->disk_domid; + + if (rc) goto out; + + GCNEW(device); + rc = libxl__device_from_disk(gc, stubdomid, cis->disk, device); + if (rc) goto out; + + /* stubdom PV block dev eject */ + libxl__prepare_ao_device(ao, &cis->aodev); + cis->aodev.action = LIBXL__DEVICE_ACTION_REMOVE; + cis->aodev.dev = device; + cis->aodev.callback = cdrom_insert_stubdom_disk_remove_cb; + cis->aodev.force.flag = LIBXL__FORCE_OFF; + libxl__initiate_device_generic_remove(egc, &cis->aodev); + return; + + out: + cdrom_insert_done(egc, cis, rc); /* must be last */ +} + +static void cdrom_insert_stubdom_disk_remove_cb(libxl__egc *egc, + libxl__ao_device *aodev) +{ + STATE_AO_GC(aodev->ao); + libxl__cdrom_insert_state *cis = CONTAINER_OF(aodev, *cis, aodev); + domid_t stubdomid = cis->disk_domid; + + if (aodev->rc) { + LOGD(ERROR, aodev->dev->domid, "Unable to remove stubdom PV disk id %u", + aodev->dev->devid); + goto out; + } + + /* block dev insert - this may be inserting an empty disk for eject. */ + libxl__prepare_ao_device(ao, &cis->aodev); + /* set an ao callback to end up in cdrom_insert_ejected */ + cis->aodev.callback = cdrom_insert_stubdom_disk_add_cb; + libxl__device_disk_add(egc, stubdomid, cis->disk, &cis->aodev); + + return; + + out: + cdrom_insert_done(egc, cis, aodev->rc); /* must be last */ +} + +static void cdrom_insert_stubdom_disk_add_cb(libxl__egc *egc, + libxl__ao_device *aodev) +{ + EGC_GC; + libxl__cdrom_insert_state *cis = CONTAINER_OF(aodev, *cis, aodev); + + if (aodev->rc) { + LOGD(ERROR, aodev->dev->domid, "Unable to insert stubdom PV disk id %u", + aodev->dev->devid); + goto out; + } + + cdrom_insert_stubdom_query_fdset(egc, &cis->timeout_retry, NULL, aodev->rc); + return; + + out: + cdrom_insert_done(egc, cis, aodev->rc); +} + +static void cdrom_insert_stubdom_query_fdset(libxl__egc *egc, + libxl__ev_time *ev, + const struct timeval *abs, + int rc) +{ + EGC_GC; + libxl__cdrom_insert_state *cis = CONTAINER_OF(ev, *cis, timeout_retry); + + /* When called as an ev_time callback, rc will be ERROR_TIMEDOUT.*/ + if (rc && rc != ERROR_TIMEDOUT) goto out; + + /* Only called for qemu-xen/linux stubdom. */ + assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN); + + cis->qmp.callback = cdrom_insert_stubdom_parse_fdset; + + rc = libxl__ev_qmp_send(egc, &cis->qmp, "query-fdsets", NULL); + if (rc) goto out; + + return; + + out: + cdrom_insert_done(egc, cis, rc); /* must be last */ +} + +static void cdrom_insert_stubdom_parse_fdset(libxl__egc *egc, + libxl__ev_qmp *qmp, + const libxl__json_object *response, + int rc) +{ + EGC_GC; + libxl__cdrom_insert_state *cis = CONTAINER_OF(qmp, *cis, qmp); + int devid; + int fdset; + + if (rc) goto out; + + /* Only called for qemu-xen/linux stubdom. */ + assert(cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN); + + devid = libxl__device_disk_dev_number(cis->disk->vdev, NULL, NULL); + fdset = query_fdsets_find_fdset(gc, response, devid); + if (fdset >= 0) goto found; + if (fdset != ERROR_NOTFOUND) { + rc = fdset; + goto out; + } + + if (cis->retries++ > 10) { + LOGD(DEBUG, cis->domid, "Out of query-fdsets retries"); + rc = ERROR_TIMEDOUT; + goto out; + } + + LOGD(DEBUG, cis->domid, "Scheduling query-fdsets retry %d", cis->retries); + rc = libxl__ev_time_register_rel(cis->ao, &cis->timeout_retry, + cdrom_insert_stubdom_query_fdset, + 200); + if (rc) goto out; + + return; + + found: + cis->stubdom_fdset = fdset; + + LOGD(DEBUG, cis->domid, "Found fdset %d", cis->stubdom_fdset); + cdrom_insert_ejected(egc, &cis->qmp, NULL, rc); + return; + + out: + cdrom_insert_done(egc, cis, rc); /* must be last */ +} + static void cdrom_insert_ejected(libxl__egc *egc, libxl__ev_qmp *qmp, const libxl__json_object *response, @@ -1026,7 +1362,7 @@ static void cdrom_insert_ejected(libxl__egc *egc, if (rc) goto out; - rc = libxl__device_from_disk(gc, domid, disk, &device); + rc = libxl__device_from_disk(gc, cis->disk_domid, disk, &device); if (rc) goto out; be_path = libxl__device_backend_path(gc, &device); libxl_path = libxl__device_libxl_path(gc, &device); @@ -1081,10 +1417,13 @@ static void cdrom_insert_ejected(libxl__egc *egc, rc = libxl__dm_check_start(gc, &d_config, domid); if (rc) goto out; + /* A linux stubdom will perform add-fd with calculated stubdom_fdset. */ if (cis->dm_ver == LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN && + libxl_get_stubdom_id(CTX, cis->domid) == 0 && disk->format != LIBXL_DISK_FORMAT_EMPTY) { libxl__json_object *args = NULL; + LOGD(DEBUG, cis->domid, "Doing QMP add-fd path"); assert(qmp->payload_fd == -1); qmp->payload_fd = open(disk->pdev_path, O_RDONLY); if (qmp->payload_fd < 0) { @@ -1094,7 +1433,11 @@ static void cdrom_insert_ejected(libxl__egc *egc, goto out; } - /* This free form parameter is not use by QEMU or libxl. */ + /* + * This free form parameter is not used by QEMU or non-stubdom libxl. + * For a linux stubdom, opaque is set to "stub-devid:$devid" to + * identify the fdset. + */ QMP_PARAMETERS_SPRINTF(&args, "opaque", "%s:%s", libxl_disk_format_to_string(disk->format), disk->pdev_path); @@ -1103,6 +1446,7 @@ static void cdrom_insert_ejected(libxl__egc *egc, if (rc) goto out; has_callback = true; } else { + LOGD(DEBUG, cis->domid, "Skipping QMP add-fd path"); has_callback = false; } @@ -1115,8 +1459,16 @@ out: if (rc) { cdrom_insert_done(egc, cis, rc); /* must be last */ } else if (!has_callback) { - /* Only called if no asynchronous callback are set. */ - cdrom_insert_inserted(egc, qmp, NULL, 0); /* must be last */ + if (libxl_get_stubdom_id(CTX, cis->domid) && + disk->format != LIBXL_DISK_FORMAT_EMPTY) { + LOGD(DEBUG, cis->domid, + "stubdom %d needs to perform add-fd internally", + libxl_get_stubdom_id(CTX, cis->domid)); + cdrom_insert_addfd_cb(egc, qmp, NULL, 0); /* must be last */ + } else { + /* Only called if no asynchronous callback are set. */ + cdrom_insert_inserted(egc, qmp, NULL, 0); /* must be last */ + } } } @@ -1135,17 +1487,22 @@ static void cdrom_insert_addfd_cb(libxl__egc *egc, /* convenience aliases */ libxl_device_disk *disk = cis->disk; - close(qmp->payload_fd); - qmp->payload_fd = -1; - if (rc) goto out; - o = libxl__json_map_get("fdset-id", response, JSON_INTEGER); - if (!o) { - rc = ERROR_FAIL; - goto out; + /* response non-NULL only for non-stubdom */ + if (response) { + close(qmp->payload_fd); + qmp->payload_fd = -1; + + o = libxl__json_map_get("fdset-id", response, JSON_INTEGER); + if (!o) { + rc = ERROR_FAIL; + goto out; + } + fdset = libxl__json_object_get_integer(o); + } else { + fdset = cis->stubdom_fdset; } - fdset = libxl__json_object_get_integer(o); devid = libxl__device_disk_dev_number(disk->vdev, NULL, NULL); qmp->callback = cdrom_insert_inserted; @@ -1158,8 +1515,13 @@ static void cdrom_insert_addfd_cb(libxl__egc *egc, if (libxl__qmp_ev_qemu_compare_version(qmp, 2, 8, 0) >= 0) { QMP_PARAMETERS_SPRINTF(&args, "id", "ide-%i", devid); QMP_PARAMETERS_SPRINTF(&args, "filename", "/dev/fdset/%d", fdset); - libxl__qmp_param_add_string(gc, &args, "format", - libxl__qemu_disk_format_string(disk->format)); + if (response) { + libxl__qmp_param_add_string(gc, &args, "format", + libxl__qemu_disk_format_string(disk->format)); + } else { + /* Stubdom is using blockdev /dev/xvd* */ + libxl__qmp_param_add_string(gc, &args, "format", "host_device"); + } rc = libxl__ev_qmp_send(egc, qmp, "blockdev-change-medium", args); } else { QMP_PARAMETERS_SPRINTF(&args, "device", "ide-%i", devid); @@ -1196,7 +1558,7 @@ static void cdrom_insert_inserted(libxl__egc *egc, if (rc) goto out; - rc = libxl__device_from_disk(gc, domid, disk, &device); + rc = libxl__device_from_disk(gc, cis->disk_domid, disk, &device); if (rc) goto out; be_path = libxl__device_backend_path(gc, &device); libxl_path = libxl__device_libxl_path(gc, &device); @@ -1281,6 +1643,7 @@ static void cdrom_insert_done(libxl__egc *egc, EGC_GC; libxl__ev_time_deregister(gc, &cis->time); + libxl__ev_time_deregister(gc, &cis->timeout_retry); libxl__ev_qmp_dispose(gc, &cis->qmp); if (cis->qmp.payload_fd >= 0) close(cis->qmp.payload_fd); libxl__ev_slowlock_unlock(gc, &cis->qmp_lock);