Message ID | 20231016151909.22133-12-dwmw2@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Get Xen PV shim running in qemu | expand |
Am 16.10.2023 um 17:19 hat David Woodhouse geschrieben: > From: David Woodhouse <dwmw@amazon.co.uk> > > There's no need to force the user to assign a vdev. We can automatically > assign one, starting at xvda and searching until we find the first disk > name that's unused. > > This means we can now allow '-drive if=xen,file=xxx' to work without an > explicit separate -driver argument, just like if=virtio. > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > @@ -34,6 +34,31 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp) > XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev); > XenBlockVdev *vdev = &blockdev->props.vdev; > > + if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) { > + char name[11]; > + int disk = 0; > + unsigned long idx; > + > + /* Find an unoccupied device name */ > + while (disk < (1 << 20)) { I like your optimism that we can handle a million disks. :-) I haven't reviewed the Xen part in detail, but the patch looks fine on the block layer side. Acked-by: Kevin Wolf <kwolf@redhat.com>
On Tue, 2023-10-17 at 12:21 +0200, Kevin Wolf wrote: > Am 16.10.2023 um 17:19 hat David Woodhouse geschrieben: > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > There's no need to force the user to assign a vdev. We can automatically > > assign one, starting at xvda and searching until we find the first disk > > name that's unused. > > > > This means we can now allow '-drive if=xen,file=xxx' to work without an > > explicit separate -driver argument, just like if=virtio. > > > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > > > @@ -34,6 +34,31 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp) > > XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev); > > XenBlockVdev *vdev = &blockdev->props.vdev; > > > > + if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) { > > + char name[11]; > > + int disk = 0; > > + unsigned long idx; > > + > > + /* Find an unoccupied device name */ > > + while (disk < (1 << 20)) { > > I like your optimism that we can handle a million disks. :-) Heh, yeah. Given that there *is* a limit, setting it lower seemed a bit arbitrary. For consoles I picked 100 instead of letting it go all the way to INT_MAX, and in a patch set soon to be posted I'll do the same for the xen-net-device as I convert it. Even with a limit of 100, having that many devices *WITHOUT SPECIFYING WHICH ONE IS WHICH* seems a bit many! FWIW I've changed it to check for the existence of the *frontend* nodes (the ones which are visible to the guest). Currently it checks for the backend nodes, but those might be in different places. > I haven't reviewed the Xen part in detail, but the patch looks fine on > the block layer side. > > Acked-by: Kevin Wolf <kwolf@redhat.com> Thanks.
On Mon, 16 Oct 2023 16:19:08 +0100 David Woodhouse <dwmw2@infradead.org> wrote: > From: David Woodhouse <dwmw@amazon.co.uk> > is this index a user (guest) visible? > There's no need to force the user to assign a vdev. We can automatically > assign one, starting at xvda and searching until we find the first disk > name that's unused. > > This means we can now allow '-drive if=xen,file=xxx' to work without an > explicit separate -driver argument, just like if=virtio. > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > --- > blockdev.c | 15 ++++++++++++--- > hw/block/xen-block.c | 25 +++++++++++++++++++++++++ > 2 files changed, 37 insertions(+), 3 deletions(-) > > diff --git a/blockdev.c b/blockdev.c > index 325b7a3bef..9dec4c9c74 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -255,13 +255,13 @@ void drive_check_orphaned(void) > * Ignore default drives, because we create certain default > * drives unconditionally, then leave them unclaimed. Not the > * users fault. > - * Ignore IF_VIRTIO, because it gets desugared into -device, > - * so we can leave failing to -device. > + * Ignore IF_VIRTIO or IF_XEN, because it gets desugared into > + * -device, so we can leave failing to -device. > * Ignore IF_NONE, because leaving unclaimed IF_NONE remains > * available for device_add is a feature. > */ > if (dinfo->is_default || dinfo->type == IF_VIRTIO > - || dinfo->type == IF_NONE) { > + || dinfo->type == IF_XEN || dinfo->type == IF_NONE) { > continue; > } > if (!blk_get_attached_dev(blk)) { > @@ -977,6 +977,15 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type, > qemu_opt_set(devopts, "driver", "virtio-blk", &error_abort); > qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"), > &error_abort); > + } else if (type == IF_XEN) { > + QemuOpts *devopts; > + devopts = qemu_opts_create(qemu_find_opts("device"), NULL, 0, > + &error_abort); > + qemu_opt_set(devopts, "driver", > + (media == MEDIA_CDROM) ? "xen-cdrom" : "xen-disk", > + &error_abort); > + qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"), > + &error_abort); > } > > filename = qemu_opt_get(legacy_opts, "file"); > diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c > index 9262338535..20fa783cbe 100644 > --- a/hw/block/xen-block.c > +++ b/hw/block/xen-block.c > @@ -34,6 +34,31 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp) > XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev); > XenBlockVdev *vdev = &blockdev->props.vdev; > > + if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) { > + char name[11]; > + int disk = 0; > + unsigned long idx; > + > + /* Find an unoccupied device name */ > + while (disk < (1 << 20)) { > + if (disk < (1 << 4)) { > + idx = (202 << 8) | (disk << 4); > + } else { > + idx = (1 << 28) | (disk << 8); > + } > + snprintf(name, sizeof(name), "%lu", idx); > + if (!xen_backend_exists("qdisk", name)) { > + vdev->type = XEN_BLOCK_VDEV_TYPE_XVD; > + vdev->partition = 0; > + vdev->disk = disk; > + vdev->number = idx; > + return g_strdup(name); > + } > + disk++; > + } > + error_setg(errp, "cannot find device vdev for block device"); > + return NULL; > + } > return g_strdup_printf("%lu", vdev->number); > } >
On Wed, 2023-10-18 at 09:32 +0200, Igor Mammedov wrote: > On Mon, 16 Oct 2023 16:19:08 +0100 > David Woodhouse <dwmw2@infradead.org> wrote: > > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > is this index a user (guest) visible? Yes. It defines what block device (e.g. /dev/xvda) the disk appears as in the guest. In the common case, it literally encodes the Linux major/minor numbers. So xvda (major 202) is 0xca00, xvdb is 0xca10 etc. Previously we had to explicitly set it for each disk in Qemu: -drive file=disk1.img,id=drv1 -device xen-disk,drive=drv1,vdev=xvda -drive file=disk2.img,id=drv2 -device xen-disk,drive=drv2,vdev=xvdb Now we can just do -drive file=disk1.img,if=xen -drive file-disk2.img,if=xen (We could go further and make if=xen the default for Xen guests too, but that doesn't work right now because xen-block will barf on the default provided CD-ROM when it's empty. It doesn't handle empty devices. And if I work around that, then `-hda disk1.img` would work on the command line... but would make it appear as /dev/xvda instead of /dev/hda, and I don't know how I feel about that.) [root@localhost ~]# xenstore-ls -f device/vbd device/vbd = "" device/vbd/51712 = "" device/vbd/51712/backend = "/local/domain/0/backend/qdisk/1/51712" device/vbd/51712/backend-id = "0" device/vbd/51712/device-type = "disk" device/vbd/51712/event-channel = "8" device/vbd/51712/feature-persistent = "1" device/vbd/51712/protocol = "x86_64-abi" device/vbd/51712/ring-ref = "8" device/vbd/51712/state = "4" device/vbd/51712/virtual-device = "51712" > > > There's no need to force the user to assign a vdev. We can automatically > > assign one, starting at xvda and searching until we find the first disk > > name that's unused. > > > > This means we can now allow '-drive if=xen,file=xxx' to work without an > > explicit separate -driver argument, just like if=virtio. > > > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > > --- > > blockdev.c | 15 ++++++++++++--- > > hw/block/xen-block.c | 25 +++++++++++++++++++++++++ > > 2 files changed, 37 insertions(+), 3 deletions(-) > > > > diff --git a/blockdev.c b/blockdev.c > > index 325b7a3bef..9dec4c9c74 100644 > > --- a/blockdev.c > > +++ b/blockdev.c > > @@ -255,13 +255,13 @@ void drive_check_orphaned(void) > > * Ignore default drives, because we create certain default > > * drives unconditionally, then leave them unclaimed. Not the > > * users fault. > > - * Ignore IF_VIRTIO, because it gets desugared into -device, > > - * so we can leave failing to -device. > > + * Ignore IF_VIRTIO or IF_XEN, because it gets desugared into > > + * -device, so we can leave failing to -device. > > * Ignore IF_NONE, because leaving unclaimed IF_NONE remains > > * available for device_add is a feature. > > */ > > if (dinfo->is_default || dinfo->type == IF_VIRTIO > > - || dinfo->type == IF_NONE) { > > + || dinfo->type == IF_XEN || dinfo->type == IF_NONE) { > > continue; > > } > > if (!blk_get_attached_dev(blk)) { > > @@ -977,6 +977,15 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type, > > qemu_opt_set(devopts, "driver", "virtio-blk", &error_abort); > > qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"), > > &error_abort); > > + } else if (type == IF_XEN) { > > + QemuOpts *devopts; > > + devopts = qemu_opts_create(qemu_find_opts("device"), NULL, 0, > > + &error_abort); > > + qemu_opt_set(devopts, "driver", > > + (media == MEDIA_CDROM) ? "xen-cdrom" : "xen-disk", > > + &error_abort); > > + qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"), > > + &error_abort); > > } > > > > filename = qemu_opt_get(legacy_opts, "file"); > > diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c > > index 9262338535..20fa783cbe 100644 > > --- a/hw/block/xen-block.c > > +++ b/hw/block/xen-block.c > > @@ -34,6 +34,31 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp) > > XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev); > > XenBlockVdev *vdev = &blockdev->props.vdev; > > > > + if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) { > > + char name[11]; > > + int disk = 0; > > + unsigned long idx; > > + > > + /* Find an unoccupied device name */ > > + while (disk < (1 << 20)) { > > + if (disk < (1 << 4)) { > > + idx = (202 << 8) | (disk << 4); > > + } else { > > + idx = (1 << 28) | (disk << 8); > > + } > > + snprintf(name, sizeof(name), "%lu", idx); > > + if (!xen_backend_exists("qdisk", name)) { > > + vdev->type = XEN_BLOCK_VDEV_TYPE_XVD; > > + vdev->partition = 0; > > + vdev->disk = disk; > > + vdev->number = idx; > > + return g_strdup(name); > > + } > > + disk++; > > + } > > + error_setg(errp, "cannot find device vdev for block device"); > > + return NULL; > > + } > > return g_strdup_printf("%lu", vdev->number); > > } > > > >
Am 16.10.2023 um 17:19 hat David Woodhouse geschrieben: > From: David Woodhouse <dwmw@amazon.co.uk> > > There's no need to force the user to assign a vdev. We can automatically > assign one, starting at xvda and searching until we find the first disk > name that's unused. > > This means we can now allow '-drive if=xen,file=xxx' to work without an > explicit separate -driver argument, just like if=virtio. > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Actually, how does this play together with xen_config_dev_blk()? This looks like it tried to implement a very similar thing (which is IF_XEN even already existed). Are we now trying to attach each if=xen disk twice in the 'xenpv' machine? Or if something prevents this, is it dead code? I think in both cases, we would want to delete that function and remove the loop that calls it in xen_init_pv()? Kevin
On Wed, 2023-10-18 at 10:52 +0200, Kevin Wolf wrote: > Am 16.10.2023 um 17:19 hat David Woodhouse geschrieben: > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > There's no need to force the user to assign a vdev. We can automatically > > assign one, starting at xvda and searching until we find the first disk > > name that's unused. > > > > This means we can now allow '-drive if=xen,file=xxx' to work without an > > explicit separate -driver argument, just like if=virtio. > > > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > > Actually, how does this play together with xen_config_dev_blk()? This > looks like it tried to implement a very similar thing (which is IF_XEN > even already existed). Ah yes, thanks for spotting that! I hadn't been looking at the xenfv > Are we now trying to attach each if=xen disk twice in the 'xenpv' > machine? Or if something prevents this, is it dead code. I suspect we end up creating them twice (and probably thus failing to lock the backing file). The old Xen PV device drivers in Qemu, before Paul's XenDevice conversion, were purely reactive. If they see nodes in XenStore describing a backend which they should implement, they'd do so. The code in xen_config_dev_blk() is *creating* those nodes for the frontend (guest) and backend (host/qemu) to see, which prompts the xen-block and xen-net drivers into action. In the new XenDevice model, we can just instantiate a device (e.g. qdev_new(TYPE_XEN_DISK_DEVICE) and its realize() method will create the corresponding XenStore nodes (with some help from the generic XenBus code for the common ones). The new XenDevice/XenBus model will *also* react to nodes which it discovers in XenStore, and instantiate the corresponding devices. So I suspect what'll happen is xen_config_dev_blk() will create the nodes starting at xvda (int vdev = 202 * 256 + 16 * disk->unit), and later my new code will create a new set starting from xvdb or wherever the next free one is. > I think in both cases, we would want to delete that function and remove > the loop that calls it in xen_init_pv()? Yes, I think so. The xen_config_dev_blk() one can just die, as can xen_config_dev_console() which isn't called from anywhere anyway. The vkbd/vfb ones can stay until/unless those drivers are converted to the new XenDevice model. And xen_config_dev_nic() probably just needs to loop doing the same as I did in pc_init_nic() in https://lore.kernel.org/qemu-devel/20231017182545.97973-5-dwmw2@infradead.org/T/#u + if (xen_bus && (!nd->model || g_str_equal(model, "xen-net-device"))) { + DeviceState *dev = qdev_new("xen-net-device"); + qdev_set_nic_properties(dev, nd); + qdev_realize_and_unref(dev, xen_bus, &error_fatal); ... but this just reinforces what I said there about "if qmp_device_add() can find the damn bus and do this right, why do we have to litter it through platform code?"
On Wed, 2023-10-18 at 10:52 +0200, Kevin Wolf wrote: > Am 16.10.2023 um 17:19 hat David Woodhouse geschrieben: > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > There's no need to force the user to assign a vdev. We can automatically > > assign one, starting at xvda and searching until we find the first disk > > name that's unused. > > > > This means we can now allow '-drive if=xen,file=xxx' to work without an > > explicit separate -driver argument, just like if=virtio. > > > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > > Actually, how does this play together with xen_config_dev_blk()? This > looks like it tried to implement a very similar thing (which is IF_XEN > even already existed). > > Are we now trying to attach each if=xen disk twice in the 'xenpv' > machine? Or if something prevents this, is it dead code? > > I think in both cases, we would want to delete that function and remove > the loop that calls it in xen_init_pv()? I tested this with an xl config which looks a bit like this... disk = [ "backendtype=qdisk,format=qcow2,vdev=xvda,access=rw,target=/var/lib/libvirt/images/fedora28.qcow2" ] device_model_override = "/home/dwmw2/git/qemu/build/qemu-system-x86_64" device_model_version = "qemu-xen" device_model_args = [ "-trace","xen*","-chardev","pty,id=mon","-mon","mon","-drive","file=/var/lib/libvirt/images/solaris11.qcow2,format=qcow2,if=xen","-nic","user,model=xen" ] The code in xen_config_dev_blk() scribbles over the disk that the toolstack has configured for me, and adds that qcow2 file from the '-drive' option on the command line, but in *raw* mode. Then the new xen-disk support kicks in and finds a free vdev, and adds the -drive properly /dev/xvdb (as qcow2). So v2 of this patch will just rip out xen_config_dev_blk() as you suggest.
Am 18.10.2023 um 12:52 hat David Woodhouse geschrieben: > > Actually, how does this play together with xen_config_dev_blk()? This > > looks like it tried to implement a very similar thing (which is IF_XEN > > even already existed). > > Ah yes, thanks for spotting that! I hadn't been looking at the xenfv > > > Are we now trying to attach each if=xen disk twice in the 'xenpv' > > machine? Or if something prevents this, is it dead code. > > I suspect we end up creating them twice (and probably thus failing to > lock the backing file). > > [...] > > ... but this just reinforces what I said there about "if > qmp_device_add() can find the damn bus and do this right, why do we > have to litter it through platform code?" Indeed, if you can do -device, it's always the best option. For block devices not the least because it gives you a way to use -blockdev with it. I'm happy whenever I see a drive_get() call go away. Kevin
On Wed, 2023-10-18 at 11:52 +0100, David Woodhouse wrote: > > And xen_config_dev_nic() probably just needs to loop doing the same > as > I did in pc_init_nic() in > https://lore.kernel.org/qemu-devel/20231017182545.97973-5-dwmw2@infradead.org/T/#u > > + if (xen_bus && (!nd->model || g_str_equal(model, "xen-net- > device"))) { > + DeviceState *dev = qdev_new("xen-net-device"); > + qdev_set_nic_properties(dev, nd); > + qdev_realize_and_unref(dev, xen_bus, &error_fatal); > > > ... but this just reinforces what I said there about "if > qmp_device_add() can find the damn bus and do this right, why do we > have to litter it through platform code?" I had a look through the network setup. There are a bunch of platforms adding specific devices to their own internal system bus, which often use nd_table[] directly. Sometimes they do so whether it's been set up or now. They can mostly be divided into two camps. Some of them will create their NIC anyway, and will use a matching -nic configuration if it exists. Others will only create their NIC if a corresponding -nic configuration does exist. This is fairly random, and perhaps platforms should be more consistent, but it's best to avoid user-visible changes as much as possible while doing the cleanup, so I've kept them as they were. I've created qemu_configure_nic_device() and qemu_create_nic_device() functions for those two use cases respectively, and most code which directly accesses nd_table[] can be converted to those fairly simply: https://git.infradead.org/users/dwmw2/qemu.git/commitdiff/7b4fb6fc10a4 It means I can throw away the horrid parts of -nic support for the Xen network device, which were in the pc and xenfv platforms, and replace it with a trivial loop in xenbus_init(): + /* This is a bus-generic "configure all NICs on this bus type" */ + while ((qnic = qemu_create_nic_device("xen-net-device", true, "xen"))) { + qdev_realize_and_unref(qnic, bus, &error_fatal); Other than that one (which is cheating because there's only one type of network device that can be instantiated on the XenBus), the only remaining case is PCI. Most platforms just iterate over the -nic configurations adding devices to a PCI bus of the platform's choice. In some cases there's a special case, the first one goes at a specific devfn and/or on a different bus. There was a little more variation here... for example fuloong2e would put the first nic (nd_table[0]) in slot 7 only if it's an rtl8139 (if the model is unspecified). But PReP would put the first PCI NIC in slot 3 *regardless* of whether it's the expected PCNET or not. I didn't faithfully preserve the behaviour there, because I don't think it matters. They mostly look like this now (e.g. hw/sh4/r2d): + nd = qemu_find_nic_info(mc->default_nic, true, NULL); + if (nd) { + pci_nic_init_nofail(nd, pci_bus, mc->default_nic, "2"); + } + pci_init_nic_devices(pci_bus, mc->default_nic); So they'll take the first NIC configuration which is of the expected model (again, or unspecified model) and place that in the special slot, and then put the rest of the devices wherever they land. For the change in behaviour to *matter*, the user would have to explicitly specify a NIC of the *non-default* type first, and then a NIC of the default type. My new code will put the default-type NIC in the "right place", while the old code mostly wouldn't. I think that's an OK change to make. My plan is to *remember* which NIC models are used for lookups during the board in, so that qemu_show_nic_devices() can print them all at the end. Current WIP if anyone wants to take a quick look at https://git.infradead.org/users/dwmw2/qemu.git/shortlog/refs/heads/xenfv-net but the weekend arrived quicker than I'd hoped, so I haven't quite got it into shape to post yet. Hopefully it makes sense as an approach though?
On Wed, 18 Oct 2023 09:32:47 +0100 David Woodhouse <dwmw2@infradead.org> wrote: > On Wed, 2023-10-18 at 09:32 +0200, Igor Mammedov wrote: > > On Mon, 16 Oct 2023 16:19:08 +0100 > > David Woodhouse <dwmw2@infradead.org> wrote: > > > > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > > > > is this index a user (guest) visible? > > Yes. It defines what block device (e.g. /dev/xvda) the disk appears as > in the guest. In the common case, it literally encodes the Linux > major/minor numbers. So xvda (major 202) is 0xca00, xvdb is 0xca10 etc. that makes 'index' an implicit ABI and a subject to versioning when the way it's assigned changes (i.e. one has to use versioned machine types to keep older versions working the they used to). From what I remember it's discouraged to make QEMU invent various IDs that are part of ABI (guest or mgmt side). Instead it's preferred for mgmt side/user to provide that explicitly. Basically you are trading off manageability/simplicity at QEMU level with CLI usability for human user. I don't care much as long as it is hidden within xen code base, but maybe libvirt does. > Previously we had to explicitly set it for each disk in Qemu: > > -drive file=disk1.img,id=drv1 -device xen-disk,drive=drv1,vdev=xvda > -drive file=disk2.img,id=drv2 -device xen-disk,drive=drv2,vdev=xvdb > > Now we can just do > > -drive file=disk1.img,if=xen -drive file-disk2.img,if=xen > > (We could go further and make if=xen the default for Xen guests too, > but that doesn't work right now because xen-block will barf on the > default provided CD-ROM when it's empty. It doesn't handle empty > devices. And if I work around that, then `-hda disk1.img` would work on > the command line... but would make it appear as /dev/xvda instead of > /dev/hda, and I don't know how I feel about that.) > > [root@localhost ~]# xenstore-ls -f device/vbd > device/vbd = "" > device/vbd/51712 = "" > device/vbd/51712/backend = "/local/domain/0/backend/qdisk/1/51712" > device/vbd/51712/backend-id = "0" > device/vbd/51712/device-type = "disk" > device/vbd/51712/event-channel = "8" > device/vbd/51712/feature-persistent = "1" > device/vbd/51712/protocol = "x86_64-abi" > device/vbd/51712/ring-ref = "8" > device/vbd/51712/state = "4" > device/vbd/51712/virtual-device = "51712" > > > > > > There's no need to force the user to assign a vdev. We can automatically > > > assign one, starting at xvda and searching until we find the first disk > > > name that's unused. > > > > > > This means we can now allow '-drive if=xen,file=xxx' to work without an > > > explicit separate -driver argument, just like if=virtio. > > > > > > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> > > > --- > > > blockdev.c | 15 ++++++++++++--- > > > hw/block/xen-block.c | 25 +++++++++++++++++++++++++ > > > 2 files changed, 37 insertions(+), 3 deletions(-) > > > > > > diff --git a/blockdev.c b/blockdev.c > > > index 325b7a3bef..9dec4c9c74 100644 > > > --- a/blockdev.c > > > +++ b/blockdev.c > > > @@ -255,13 +255,13 @@ void drive_check_orphaned(void) > > > * Ignore default drives, because we create certain default > > > * drives unconditionally, then leave them unclaimed. Not the > > > * users fault. > > > - * Ignore IF_VIRTIO, because it gets desugared into -device, > > > - * so we can leave failing to -device. > > > + * Ignore IF_VIRTIO or IF_XEN, because it gets desugared into > > > + * -device, so we can leave failing to -device. > > > * Ignore IF_NONE, because leaving unclaimed IF_NONE remains > > > * available for device_add is a feature. > > > */ > > > if (dinfo->is_default || dinfo->type == IF_VIRTIO > > > - || dinfo->type == IF_NONE) { > > > + || dinfo->type == IF_XEN || dinfo->type == IF_NONE) { > > > continue; > > > } > > > if (!blk_get_attached_dev(blk)) { > > > @@ -977,6 +977,15 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type, > > > qemu_opt_set(devopts, "driver", "virtio-blk", &error_abort); > > > qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"), > > > &error_abort); > > > + } else if (type == IF_XEN) { > > > + QemuOpts *devopts; > > > + devopts = qemu_opts_create(qemu_find_opts("device"), NULL, 0, > > > + &error_abort); > > > + qemu_opt_set(devopts, "driver", > > > + (media == MEDIA_CDROM) ? "xen-cdrom" : "xen-disk", > > > + &error_abort); > > > + qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"), > > > + &error_abort); > > > } > > > > > > filename = qemu_opt_get(legacy_opts, "file"); > > > diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c > > > index 9262338535..20fa783cbe 100644 > > > --- a/hw/block/xen-block.c > > > +++ b/hw/block/xen-block.c > > > @@ -34,6 +34,31 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp) > > > XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev); > > > XenBlockVdev *vdev = &blockdev->props.vdev; > > > > > > + if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) { > > > + char name[11]; > > > + int disk = 0; > > > + unsigned long idx; > > > + > > > + /* Find an unoccupied device name */ > > > + while (disk < (1 << 20)) { > > > + if (disk < (1 << 4)) { > > > + idx = (202 << 8) | (disk << 4); > > > + } else { > > > + idx = (1 << 28) | (disk << 8); > > > + } > > > + snprintf(name, sizeof(name), "%lu", idx); > > > + if (!xen_backend_exists("qdisk", name)) { > > > + vdev->type = XEN_BLOCK_VDEV_TYPE_XVD; > > > + vdev->partition = 0; > > > + vdev->disk = disk; > > > + vdev->number = idx; > > > + return g_strdup(name); > > > + } > > > + disk++; > > > + } > > > + error_setg(errp, "cannot find device vdev for block device"); > > > + return NULL; > > > + } > > > return g_strdup_printf("%lu", vdev->number); > > > } > > > > > > > >
On 23 October 2023 10:30:02 BST, Igor Mammedov <imammedo@redhat.com> wrote: >On Wed, 18 Oct 2023 09:32:47 +0100 >David Woodhouse <dwmw2@infradead.org> wrote: > >> On Wed, 2023-10-18 at 09:32 +0200, Igor Mammedov wrote: >> > On Mon, 16 Oct 2023 16:19:08 +0100 >> > David Woodhouse <dwmw2@infradead.org> wrote: >> > >> > > From: David Woodhouse <dwmw@amazon.co.uk> >> > > >> > >> > is this index a user (guest) visible? >> >> Yes. It defines what block device (e.g. /dev/xvda) the disk appears as >> in the guest. In the common case, it literally encodes the Linux >> major/minor numbers. So xvda (major 202) is 0xca00, xvdb is 0xca10 etc. > >that makes 'index' an implicit ABI and a subject to versioning >when the way it's assigned changes (i.e. one has to use versioned >machine types to keep older versions working the they used to). > >From what I remember it's discouraged to make QEMU invent >various IDs that are part of ABI (guest or mgmt side). >Instead it's preferred for mgmt side/user to provide that explicitly. > >Basically you are trading off manageability/simplicity at QEMU >level with CLI usability for human user. >I don't care much as long as it is hidden within xen code base, >but maybe libvirt does. Well, it can still be set explicitly. So not so much a "trade-off" as adding the option for the user to choose the simple way. Yes, in a way it's an ABI, just like the dynamic assignment of PCI devfn for network devices added with "-nic". And I think also for virtio block devices too? And for the ISA ne2000. But it seems unlikely that we'll ever really want to change "the first one is xvda, the second is xvdb...."
Am 23.10.2023 um 11:30 hat Igor Mammedov geschrieben: > On Wed, 18 Oct 2023 09:32:47 +0100 > David Woodhouse <dwmw2@infradead.org> wrote: > > > On Wed, 2023-10-18 at 09:32 +0200, Igor Mammedov wrote: > > > On Mon, 16 Oct 2023 16:19:08 +0100 > > > David Woodhouse <dwmw2@infradead.org> wrote: > > > > > > > From: David Woodhouse <dwmw@amazon.co.uk> > > > > > > > > > > is this index a user (guest) visible? > > > > Yes. It defines what block device (e.g. /dev/xvda) the disk appears as > > in the guest. In the common case, it literally encodes the Linux > > major/minor numbers. So xvda (major 202) is 0xca00, xvdb is 0xca10 etc. > > that makes 'index' an implicit ABI and a subject to versioning > when the way it's assigned changes (i.e. one has to use versioned > machine types to keep older versions working the they used to). > > From what I remember it's discouraged to make QEMU invent > various IDs that are part of ABI (guest or mgmt side). > Instead it's preferred for mgmt side/user to provide that explicitly. > > Basically you are trading off manageability/simplicity at QEMU > level with CLI usability for human user. > I don't care much as long as it is hidden within xen code base, > but maybe libvirt does. -drive is mostly a convenience option for human users anyway. Management tools should use a combination of -blockdev and -device. Kevin
diff --git a/blockdev.c b/blockdev.c index 325b7a3bef..9dec4c9c74 100644 --- a/blockdev.c +++ b/blockdev.c @@ -255,13 +255,13 @@ void drive_check_orphaned(void) * Ignore default drives, because we create certain default * drives unconditionally, then leave them unclaimed. Not the * users fault. - * Ignore IF_VIRTIO, because it gets desugared into -device, - * so we can leave failing to -device. + * Ignore IF_VIRTIO or IF_XEN, because it gets desugared into + * -device, so we can leave failing to -device. * Ignore IF_NONE, because leaving unclaimed IF_NONE remains * available for device_add is a feature. */ if (dinfo->is_default || dinfo->type == IF_VIRTIO - || dinfo->type == IF_NONE) { + || dinfo->type == IF_XEN || dinfo->type == IF_NONE) { continue; } if (!blk_get_attached_dev(blk)) { @@ -977,6 +977,15 @@ DriveInfo *drive_new(QemuOpts *all_opts, BlockInterfaceType block_default_type, qemu_opt_set(devopts, "driver", "virtio-blk", &error_abort); qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"), &error_abort); + } else if (type == IF_XEN) { + QemuOpts *devopts; + devopts = qemu_opts_create(qemu_find_opts("device"), NULL, 0, + &error_abort); + qemu_opt_set(devopts, "driver", + (media == MEDIA_CDROM) ? "xen-cdrom" : "xen-disk", + &error_abort); + qemu_opt_set(devopts, "drive", qdict_get_str(bs_opts, "id"), + &error_abort); } filename = qemu_opt_get(legacy_opts, "file"); diff --git a/hw/block/xen-block.c b/hw/block/xen-block.c index 9262338535..20fa783cbe 100644 --- a/hw/block/xen-block.c +++ b/hw/block/xen-block.c @@ -34,6 +34,31 @@ static char *xen_block_get_name(XenDevice *xendev, Error **errp) XenBlockDevice *blockdev = XEN_BLOCK_DEVICE(xendev); XenBlockVdev *vdev = &blockdev->props.vdev; + if (blockdev->props.vdev.type == XEN_BLOCK_VDEV_TYPE_INVALID) { + char name[11]; + int disk = 0; + unsigned long idx; + + /* Find an unoccupied device name */ + while (disk < (1 << 20)) { + if (disk < (1 << 4)) { + idx = (202 << 8) | (disk << 4); + } else { + idx = (1 << 28) | (disk << 8); + } + snprintf(name, sizeof(name), "%lu", idx); + if (!xen_backend_exists("qdisk", name)) { + vdev->type = XEN_BLOCK_VDEV_TYPE_XVD; + vdev->partition = 0; + vdev->disk = disk; + vdev->number = idx; + return g_strdup(name); + } + disk++; + } + error_setg(errp, "cannot find device vdev for block device"); + return NULL; + } return g_strdup_printf("%lu", vdev->number); }