Message ID | 1610488352-18494-25-git-send-email-olekstysh@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | IOREQ feature (+ virtio-mmio) on Arm | expand |
Oleksandr Tyshchenko writes ("[PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration"): > This patch adds basic support for configuring and assisting virtio-disk > backend (emualator) which is intended to run out of Qemu and could be run > in any domain. Thanks. I think this is a very important feature. But I think this part at least needs some work. (That's not inappropriate for an RFC patch - so please don't feel you have done anything wrong. I hope you will find my comments constructive.) > An example of domain configuration (two disks are assigned to the guest, > the latter is in readonly mode): > > vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ] I can see why you have done it like this but I am concerned that this is not well-integrated with the existing disk configuration system. As a result not only is your new feature lacking support for many existing libxl features (block backend scripts, cdroms tagged as such, non-raw formats) that could otherwise be made available, but I think adding them later would be quite awkward. I it would be better to reuse (and, if necessary, adapt) the existing disk parsing logic in libxl, so that the syntax for your new vdisks = [...] parameter is the same as for the existing disks. Or even better, simply make your new kind of disk a new flag on the existing disk structure. > Also there is one suggestion from Wei Chen regarding a parameter for > domain config file which I haven't addressed yet. > [Just copy here what Wei said in V2 thread] > Can we keep use the same 'disk' parameter for virtio-disk, but add > an option like "model=virtio-disk"? > For example: > disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ] > Just like what Xen has done for x86 virtio-net. This is the same suggestion I make above, basically. It would be much better, yes. > Xenstore was chosen as a communication interface for the emulator > running in non-toolstack domain to be able to get configuration > either by reading Xenstore directly or by receiving command line > parameters (an updated 'xl devd' running in the same domain would > read Xenstore beforehand and call backend executable with the > required arguments). I was surprised to read this because I would expect that qemu upstream would be resistant to this approach. As far as the Xen Project's point of view goes, I think using xenstore for this is fine, but we would definitely want the support in upstream qemu. Can you please explain the status of the corresponding qemu feature ? (Ideally, in a formal way in the commit message.) > Please note, there is a real concern about VirtIO interrupts allocation. > [Just copy here what Stefano said in RFC thread] > > So, if we end up allocating let's say 6 virtio interrupts for a > domain, the chance of a clash with a physical interrupt of a > passthrough device is real. > > I am not entirely sure how to solve it, but these are a few ideas: > - choosing virtio interrupts that are less likely to conflict (maybe > 1000) > - make the virtio irq (optionally) configurable so that a user could > override the default irq and specify one that doesn't conflict > - implementing support for virq != pirq (even the xl interface doesn't > allow to specify the virq number for passthrough devices, see "irqs") I think here you have chosen to make the interupt configurable ? The implications are that a someone using this with passthrough would have to choose non-clashing IRQs ? In the non-passthrough case (ie, a guest with no passthrough devices), can your code choose an appropriate IRQ, if the user doesn't specify one ? I don't see any changes to the xl documentation in this patch. That would be the place to explain the irq stuff, and would be needed anyway. Indeed with anything substantial like your proposal, it is often a good idea to write (at least a sketch of) the documentation *first*, and then you know what you're aiming to implement. I have some comments on the code details but I think you will probably want to focus on the overall approach, first: > +#ifndef container_of > +#define container_of(ptr, type, member) ({ \ > + typeof( ((type *)0)->member ) *__mptr = (ptr); \ > + (type *)( (char *)__mptr - offsetof(type,member) );}) > +#endif Please use the existing CONTAINER_OF which we have already. > static const char *gicv_to_string(libxl_gic_version gic_version) > { > switch (gic_version) { > @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, > vuart_enabled = true; > } > > - /* > - * XXX: Handle properly virtio > - * A proper solution would be the toolstack to allocate the interrupts > - * used by each virtio backend and let the backend now which one is used > - */ > if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) { > - nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1; > + uint64_t virtio_base; > + libxl_device_virtio_disk *virtio_disk; > + > + virtio_base = GUEST_VIRTIO_MMIO_BASE; > virtio_irq = GUEST_VIRTIO_MMIO_SPI; I would like to see a review of these changes to virtio handling by someone who understands virtio. > +static int libxl__device_virtio_disk_setdefault(libxl__gc *gc, uint32_t domid, > + libxl_device_virtio_disk *virtio_disk, > + bool hotplug) > +{ > + return libxl__resolve_domid(gc, virtio_disk->backend_domname, > + &virtio_disk->backend_domid); There are some line length problems here. I haven't reviewed your parsing code because I think this ought to be done as an option or addition to with the existing disk spec parsing. > diff --git a/tools/xl/xl_virtio_disk.c b/tools/xl/xl_virtio_disk.c > new file mode 100644 > index 0000000..808a7da > --- /dev/null > +++ b/tools/xl/xl_virtio_disk.c > @@ -0,0 +1,46 @@ > +/* > + * Copyright (C) 2020 EPAM Systems Inc. > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU Lesser General Public License as published > + * by the Free Software Foundation; version 2.1 only. with the special > + * exception on linking described in file LICENSE. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > + * GNU Lesser General Public License for more details. > + */ > + > +#include <stdlib.h> > + > +#include <libxl.h> > +#include <libxl_utils.h> > +#include <libxlutil.h> > + > +#include "xl.h" > +#include "xl_utils.h" > +#include "xl_parse.h" > + > +int main_virtio_diskattach(int argc, char **argv) > +{ > + return 0; > +} > + > +int main_virtio_disklist(int argc, char **argv) > +{ > + return 0; > +} > + > +int main_virtio_diskdetach(int argc, char **argv) > +{ > + return 0; > +} This seems to be a stray early test file left over in the patch ? Thanks, Ian.
Hi Oleksandr, On 12/01/2021 21:52, Oleksandr Tyshchenko wrote: > From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> > > This patch adds basic support for configuring and assisting virtio-disk > backend (emualator) which is intended to run out of Qemu and could be run > in any domain. > > Xenstore was chosen as a communication interface for the emulator running > in non-toolstack domain to be able to get configuration either by reading > Xenstore directly or by receiving command line parameters (an updated 'xl devd' > running in the same domain would read Xenstore beforehand and call backend > executable with the required arguments). > > An example of domain configuration (two disks are assigned to the guest, > the latter is in readonly mode): > > vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ] > > Where per-disk Xenstore entries are: > - filename and readonly flag (configured via "vdisk" property) > - base and irq (allocated dynamically) > > Besides handling 'visible' params described in configuration file, > patch also allocates virtio-mmio specific ones for each device and > writes them into Xenstore. virtio-mmio params (irq and base) are > unique per guest domain, they allocated at the domain creation time > and passed through to the emulator. Each VirtIO device has at least > one pair of these params. > > TODO: > 1. An extra "virtio" property could be removed. > 2. Update documentation. > > Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> > [On Arm only] > Tested-by: Wei Chen <Wei.Chen@arm.com> > > --- > Changes RFC -> V1: > - no changes > > Changes V1 -> V2: > - rebase according to the new location of libxl_virtio_disk.c > > Changes V2 -> V3: > - no changes > > Changes V3 -> V4: > - rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT > > Please note, there is a real concern about VirtIO interrupts allocation. > [Just copy here what Stefano said in RFC thread] > > So, if we end up allocating let's say 6 virtio interrupts for a domain, > the chance of a clash with a physical interrupt of a passthrough device is real. For the first version, I think a static approach is fine because it doesn't bind us to anything yet (there is no interface change). We can refine it on follow-ups as we figure out how virtio is going to be used in the field. > > I am not entirely sure how to solve it, but these are a few ideas: > - choosing virtio interrupts that are less likely to conflict (maybe > 1000) Well, we only support 988 interrupts :). However, we will waste some memory in the vGIC structure (we would need to allocate memory for the 988 interrupts) if you chose an interrupt towards then end. > - make the virtio irq (optionally) configurable so that a user could > override the default irq and specify one that doesn't conflict This is not very ideal because it makes the use of virtio quite unfriendly with passthrough. Note that platform device passthrough is already unfriendly, but I am thinking PCI :). > - implementing support for virq != pirq (even the xl interface doesn't > allow to specify the virq number for passthrough devices, see "irqs") I can't remember whether I had a reason to not support virq != pirq when this was initially implemented. This is one possibility, but it is as unfriendly as the previous option. I will add a 4th one: - Automatically allocate the virtio IRQ. This should be possible to do it without too much trouble as we know in advance which IRQs will be passthrough. My preference is the 4th one, that said we may also want to pick either 2 or 3 to give some flexibility to an admin if they wish to get their hand dirty. > > Also there is one suggestion from Wei Chen regarding a parameter for domain > config file which I haven't addressed yet. > [Just copy here what Wei said in V2 thread] > Can we keep use the same 'disk' parameter for virtio-disk, but add an option like > "model=virtio-disk"? > For example: > disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ] > Just like what Xen has done for x86 virtio-net. > --- > tools/libs/light/Makefile | 1 + > tools/libs/light/libxl_arm.c | 56 ++++++++++++--- > tools/libs/light/libxl_create.c | 1 + > tools/libs/light/libxl_internal.h | 1 + > tools/libs/light/libxl_types.idl | 15 ++++ > tools/libs/light/libxl_types_internal.idl | 1 + > tools/libs/light/libxl_virtio_disk.c | 109 ++++++++++++++++++++++++++++ > tools/xl/Makefile | 2 +- > tools/xl/xl.h | 3 + > tools/xl/xl_cmdtable.c | 15 ++++ > tools/xl/xl_parse.c | 115 ++++++++++++++++++++++++++++++ > tools/xl/xl_virtio_disk.c | 46 ++++++++++++ > 12 files changed, 354 insertions(+), 11 deletions(-) > create mode 100644 tools/libs/light/libxl_virtio_disk.c > create mode 100644 tools/xl/xl_virtio_disk.c > > diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile > index 68f6fa3..ccc91b9 100644 > --- a/tools/libs/light/Makefile > +++ b/tools/libs/light/Makefile > @@ -115,6 +115,7 @@ SRCS-y += libxl_genid.c > SRCS-y += _libxl_types.c > SRCS-y += libxl_flask.c > SRCS-y += _libxl_types_internal.c > +SRCS-y += libxl_virtio_disk.c > > ifeq ($(CONFIG_LIBNL),y) > CFLAGS_LIBXL += $(LIBNL3_CFLAGS) > diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c > index 588ee5a..9eb3022 100644 > --- a/tools/libs/light/libxl_arm.c > +++ b/tools/libs/light/libxl_arm.c > @@ -8,6 +8,12 @@ > #include <assert.h> > #include <xen/device_tree_defs.h> > > +#ifndef container_of > +#define container_of(ptr, type, member) ({ \ > + typeof( ((type *)0)->member ) *__mptr = (ptr); \ > + (type *)( (char *)__mptr - offsetof(type,member) );}) > +#endif > + > static const char *gicv_to_string(libxl_gic_version gic_version) > { > switch (gic_version) { > @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, > vuart_enabled = true; > } > > - /* > - * XXX: Handle properly virtio > - * A proper solution would be the toolstack to allocate the interrupts > - * used by each virtio backend and let the backend now which one is used > - */ Ok, so you added some code in patch #23 that is going to be mostly dropped here. I think you want to rethink how you do the split here. One possible approach would be to have a patch which adds the infrastructe but no call. It would contain: 1) Allocate a space in the virtio region and an interrupt 2) Create the bindings. Those helpers can then be called in this patch. > if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) { It feels to me that this parameter is not necessary. You can easily infer it based whether you have a virtio disks attached or not. > - nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1; > + uint64_t virtio_base; > + libxl_device_virtio_disk *virtio_disk; > + > + virtio_base = GUEST_VIRTIO_MMIO_BASE; > virtio_irq = GUEST_VIRTIO_MMIO_SPI; Looking at patch #23, you defined a single SPI and a region that can only fit virtio device. However, here, you are going to define multiple virtio devices. I think you want to define the following: - GUEST_VIRTIO_MMIO_BASE: Base address of the virtio window - GUEST_VIRTIO_MMIO_SIZE: Full length of the virtio window (may contain multiple devices) - GUEST_VIRTIO_SPI_FIRST: First SPI reserved for virtio - GUEST_VIRTIO_SPI_LAST: Last SPI reserved for virtio The per-device size doesn't need to be defined in arch-arm.h. Instead, I would only define internally (unless we can use a virtio.h header from Linux?). > + > + if (!d_config->num_virtio_disks) { > + LOG(ERROR, "Virtio is enabled, but no Virtio devices present\n"); > + return ERROR_FAIL; > + } > + virtio_disk = &d_config->virtio_disks[0]; > + > + for (i = 0; i < virtio_disk->num_disks; i++) { > + virtio_disk->disks[i].base = virtio_base; > + virtio_disk->disks[i].irq = virtio_irq; > + > + LOG(DEBUG, "Allocate Virtio MMIO params: IRQ %u BASE 0x%"PRIx64, > + virtio_irq, virtio_base); > + > + virtio_irq ++; NIT: We usually don't have space before ++ or ... > + virtio_base += GUEST_VIRTIO_MMIO_SIZE; > + } > + virtio_irq --; ... --; > + > + nr_spis += (virtio_irq - 32) + 1; > virtio_enabled = true; > } > [...] > diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c > index 2a3364b..054a0c9 100644 > --- a/tools/xl/xl_parse.c > +++ b/tools/xl/xl_parse.c > @@ -1204,6 +1204,120 @@ out: > if (rc) exit(EXIT_FAILURE); > } > > +#define MAX_VIRTIO_DISKS 4 May I ask why this is hardcoded to 4? Cheers,
Hi Ian On 14.01.21 19:20, Ian Jackson wrote: > Oleksandr Tyshchenko writes ("[PATCH V4 24/24] [RFC] libxl: Add support for virtio-disk configuration"): >> This patch adds basic support for configuring and assisting virtio-disk >> backend (emualator) which is intended to run out of Qemu and could be run >> in any domain. > Thanks. I think this is a very important feature. But I think this > part at least needs some work. (That's not inappropriate for an RFC > patch - so please don't feel you have done anything wrong. I hope you > will find my comments constructive.) ok > > >> An example of domain configuration (two disks are assigned to the guest, >> the latter is in readonly mode): >> >> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ] > I can see why you have done it like this but I am concerned that this > is not well-integrated with the existing disk configuration system. > > As a result not only is your new feature lacking support for many > existing libxl features (block backend scripts, cdroms tagged as such, > non-raw formats) that could otherwise be made available, but I think > adding them later would be quite awkward. > > I it would be better to reuse (and, if necessary, adapt) the existing > disk parsing logic in libxl, so that the syntax for your new vdisks = > [...] parameter is the same as for the existing disks. Or even > better, simply make your new kind of disk a new flag on the existing > disk structure. I got your point and agree. Almost the same suggestion (to reuse existing disk parameter rather than introduce new one) was proposed by Wei. This is not forgotten, but in my TODO list to investigate (and implement). I will come up with clarifying questions if any. > >> Also there is one suggestion from Wei Chen regarding a parameter for >> domain config file which I haven't addressed yet. >> [Just copy here what Wei said in V2 thread] >> Can we keep use the same 'disk' parameter for virtio-disk, but add >> an option like "model=virtio-disk"? >> For example: >> disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ] >> Just like what Xen has done for x86 virtio-net. > This is the same suggestion I make above, basically. It would be much > better, yes. ok > > >> Xenstore was chosen as a communication interface for the emulator >> running in non-toolstack domain to be able to get configuration >> either by reading Xenstore directly or by receiving command line >> parameters (an updated 'xl devd' running in the same domain would >> read Xenstore beforehand and call backend executable with the >> required arguments). > I was surprised to read this because I would expect that qemu upstream > would be resistant to this approach. As far as the Xen Project's > point of view goes, I think using xenstore for this is fine, but we > would definitely want the support in upstream qemu. > > Can you please explain the status of the corresponding qemu feature ? > (Ideally, in a formal way in the commit message.) I am afraid, I don't entirely get what is "the corresponding qemu feature"? I haven't looked at the Qemu direction yet (we don't use Qemu in our target system), so have no ideas what should be done there (if indeed needed) to support standalone "out-of-Qemu" virtio backend. Could you please clarify what support is needed in Qemu for that purpose? > >> Please note, there is a real concern about VirtIO interrupts allocation. >> [Just copy here what Stefano said in RFC thread] >> >> So, if we end up allocating let's say 6 virtio interrupts for a >> domain, the chance of a clash with a physical interrupt of a >> passthrough device is real. >> >> I am not entirely sure how to solve it, but these are a few ideas: >> - choosing virtio interrupts that are less likely to conflict (maybe > 1000) >> - make the virtio irq (optionally) configurable so that a user could >> override the default irq and specify one that doesn't conflict >> - implementing support for virq != pirq (even the xl interface doesn't >> allow to specify the virq number for passthrough devices, see "irqs") > I think here you have chosen to make the interupt configurable ? > > The implications are that a someone using this with passthrough would > have to choose non-clashing IRQs ? Yes > In the non-passthrough case (ie, a > guest with no passthrough devices), can your code choose an > appropriate IRQ, if the user doesn't specify one ? Yes Personally I think, it would be good if we could come up with a way _without_ user involvement at all. > > > I don't see any changes to the xl documentation in this patch. That > would be the place to explain the irq stuff, and would be needed > anyway. Indeed with anything substantial like your proposal, it is > often a good idea to write (at least a sketch of) the documentation > *first*, and then you know what you're aiming to implement. Indeed, this ought to be documented. This is on my TODO list, will definitely update in the next version. > > > I have some comments on the code details but I think you will probably > want to focus on the overall approach, first: > >> +#ifndef container_of >> +#define container_of(ptr, type, member) ({ \ >> + typeof( ((type *)0)->member ) *__mptr = (ptr); \ >> + (type *)( (char *)__mptr - offsetof(type,member) );}) >> +#endif > Please use the existing CONTAINER_OF which we have already. oh, it is present, great. I failed to find something suitable (for some reason) when writing that code) Will reuse. > >> static const char *gicv_to_string(libxl_gic_version gic_version) >> { >> switch (gic_version) { >> @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, >> vuart_enabled = true; >> } >> >> - /* >> - * XXX: Handle properly virtio >> - * A proper solution would be the toolstack to allocate the interrupts >> - * used by each virtio backend and let the backend now which one is used >> - */ >> if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) { >> - nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1; >> + uint64_t virtio_base; >> + libxl_device_virtio_disk *virtio_disk; >> + >> + virtio_base = GUEST_VIRTIO_MMIO_BASE; >> virtio_irq = GUEST_VIRTIO_MMIO_SPI; > I would like to see a review of these changes to virtio handling by > someone who understands virtio. Makes sense. > >> +static int libxl__device_virtio_disk_setdefault(libxl__gc *gc, uint32_t domid, >> + libxl_device_virtio_disk *virtio_disk, >> + bool hotplug) >> +{ >> + return libxl__resolve_domid(gc, virtio_disk->backend_domname, >> + &virtio_disk->backend_domid); > There are some line length problems here. Will correct. > > I haven't reviewed your parsing code because I think this ought to be > done as an option or addition to with the existing disk spec parsing. I got it, fair enough. > >> diff --git a/tools/xl/xl_virtio_disk.c b/tools/xl/xl_virtio_disk.c >> new file mode 100644 >> index 0000000..808a7da >> --- /dev/null >> +++ b/tools/xl/xl_virtio_disk.c >> @@ -0,0 +1,46 @@ >> +/* >> + * Copyright (C) 2020 EPAM Systems Inc. >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU Lesser General Public License as published >> + * by the Free Software Foundation; version 2.1 only. with the special >> + * exception on linking described in file LICENSE. >> + * >> + * This program is distributed in the hope that it will be useful, >> + * but WITHOUT ANY WARRANTY; without even the implied warranty of >> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the >> + * GNU Lesser General Public License for more details. >> + */ >> + >> +#include <stdlib.h> >> + >> +#include <libxl.h> >> +#include <libxl_utils.h> >> +#include <libxlutil.h> >> + >> +#include "xl.h" >> +#include "xl_utils.h" >> +#include "xl_parse.h" >> + >> +int main_virtio_diskattach(int argc, char **argv) >> +{ >> + return 0; >> +} >> + >> +int main_virtio_disklist(int argc, char **argv) >> +{ >> + return 0; >> +} >> + >> +int main_virtio_diskdetach(int argc, char **argv) >> +{ >> + return 0; >> +} > This seems to be a stray early test file left over in the patch ? Will implement these bits. > > > Thanks, > Ian.
On 16.01.21 00:01, Julien Grall wrote: > Hi Oleksandr, Hi Julien > > On 12/01/2021 21:52, Oleksandr Tyshchenko wrote: >> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >> >> This patch adds basic support for configuring and assisting virtio-disk >> backend (emualator) which is intended to run out of Qemu and could be >> run >> in any domain. >> >> Xenstore was chosen as a communication interface for the emulator >> running >> in non-toolstack domain to be able to get configuration either by >> reading >> Xenstore directly or by receiving command line parameters (an updated >> 'xl devd' >> running in the same domain would read Xenstore beforehand and call >> backend >> executable with the required arguments). >> >> An example of domain configuration (two disks are assigned to the guest, >> the latter is in readonly mode): >> >> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ] >> >> Where per-disk Xenstore entries are: >> - filename and readonly flag (configured via "vdisk" property) >> - base and irq (allocated dynamically) >> >> Besides handling 'visible' params described in configuration file, >> patch also allocates virtio-mmio specific ones for each device and >> writes them into Xenstore. virtio-mmio params (irq and base) are >> unique per guest domain, they allocated at the domain creation time >> and passed through to the emulator. Each VirtIO device has at least >> one pair of these params. >> >> TODO: >> 1. An extra "virtio" property could be removed. >> 2. Update documentation. >> >> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >> [On Arm only] >> Tested-by: Wei Chen <Wei.Chen@arm.com> >> >> --- >> Changes RFC -> V1: >> - no changes >> >> Changes V1 -> V2: >> - rebase according to the new location of libxl_virtio_disk.c >> >> Changes V2 -> V3: >> - no changes >> >> Changes V3 -> V4: >> - rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT >> >> Please note, there is a real concern about VirtIO interrupts allocation. >> [Just copy here what Stefano said in RFC thread] >> >> So, if we end up allocating let's say 6 virtio interrupts for a domain, >> the chance of a clash with a physical interrupt of a passthrough >> device is real. > > For the first version, I think a static approach is fine because it > doesn't bind us to anything yet (there is no interface change). We can > refine it on follow-ups as we figure out how virtio is going to be > used in the field. > >> >> I am not entirely sure how to solve it, but these are a few ideas: >> - choosing virtio interrupts that are less likely to conflict (maybe >> > 1000) > > Well, we only support 988 interrupts :). However, we will waste some > memory in the vGIC structure (we would need to allocate memory for the > 988 interrupts) if you chose an interrupt towards then end. > >> - make the virtio irq (optionally) configurable so that a user could >> override the default irq and specify one that doesn't conflict > > This is not very ideal because it makes the use of virtio quite > unfriendly with passthrough. Note that platform device passthrough is > already unfriendly, but I am thinking PCI :). > >> - implementing support for virq != pirq (even the xl interface doesn't >> allow to specify the virq number for passthrough devices, see "irqs") > I can't remember whether I had a reason to not support virq != pirq > when this was initially implemented. This is one possibility, but it > is as unfriendly as the previous option. > > I will add a 4th one: > - Automatically allocate the virtio IRQ. This should be possible to > do it without too much trouble as we know in advance which IRQs will > be passthrough. As I understand the IRQs for passthrough are described in "irq" property and stored in d_config->b_info.irqs[i], so yes we know in advance which IRQs will be used for passthrough and we will be able to choose non-clashed ones (iterating over all IRQs in a reserved range) for the virtio devices. The question is how many IRQs should be reserved. > > My preference is the 4th one, that said we may also want to pick > either 2 or 3 to give some flexibility to an admin if they wish to get > their hand dirty. Personally I would be ok with 4th, but without admin involvement. > > >> >> Also there is one suggestion from Wei Chen regarding a parameter for >> domain >> config file which I haven't addressed yet. >> [Just copy here what Wei said in V2 thread] >> Can we keep use the same 'disk' parameter for virtio-disk, but add an >> option like >> "model=virtio-disk"? >> For example: >> disk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3,model=virtio-disk' ] >> Just like what Xen has done for x86 virtio-net. >> --- >> tools/libs/light/Makefile | 1 + >> tools/libs/light/libxl_arm.c | 56 ++++++++++++--- >> tools/libs/light/libxl_create.c | 1 + >> tools/libs/light/libxl_internal.h | 1 + >> tools/libs/light/libxl_types.idl | 15 ++++ >> tools/libs/light/libxl_types_internal.idl | 1 + >> tools/libs/light/libxl_virtio_disk.c | 109 >> ++++++++++++++++++++++++++++ >> tools/xl/Makefile | 2 +- >> tools/xl/xl.h | 3 + >> tools/xl/xl_cmdtable.c | 15 ++++ >> tools/xl/xl_parse.c | 115 >> ++++++++++++++++++++++++++++++ >> tools/xl/xl_virtio_disk.c | 46 ++++++++++++ >> 12 files changed, 354 insertions(+), 11 deletions(-) >> create mode 100644 tools/libs/light/libxl_virtio_disk.c >> create mode 100644 tools/xl/xl_virtio_disk.c >> >> diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile >> index 68f6fa3..ccc91b9 100644 >> --- a/tools/libs/light/Makefile >> +++ b/tools/libs/light/Makefile >> @@ -115,6 +115,7 @@ SRCS-y += libxl_genid.c >> SRCS-y += _libxl_types.c >> SRCS-y += libxl_flask.c >> SRCS-y += _libxl_types_internal.c >> +SRCS-y += libxl_virtio_disk.c >> ifeq ($(CONFIG_LIBNL),y) >> CFLAGS_LIBXL += $(LIBNL3_CFLAGS) >> diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c >> index 588ee5a..9eb3022 100644 >> --- a/tools/libs/light/libxl_arm.c >> +++ b/tools/libs/light/libxl_arm.c >> @@ -8,6 +8,12 @@ >> #include <assert.h> >> #include <xen/device_tree_defs.h> >> +#ifndef container_of >> +#define container_of(ptr, type, member) ({ \ >> + typeof( ((type *)0)->member ) *__mptr = (ptr); \ >> + (type *)( (char *)__mptr - offsetof(type,member) );}) >> +#endif >> + >> static const char *gicv_to_string(libxl_gic_version gic_version) >> { >> switch (gic_version) { >> @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, >> vuart_enabled = true; >> } >> - /* >> - * XXX: Handle properly virtio >> - * A proper solution would be the toolstack to allocate the >> interrupts >> - * used by each virtio backend and let the backend now which one >> is used >> - */ > > Ok, so you added some code in patch #23 that is going to be mostly > dropped here. I think you want to rethink how you do the split here. > > One possible approach would be to have a patch which adds the > infrastructe but no call. It would contain: > 1) Allocate a space in the virtio region and an interrupt > 2) Create the bindings. > > Those helpers can then be called in this patch. Sounds reasonable. > > >> if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) { > > It feels to me that this parameter is not necessary. You can easily > infer it based whether you have a virtio disks attached or not. Yes! > > >> - nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1; >> + uint64_t virtio_base; >> + libxl_device_virtio_disk *virtio_disk; >> + >> + virtio_base = GUEST_VIRTIO_MMIO_BASE; >> virtio_irq = GUEST_VIRTIO_MMIO_SPI; > > Looking at patch #23, you defined a single SPI and a region that can > only fit virtio device. However, here, you are going to define > multiple virtio devices. > > I think you want to define the following: > > - GUEST_VIRTIO_MMIO_BASE: Base address of the virtio window > - GUEST_VIRTIO_MMIO_SIZE: Full length of the virtio window (may > contain multiple devices) > - GUEST_VIRTIO_SPI_FIRST: First SPI reserved for virtio > - GUEST_VIRTIO_SPI_LAST: Last SPI reserved for virtio > > The per-device size doesn't need to be defined in arch-arm.h. Instead, > I would only define internally (unless we can use a virtio.h header > from Linux?). I think I got the idea. What are the preferences for these values? > > >> + >> + if (!d_config->num_virtio_disks) { >> + LOG(ERROR, "Virtio is enabled, but no Virtio devices >> present\n"); >> + return ERROR_FAIL; >> + } >> + virtio_disk = &d_config->virtio_disks[0]; >> + >> + for (i = 0; i < virtio_disk->num_disks; i++) { >> + virtio_disk->disks[i].base = virtio_base; >> + virtio_disk->disks[i].irq = virtio_irq; >> + >> + LOG(DEBUG, "Allocate Virtio MMIO params: IRQ %u BASE >> 0x%"PRIx64, >> + virtio_irq, virtio_base); >> + >> + virtio_irq ++; > > NIT: We usually don't have space before ++ or ... ok > >> + virtio_base += GUEST_VIRTIO_MMIO_SIZE; >> + } >> + virtio_irq --; > > ... --; ok > >> + >> + nr_spis += (virtio_irq - 32) + 1; >> virtio_enabled = true; >> } > > [...] > >> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c >> index 2a3364b..054a0c9 100644 >> --- a/tools/xl/xl_parse.c >> +++ b/tools/xl/xl_parse.c >> @@ -1204,6 +1204,120 @@ out: >> if (rc) exit(EXIT_FAILURE); >> } >> +#define MAX_VIRTIO_DISKS 4 > > May I ask why this is hardcoded to 4? I found 4 as a reasonable value for the initial implementation. This means how many disks the single device instance can handle. Do you think we need to change it? > > Cheers, >
Hi Oleksandr, On 18/01/2021 08:32, Oleksandr wrote: > > On 16.01.21 00:01, Julien Grall wrote: >> Hi Oleksandr, > > Hi Julien > > >> >> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote: >>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >>> >>> This patch adds basic support for configuring and assisting virtio-disk >>> backend (emualator) which is intended to run out of Qemu and could be >>> run >>> in any domain. >>> >>> Xenstore was chosen as a communication interface for the emulator >>> running >>> in non-toolstack domain to be able to get configuration either by >>> reading >>> Xenstore directly or by receiving command line parameters (an updated >>> 'xl devd' >>> running in the same domain would read Xenstore beforehand and call >>> backend >>> executable with the required arguments). >>> >>> An example of domain configuration (two disks are assigned to the guest, >>> the latter is in readonly mode): >>> >>> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ] >>> >>> Where per-disk Xenstore entries are: >>> - filename and readonly flag (configured via "vdisk" property) >>> - base and irq (allocated dynamically) >>> >>> Besides handling 'visible' params described in configuration file, >>> patch also allocates virtio-mmio specific ones for each device and >>> writes them into Xenstore. virtio-mmio params (irq and base) are >>> unique per guest domain, they allocated at the domain creation time >>> and passed through to the emulator. Each VirtIO device has at least >>> one pair of these params. >>> >>> TODO: >>> 1. An extra "virtio" property could be removed. >>> 2. Update documentation. >>> >>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >>> [On Arm only] >>> Tested-by: Wei Chen <Wei.Chen@arm.com> >>> >>> --- >>> Changes RFC -> V1: >>> - no changes >>> >>> Changes V1 -> V2: >>> - rebase according to the new location of libxl_virtio_disk.c >>> >>> Changes V2 -> V3: >>> - no changes >>> >>> Changes V3 -> V4: >>> - rebase according to the new argument for DEFINE_DEVICE_TYPE_STRUCT >>> >>> Please note, there is a real concern about VirtIO interrupts allocation. >>> [Just copy here what Stefano said in RFC thread] >>> >>> So, if we end up allocating let's say 6 virtio interrupts for a domain, >>> the chance of a clash with a physical interrupt of a passthrough >>> device is real. >> >> For the first version, I think a static approach is fine because it >> doesn't bind us to anything yet (there is no interface change). We can >> refine it on follow-ups as we figure out how virtio is going to be >> used in the field. >> >>> >>> I am not entirely sure how to solve it, but these are a few ideas: >>> - choosing virtio interrupts that are less likely to conflict (maybe >>> > 1000) >> >> Well, we only support 988 interrupts :). However, we will waste some >> memory in the vGIC structure (we would need to allocate memory for the >> 988 interrupts) if you chose an interrupt towards then end. >> >>> - make the virtio irq (optionally) configurable so that a user could >>> override the default irq and specify one that doesn't conflict >> >> This is not very ideal because it makes the use of virtio quite >> unfriendly with passthrough. Note that platform device passthrough is >> already unfriendly, but I am thinking PCI :). >> >>> - implementing support for virq != pirq (even the xl interface doesn't >>> allow to specify the virq number for passthrough devices, see "irqs") >> I can't remember whether I had a reason to not support virq != pirq >> when this was initially implemented. This is one possibility, but it >> is as unfriendly as the previous option. >> >> I will add a 4th one: >> - Automatically allocate the virtio IRQ. This should be possible to >> do it without too much trouble as we know in advance which IRQs will >> be passthrough. > As I understand the IRQs for passthrough are described in "irq" property > and stored in d_config->b_info.irqs[i], so yes we know in advance which > IRQs will be used for passthrough > and we will be able to choose non-clashed ones (iterating over all IRQs > in a reserved range) for the virtio devices. The question is how many > IRQs should be reserved. If we are automatically selecting the interrupt for virtio devices, then I don't think we need to reserve a batch. Instead, we can allocate one by one as we create the virtio device in libxl. For the static case, then a range of 10-20 might be sufficient for now. [...] >>> - nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1; >>> + uint64_t virtio_base; >>> + libxl_device_virtio_disk *virtio_disk; >>> + >>> + virtio_base = GUEST_VIRTIO_MMIO_BASE; >>> virtio_irq = GUEST_VIRTIO_MMIO_SPI; >> >> Looking at patch #23, you defined a single SPI and a region that can >> only fit virtio device. However, here, you are going to define >> multiple virtio devices. >> >> I think you want to define the following: >> >> - GUEST_VIRTIO_MMIO_BASE: Base address of the virtio window >> - GUEST_VIRTIO_MMIO_SIZE: Full length of the virtio window (may >> contain multiple devices) >> - GUEST_VIRTIO_SPI_FIRST: First SPI reserved for virtio >> - GUEST_VIRTIO_SPI_LAST: Last SPI reserved for virtio >> >> The per-device size doesn't need to be defined in arch-arm.h. Instead, >> I would only define internally (unless we can use a virtio.h header >> from Linux?). > > I think I got the idea. What are the preferences for these values? I have suggested some values in patch #23. Let me know what you think there. [...] >>> + >>> + nr_spis += (virtio_irq - 32) + 1; >>> virtio_enabled = true; >>> } >> >> [...] >> >>> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c >>> index 2a3364b..054a0c9 100644 >>> --- a/tools/xl/xl_parse.c >>> +++ b/tools/xl/xl_parse.c >>> @@ -1204,6 +1204,120 @@ out: >>> if (rc) exit(EXIT_FAILURE); >>> } >>> +#define MAX_VIRTIO_DISKS 4 >> >> May I ask why this is hardcoded to 4? > > I found 4 as a reasonable value for the initial implementation. > This means how many disks the single device instance can handle. Right, the question is why do you need to impose a limit in xl? Looking at the code, the value is only used in: + if (virtio_disk->num_disks > MAX_VIRTIO_DISKS) { + fprintf(stderr, "vdisk: currently only %d disks are supported", + MAX_VIRTIO_DISKS); The rest of the code (at list in libxl/xl) seems to be completely agnostic to the number of disks. So it feels strange to me to impose what looks like an arbitrary limit in the tools. Cheers,
On 20.01.21 19:05, Julien Grall wrote: > Hi Oleksandr, Hi Julien Sorry for the late response. > > On 18/01/2021 08:32, Oleksandr wrote: >> >> On 16.01.21 00:01, Julien Grall wrote: >>> Hi Oleksandr, >> >> Hi Julien >> >> >>> >>> On 12/01/2021 21:52, Oleksandr Tyshchenko wrote: >>>> From: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >>>> >>>> This patch adds basic support for configuring and assisting >>>> virtio-disk >>>> backend (emualator) which is intended to run out of Qemu and could >>>> be run >>>> in any domain. >>>> >>>> Xenstore was chosen as a communication interface for the emulator >>>> running >>>> in non-toolstack domain to be able to get configuration either by >>>> reading >>>> Xenstore directly or by receiving command line parameters (an >>>> updated 'xl devd' >>>> running in the same domain would read Xenstore beforehand and call >>>> backend >>>> executable with the required arguments). >>>> >>>> An example of domain configuration (two disks are assigned to the >>>> guest, >>>> the latter is in readonly mode): >>>> >>>> vdisk = [ 'backend=DomD, disks=rw:/dev/mmcblk0p3;ro:/dev/mmcblk1p3' ] >>>> >>>> Where per-disk Xenstore entries are: >>>> - filename and readonly flag (configured via "vdisk" property) >>>> - base and irq (allocated dynamically) >>>> >>>> Besides handling 'visible' params described in configuration file, >>>> patch also allocates virtio-mmio specific ones for each device and >>>> writes them into Xenstore. virtio-mmio params (irq and base) are >>>> unique per guest domain, they allocated at the domain creation time >>>> and passed through to the emulator. Each VirtIO device has at least >>>> one pair of these params. >>>> >>>> TODO: >>>> 1. An extra "virtio" property could be removed. >>>> 2. Update documentation. >>>> >>>> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> >>>> [On Arm only] >>>> Tested-by: Wei Chen <Wei.Chen@arm.com> >>>> >>>> --- >>>> Changes RFC -> V1: >>>> - no changes >>>> >>>> Changes V1 -> V2: >>>> - rebase according to the new location of libxl_virtio_disk.c >>>> >>>> Changes V2 -> V3: >>>> - no changes >>>> >>>> Changes V3 -> V4: >>>> - rebase according to the new argument for >>>> DEFINE_DEVICE_TYPE_STRUCT >>>> >>>> Please note, there is a real concern about VirtIO interrupts >>>> allocation. >>>> [Just copy here what Stefano said in RFC thread] >>>> >>>> So, if we end up allocating let's say 6 virtio interrupts for a >>>> domain, >>>> the chance of a clash with a physical interrupt of a passthrough >>>> device is real. >>> >>> For the first version, I think a static approach is fine because it >>> doesn't bind us to anything yet (there is no interface change). We >>> can refine it on follow-ups as we figure out how virtio is going to >>> be used in the field. >>> >>>> >>>> I am not entirely sure how to solve it, but these are a few ideas: >>>> - choosing virtio interrupts that are less likely to conflict >>>> (maybe > 1000) >>> >>> Well, we only support 988 interrupts :). However, we will waste some >>> memory in the vGIC structure (we would need to allocate memory for >>> the 988 interrupts) if you chose an interrupt towards then end. >>> >>>> - make the virtio irq (optionally) configurable so that a user could >>>> override the default irq and specify one that doesn't conflict >>> >>> This is not very ideal because it makes the use of virtio quite >>> unfriendly with passthrough. Note that platform device passthrough >>> is already unfriendly, but I am thinking PCI :). >>> >>>> - implementing support for virq != pirq (even the xl interface doesn't >>>> allow to specify the virq number for passthrough devices, see >>>> "irqs") >>> I can't remember whether I had a reason to not support virq != pirq >>> when this was initially implemented. This is one possibility, but it >>> is as unfriendly as the previous option. >>> >>> I will add a 4th one: >>> - Automatically allocate the virtio IRQ. This should be possible >>> to do it without too much trouble as we know in advance which IRQs >>> will be passthrough. >> As I understand the IRQs for passthrough are described in "irq" >> property and stored in d_config->b_info.irqs[i], so yes we know in >> advance which IRQs will be used for passthrough >> and we will be able to choose non-clashed ones (iterating over all >> IRQs in a reserved range) for the virtio devices. The question is >> how many IRQs should be reserved. > > If we are automatically selecting the interrupt for virtio devices, > then I don't think we need to reserve a batch. Instead, we can > allocate one by one as we create the virtio device in libxl. Looks like, yes, the reserved range is not needed if we use 4th option. > > > For the static case, then a range of 10-20 might be sufficient for now. ok Thinking a bit more what approach to choose... I would tend to automatically allocate the virtio IRQ (4th option) rather than use static approach with reserved IRQs in order to eliminate the chance of a clash with a physical IRQs completely from the very beginning. From other side we can indeed use static approach (as simpler one) for now and then refine it when we have more understanding about the virtio usage. What do you think? > > > [...] > >>>> - nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1; >>>> + uint64_t virtio_base; >>>> + libxl_device_virtio_disk *virtio_disk; >>>> + >>>> + virtio_base = GUEST_VIRTIO_MMIO_BASE; >>>> virtio_irq = GUEST_VIRTIO_MMIO_SPI; >>> >>> Looking at patch #23, you defined a single SPI and a region that can >>> only fit virtio device. However, here, you are going to define >>> multiple virtio devices. >>> >>> I think you want to define the following: >>> >>> - GUEST_VIRTIO_MMIO_BASE: Base address of the virtio window >>> - GUEST_VIRTIO_MMIO_SIZE: Full length of the virtio window (may >>> contain multiple devices) >>> - GUEST_VIRTIO_SPI_FIRST: First SPI reserved for virtio >>> - GUEST_VIRTIO_SPI_LAST: Last SPI reserved for virtio >>> >>> The per-device size doesn't need to be defined in arch-arm.h. >>> Instead, I would only define internally (unless we can use a >>> virtio.h header from Linux?). >> >> I think I got the idea. What are the preferences for these values? > > I have suggested some values in patch #23. Let me know what you think > there. ok, thank you. I agree with the values. > > > [...] > >>>> + >>>> + nr_spis += (virtio_irq - 32) + 1; >>>> virtio_enabled = true; >>>> } >>> >>> [...] >>> >>>> diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c >>>> index 2a3364b..054a0c9 100644 >>>> --- a/tools/xl/xl_parse.c >>>> +++ b/tools/xl/xl_parse.c >>>> @@ -1204,6 +1204,120 @@ out: >>>> if (rc) exit(EXIT_FAILURE); >>>> } >>>> +#define MAX_VIRTIO_DISKS 4 >>> >>> May I ask why this is hardcoded to 4? >> >> I found 4 as a reasonable value for the initial implementation. >> This means how many disks the single device instance can handle. > > Right, the question is why do you need to impose a limit in xl? > > Looking at the code, the value is only used in: > > + if (virtio_disk->num_disks > MAX_VIRTIO_DISKS) { > + fprintf(stderr, "vdisk: currently only %d disks are > supported", > + MAX_VIRTIO_DISKS); > > The rest of the code (at list in libxl/xl) seems to be completely > agnostic to the number of disks. So it feels strange to me to impose > what looks like an arbitrary limit in the tools. Well, will drop this limit here. > > > Cheers, >
Hi Oleksandr, On 10/02/2021 09:02, Oleksandr wrote: > > On 20.01.21 19:05, Julien Grall wrote: > Thinking a bit more what approach to choose... > I would tend to automatically allocate the virtio IRQ (4th option) > rather than use static approach with reserved IRQs > in order to eliminate the chance of a clash with a physical IRQs > completely from the very beginning. From other side > we can indeed use static approach (as simpler one) for now and then > refine it when we have more understanding about the virtio usage. > What do you think? The static approach should be fine for now. Cheers,
diff --git a/tools/libs/light/Makefile b/tools/libs/light/Makefile index 68f6fa3..ccc91b9 100644 --- a/tools/libs/light/Makefile +++ b/tools/libs/light/Makefile @@ -115,6 +115,7 @@ SRCS-y += libxl_genid.c SRCS-y += _libxl_types.c SRCS-y += libxl_flask.c SRCS-y += _libxl_types_internal.c +SRCS-y += libxl_virtio_disk.c ifeq ($(CONFIG_LIBNL),y) CFLAGS_LIBXL += $(LIBNL3_CFLAGS) diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c index 588ee5a..9eb3022 100644 --- a/tools/libs/light/libxl_arm.c +++ b/tools/libs/light/libxl_arm.c @@ -8,6 +8,12 @@ #include <assert.h> #include <xen/device_tree_defs.h> +#ifndef container_of +#define container_of(ptr, type, member) ({ \ + typeof( ((type *)0)->member ) *__mptr = (ptr); \ + (type *)( (char *)__mptr - offsetof(type,member) );}) +#endif + static const char *gicv_to_string(libxl_gic_version gic_version) { switch (gic_version) { @@ -39,14 +45,32 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, vuart_enabled = true; } - /* - * XXX: Handle properly virtio - * A proper solution would be the toolstack to allocate the interrupts - * used by each virtio backend and let the backend now which one is used - */ if (libxl_defbool_val(d_config->b_info.arch_arm.virtio)) { - nr_spis += (GUEST_VIRTIO_MMIO_SPI - 32) + 1; + uint64_t virtio_base; + libxl_device_virtio_disk *virtio_disk; + + virtio_base = GUEST_VIRTIO_MMIO_BASE; virtio_irq = GUEST_VIRTIO_MMIO_SPI; + + if (!d_config->num_virtio_disks) { + LOG(ERROR, "Virtio is enabled, but no Virtio devices present\n"); + return ERROR_FAIL; + } + virtio_disk = &d_config->virtio_disks[0]; + + for (i = 0; i < virtio_disk->num_disks; i++) { + virtio_disk->disks[i].base = virtio_base; + virtio_disk->disks[i].irq = virtio_irq; + + LOG(DEBUG, "Allocate Virtio MMIO params: IRQ %u BASE 0x%"PRIx64, + virtio_irq, virtio_base); + + virtio_irq ++; + virtio_base += GUEST_VIRTIO_MMIO_SIZE; + } + virtio_irq --; + + nr_spis += (virtio_irq - 32) + 1; virtio_enabled = true; } @@ -70,8 +94,9 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, } /* The same check as for vpl011 */ - if (virtio_enabled && irq == virtio_irq) { - LOG(ERROR, "Physical IRQ %u conflicting with virtio SPI\n", irq); + if (virtio_enabled && + (irq >= GUEST_VIRTIO_MMIO_SPI && irq <= virtio_irq)) { + LOG(ERROR, "Physical IRQ %u conflicting with Virtio IRQ range\n", irq); return ERROR_FAIL; } @@ -1011,8 +1036,19 @@ next_resize: if (info->tee == LIBXL_TEE_TYPE_OPTEE) FDT( make_optee_node(gc, fdt) ); - if (libxl_defbool_val(info->arch_arm.virtio)) - FDT( make_virtio_mmio_node(gc, fdt, GUEST_VIRTIO_MMIO_BASE, GUEST_VIRTIO_MMIO_SPI) ); + if (libxl_defbool_val(info->arch_arm.virtio)) { + libxl_domain_config *d_config = + container_of(info, libxl_domain_config, b_info); + libxl_device_virtio_disk *virtio_disk = &d_config->virtio_disks[0]; + unsigned int i; + + for (i = 0; i < virtio_disk->num_disks; i++) { + uint64_t base = virtio_disk->disks[i].base; + uint32_t irq = virtio_disk->disks[i].irq; + + FDT( make_virtio_mmio_node(gc, fdt, base, irq) ); + } + } if (pfdt) FDT( copy_partial_fdt(gc, fdt, pfdt) ); diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c index 86f4a83..1734fcd 100644 --- a/tools/libs/light/libxl_create.c +++ b/tools/libs/light/libxl_create.c @@ -1821,6 +1821,7 @@ const libxl__device_type *device_type_tbl[] = { &libxl__dtdev_devtype, &libxl__vdispl_devtype, &libxl__vsnd_devtype, + &libxl__virtio_disk_devtype, NULL }; diff --git a/tools/libs/light/libxl_internal.h b/tools/libs/light/libxl_internal.h index c79523b..5edef85 100644 --- a/tools/libs/light/libxl_internal.h +++ b/tools/libs/light/libxl_internal.h @@ -3999,6 +3999,7 @@ extern const libxl__device_type libxl__vdispl_devtype; extern const libxl__device_type libxl__p9_devtype; extern const libxl__device_type libxl__pvcallsif_devtype; extern const libxl__device_type libxl__vsnd_devtype; +extern const libxl__device_type libxl__virtio_disk_devtype; extern const libxl__device_type *device_type_tbl[]; diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl index 839df86..2c40bc2 100644 --- a/tools/libs/light/libxl_types.idl +++ b/tools/libs/light/libxl_types.idl @@ -936,6 +936,20 @@ libxl_device_vsnd = Struct("device_vsnd", [ ("pcms", Array(libxl_vsnd_pcm, "num_vsnd_pcms")) ]) +libxl_virtio_disk_param = Struct("virtio_disk_param", [ + ("filename", string), + ("readonly", bool), + ("irq", uint32), + ("base", uint64), + ]) + +libxl_device_virtio_disk = Struct("device_virtio_disk", [ + ("backend_domid", libxl_domid), + ("backend_domname", string), + ("devid", libxl_devid), + ("disks", Array(libxl_virtio_disk_param, "num_disks")), + ]) + libxl_domain_config = Struct("domain_config", [ ("c_info", libxl_domain_create_info), ("b_info", libxl_domain_build_info), @@ -952,6 +966,7 @@ libxl_domain_config = Struct("domain_config", [ ("pvcallsifs", Array(libxl_device_pvcallsif, "num_pvcallsifs")), ("vdispls", Array(libxl_device_vdispl, "num_vdispls")), ("vsnds", Array(libxl_device_vsnd, "num_vsnds")), + ("virtio_disks", Array(libxl_device_virtio_disk, "num_virtio_disks")), # a channel manifests as a console with a name, # see docs/misc/channels.txt ("channels", Array(libxl_device_channel, "num_channels")), diff --git a/tools/libs/light/libxl_types_internal.idl b/tools/libs/light/libxl_types_internal.idl index 3593e21..8f71980 100644 --- a/tools/libs/light/libxl_types_internal.idl +++ b/tools/libs/light/libxl_types_internal.idl @@ -32,6 +32,7 @@ libxl__device_kind = Enumeration("device_kind", [ (14, "PVCALLS"), (15, "VSND"), (16, "VINPUT"), + (17, "VIRTIO_DISK"), ]) libxl__console_backend = Enumeration("console_backend", [ diff --git a/tools/libs/light/libxl_virtio_disk.c b/tools/libs/light/libxl_virtio_disk.c new file mode 100644 index 0000000..be769ad --- /dev/null +++ b/tools/libs/light/libxl_virtio_disk.c @@ -0,0 +1,109 @@ +/* + * Copyright (C) 2020 EPAM Systems Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as published + * by the Free Software Foundation; version 2.1 only. with the special + * exception on linking described in file LICENSE. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + */ + +#include "libxl_internal.h" + +static int libxl__device_virtio_disk_setdefault(libxl__gc *gc, uint32_t domid, + libxl_device_virtio_disk *virtio_disk, + bool hotplug) +{ + return libxl__resolve_domid(gc, virtio_disk->backend_domname, + &virtio_disk->backend_domid); +} + +static int libxl__virtio_disk_from_xenstore(libxl__gc *gc, const char *libxl_path, + libxl_devid devid, + libxl_device_virtio_disk *virtio_disk) +{ + const char *be_path; + int rc; + + virtio_disk->devid = devid; + rc = libxl__xs_read_mandatory(gc, XBT_NULL, + GCSPRINTF("%s/backend", libxl_path), + &be_path); + if (rc) return rc; + + rc = libxl__backendpath_parse_domid(gc, be_path, &virtio_disk->backend_domid); + if (rc) return rc; + + return 0; +} + +static void libxl__update_config_virtio_disk(libxl__gc *gc, + libxl_device_virtio_disk *dst, + libxl_device_virtio_disk *src) +{ + dst->devid = src->devid; +} + +static int libxl_device_virtio_disk_compare(libxl_device_virtio_disk *d1, + libxl_device_virtio_disk *d2) +{ + return COMPARE_DEVID(d1, d2); +} + +static void libxl__device_virtio_disk_add(libxl__egc *egc, uint32_t domid, + libxl_device_virtio_disk *virtio_disk, + libxl__ao_device *aodev) +{ + libxl__device_add_async(egc, domid, &libxl__virtio_disk_devtype, virtio_disk, aodev); +} + +static int libxl__set_xenstore_virtio_disk(libxl__gc *gc, uint32_t domid, + libxl_device_virtio_disk *virtio_disk, + flexarray_t *back, flexarray_t *front, + flexarray_t *ro_front) +{ + int rc; + unsigned int i; + + for (i = 0; i < virtio_disk->num_disks; i++) { + rc = flexarray_append_pair(ro_front, GCSPRINTF("%d/filename", i), + GCSPRINTF("%s", virtio_disk->disks[i].filename)); + if (rc) return rc; + + rc = flexarray_append_pair(ro_front, GCSPRINTF("%d/readonly", i), + GCSPRINTF("%d", virtio_disk->disks[i].readonly)); + if (rc) return rc; + + rc = flexarray_append_pair(ro_front, GCSPRINTF("%d/base", i), + GCSPRINTF("%lu", virtio_disk->disks[i].base)); + if (rc) return rc; + + rc = flexarray_append_pair(ro_front, GCSPRINTF("%d/irq", i), + GCSPRINTF("%u", virtio_disk->disks[i].irq)); + if (rc) return rc; + } + + return 0; +} + +static LIBXL_DEFINE_UPDATE_DEVID(virtio_disk) +static LIBXL_DEFINE_DEVICE_FROM_TYPE(virtio_disk) +static LIBXL_DEFINE_DEVICES_ADD(virtio_disk) + +DEFINE_DEVICE_TYPE_STRUCT(virtio_disk, VIRTIO_DISK, virtio_disks, + .update_config = (device_update_config_fn_t) libxl__update_config_virtio_disk, + .from_xenstore = (device_from_xenstore_fn_t) libxl__virtio_disk_from_xenstore, + .set_xenstore_config = (device_set_xenstore_config_fn_t) libxl__set_xenstore_virtio_disk +); + +/* + * Local variables: + * mode: C + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */ diff --git a/tools/xl/Makefile b/tools/xl/Makefile index bdf67c8..9d8f2aa 100644 --- a/tools/xl/Makefile +++ b/tools/xl/Makefile @@ -23,7 +23,7 @@ XL_OBJS += xl_vtpm.o xl_block.o xl_nic.o xl_usb.o XL_OBJS += xl_sched.o xl_pci.o xl_vcpu.o xl_cdrom.o xl_mem.o XL_OBJS += xl_info.o xl_console.o xl_misc.o XL_OBJS += xl_vmcontrol.o xl_saverestore.o xl_migrate.o -XL_OBJS += xl_vdispl.o xl_vsnd.o xl_vkb.o +XL_OBJS += xl_vdispl.o xl_vsnd.o xl_vkb.o xl_virtio_disk.o $(XL_OBJS): CFLAGS += $(CFLAGS_libxentoollog) $(XL_OBJS): CFLAGS += $(CFLAGS_XL) diff --git a/tools/xl/xl.h b/tools/xl/xl.h index 06569c6..3d26f19 100644 --- a/tools/xl/xl.h +++ b/tools/xl/xl.h @@ -178,6 +178,9 @@ int main_vsnddetach(int argc, char **argv); int main_vkbattach(int argc, char **argv); int main_vkblist(int argc, char **argv); int main_vkbdetach(int argc, char **argv); +int main_virtio_diskattach(int argc, char **argv); +int main_virtio_disklist(int argc, char **argv); +int main_virtio_diskdetach(int argc, char **argv); int main_usbctrl_attach(int argc, char **argv); int main_usbctrl_detach(int argc, char **argv); int main_usbdev_attach(int argc, char **argv); diff --git a/tools/xl/xl_cmdtable.c b/tools/xl/xl_cmdtable.c index 6ab5e47..696b190 100644 --- a/tools/xl/xl_cmdtable.c +++ b/tools/xl/xl_cmdtable.c @@ -435,6 +435,21 @@ struct cmd_spec cmd_table[] = { "Destroy a domain's virtual sound device", "<Domain> <DevId>", }, + { "virtio-disk-attach", + &main_virtio_diskattach, 1, 1, + "Create a new virtio block device", + " TBD\n" + }, + { "virtio-disk-list", + &main_virtio_disklist, 0, 0, + "List virtio block devices for a domain", + "<Domain(s)>", + }, + { "virtio-disk-detach", + &main_virtio_diskdetach, 0, 1, + "Destroy a domain's virtio block device", + "<Domain> <DevId>", + }, { "uptime", &main_uptime, 0, 0, "Print uptime for all/some domains", diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c index 2a3364b..054a0c9 100644 --- a/tools/xl/xl_parse.c +++ b/tools/xl/xl_parse.c @@ -1204,6 +1204,120 @@ out: if (rc) exit(EXIT_FAILURE); } +#define MAX_VIRTIO_DISKS 4 + +static int parse_virtio_disk_config(libxl_device_virtio_disk *virtio_disk, char *token) +{ + char *oparg; + libxl_string_list disks = NULL; + int i, rc; + + if (MATCH_OPTION("backend", token, oparg)) { + virtio_disk->backend_domname = strdup(oparg); + } else if (MATCH_OPTION("disks", token, oparg)) { + split_string_into_string_list(oparg, ";", &disks); + + virtio_disk->num_disks = libxl_string_list_length(&disks); + if (virtio_disk->num_disks > MAX_VIRTIO_DISKS) { + fprintf(stderr, "vdisk: currently only %d disks are supported", + MAX_VIRTIO_DISKS); + return 1; + } + virtio_disk->disks = xcalloc(virtio_disk->num_disks, + sizeof(*virtio_disk->disks)); + + for(i = 0; i < virtio_disk->num_disks; i++) { + char *disk_opt; + + rc = split_string_into_pair(disks[i], ":", &disk_opt, + &virtio_disk->disks[i].filename); + if (rc) { + fprintf(stderr, "vdisk: failed to split \"%s\" into pair\n", + disks[i]); + goto out; + } + + if (!strcmp(disk_opt, "ro")) + virtio_disk->disks[i].readonly = 1; + else if (!strcmp(disk_opt, "rw")) + virtio_disk->disks[i].readonly = 0; + else { + fprintf(stderr, "vdisk: failed to parse \"%s\" disk option\n", + disk_opt); + rc = 1; + } + free(disk_opt); + + if (rc) goto out; + } + } else { + fprintf(stderr, "Unknown string \"%s\" in vdisk spec\n", token); + rc = 1; goto out; + } + + rc = 0; + +out: + libxl_string_list_dispose(&disks); + return rc; +} + +static void parse_virtio_disk_list(const XLU_Config *config, + libxl_domain_config *d_config) +{ + XLU_ConfigList *virtio_disks; + const char *item; + char *buf = NULL; + int rc; + + if (!xlu_cfg_get_list (config, "vdisk", &virtio_disks, 0, 0)) { + libxl_domain_build_info *b_info = &d_config->b_info; + int entry = 0; + + /* XXX Remove an extra property */ + libxl_defbool_setdefault(&b_info->arch_arm.virtio, false); + if (!libxl_defbool_val(b_info->arch_arm.virtio)) { + fprintf(stderr, "Virtio device requires Virtio property to be set\n"); + exit(EXIT_FAILURE); + } + + while ((item = xlu_cfg_get_listitem(virtio_disks, entry)) != NULL) { + libxl_device_virtio_disk *virtio_disk; + char *p; + + virtio_disk = ARRAY_EXTEND_INIT(d_config->virtio_disks, + d_config->num_virtio_disks, + libxl_device_virtio_disk_init); + + buf = strdup(item); + + p = strtok (buf, ","); + while (p != NULL) + { + while (*p == ' ') p++; + + rc = parse_virtio_disk_config(virtio_disk, p); + if (rc) goto out; + + p = strtok (NULL, ","); + } + + entry++; + + if (virtio_disk->num_disks == 0) { + fprintf(stderr, "At least one virtio disk should be specified\n"); + rc = 1; goto out; + } + } + } + + rc = 0; + +out: + free(buf); + if (rc) exit(EXIT_FAILURE); +} + void parse_config_data(const char *config_source, const char *config_data, int config_len, @@ -2734,6 +2848,7 @@ skip_usbdev: } parse_vkb_list(config, d_config); + parse_virtio_disk_list(config, d_config); xlu_cfg_get_defbool(config, "xend_suspend_evtchn_compat", &c_info->xend_suspend_evtchn_compat, 0); diff --git a/tools/xl/xl_virtio_disk.c b/tools/xl/xl_virtio_disk.c new file mode 100644 index 0000000..808a7da --- /dev/null +++ b/tools/xl/xl_virtio_disk.c @@ -0,0 +1,46 @@ +/* + * Copyright (C) 2020 EPAM Systems Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU Lesser General Public License as published + * by the Free Software Foundation; version 2.1 only. with the special + * exception on linking described in file LICENSE. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU Lesser General Public License for more details. + */ + +#include <stdlib.h> + +#include <libxl.h> +#include <libxl_utils.h> +#include <libxlutil.h> + +#include "xl.h" +#include "xl_utils.h" +#include "xl_parse.h" + +int main_virtio_diskattach(int argc, char **argv) +{ + return 0; +} + +int main_virtio_disklist(int argc, char **argv) +{ + return 0; +} + +int main_virtio_diskdetach(int argc, char **argv) +{ + return 0; +} + +/* + * Local variables: + * mode: C + * c-basic-offset: 4 + * indent-tabs-mode: nil + * End: + */