Message ID: 20200724025744.69644-10-david@gibson.dropbear.id.au
State: New, archived
Series: Generalize memory encryption models
* David Gibson (david@gibson.dropbear.id.au) wrote:
> The default behaviour for virtio devices is not to use the platform's normal DMA paths, but instead to use the fact that it's running in a hypervisor to directly access guest memory. That doesn't work if the guest's memory is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
>
> So, if a host trust limitation mechanism is enabled, then apply the iommu_platform=on option so it will go through normal DMA mechanisms. Those will presumably have some way of marking memory as shared with the hypervisor or hardware so that DMA will work.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Good, it's just too easy to forget them at the moment and get hopelessly confused.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  hw/core/machine.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index b599b0ba65..2a723bf07b 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -28,6 +28,8 @@
>  #include "hw/mem/nvdimm.h"
>  #include "migration/vmstate.h"
>  #include "exec/host-trust-limitation.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-pci.h"
>
>  GlobalProperty hw_compat_5_0[] = {
>      { "virtio-balloon-device", "page-poison", "false" },
> @@ -1161,6 +1163,15 @@ void machine_run_board_init(MachineState *machine)
>           * areas.
>           */
>          machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> +
> +        /*
> +         * Virtio devices can't count on directly accessing guest
> +         * memory, so they need iommu_platform=on to use normal DMA
> +         * mechanisms. That requires disabling legacy virtio support
> +         * for virtio pci devices
> +         */
> +        object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
> +        object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
>      }
>
>      machine_class->init(machine);
> --
> 2.26.2
>
-- 
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Mon, 27 Jul 2020 16:05:14 +0100 "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:

> * David Gibson (david@gibson.dropbear.id.au) wrote:
> > The default behaviour for virtio devices is not to use the platform's normal DMA paths, but instead to use the fact that it's running in a hypervisor to directly access guest memory. That doesn't work if the guest's memory is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
> >
> > So, if a host trust limitation mechanism is enabled, then apply the iommu_platform=on option so it will go through normal DMA mechanisms. Those will presumably have some way of marking memory as shared with the hypervisor or hardware so that DMA will work.
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>
> Good, it's just too easy to forget them at the moment and get hopelessly confused.
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
> > ---
> >  hw/core/machine.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index b599b0ba65..2a723bf07b 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -28,6 +28,8 @@
> >  #include "hw/mem/nvdimm.h"
> >  #include "migration/vmstate.h"
> >  #include "exec/host-trust-limitation.h"
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-pci.h"
> >
> >  GlobalProperty hw_compat_5_0[] = {
> >      { "virtio-balloon-device", "page-poison", "false" },
> > @@ -1161,6 +1163,15 @@ void machine_run_board_init(MachineState *machine)
> >           * areas.
> >           */
> >          machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> > +
> > +        /*
> > +         * Virtio devices can't count on directly accessing guest
> > +         * memory, so they need iommu_platform=on to use normal DMA
> > +         * mechanisms. That requires disabling legacy virtio support
> > +         * for virtio pci devices
> > +         */
> > +        object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");

What about non-transitional devices (e.g. vhost-user-fs-pci)? They don't know about "disable-legacy" since they don't need it.

> > +        object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
> >      }
> >
> >      machine_class->init(machine);
> > --
> > 2.26.2
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On Thu, 13 Aug 2020 09:43:56 +0200 Greg Kurz <groug@kaod.org> wrote:

> On Mon, 27 Jul 2020 16:05:14 +0100 "Dr. David Alan Gilbert" <dgilbert@redhat.com> wrote:
>
> > * David Gibson (david@gibson.dropbear.id.au) wrote:
> > > The default behaviour for virtio devices is not to use the platform's normal DMA paths, but instead to use the fact that it's running in a hypervisor to directly access guest memory. That doesn't work if the guest's memory is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
> > >
> > > So, if a host trust limitation mechanism is enabled, then apply the iommu_platform=on option so it will go through normal DMA mechanisms. Those will presumably have some way of marking memory as shared with the hypervisor or hardware so that DMA will work.
> > >
> > > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >
> > Good, it's just too easy to forget them at the moment and get hopelessly confused.
> >
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >
> > > ---
> > >  hw/core/machine.c | 11 +++++++++++
> > >  1 file changed, 11 insertions(+)
> > >
> > > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > > index b599b0ba65..2a723bf07b 100644
> > > --- a/hw/core/machine.c
> > > +++ b/hw/core/machine.c
> > > @@ -28,6 +28,8 @@
> > >  #include "hw/mem/nvdimm.h"
> > >  #include "migration/vmstate.h"
> > >  #include "exec/host-trust-limitation.h"
> > > +#include "hw/virtio/virtio.h"
> > > +#include "hw/virtio/virtio-pci.h"
> > >
> > >  GlobalProperty hw_compat_5_0[] = {
> > >      { "virtio-balloon-device", "page-poison", "false" },
> > > @@ -1161,6 +1163,15 @@ void machine_run_board_init(MachineState *machine)
> > >           * areas.
> > >           */
> > >          machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> > > +
> > > +        /*
> > > +         * Virtio devices can't count on directly accessing guest
> > > +         * memory, so they need iommu_platform=on to use normal DMA
> > > +         * mechanisms. That requires disabling legacy virtio support
> > > +         * for virtio pci devices
> > > +         */
> > > +        object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
>
> What about non-transitional devices (e.g. vhost-user-fs-pci)? They don't know about "disable-legacy" since they don't need it.
>

Ok, it looks like we should add a bool argument to object_register_sugar_prop() that sets the .optional field of GlobalProperty.

> > > +        object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
> > >      }
> > >
> > >      machine_class->init(machine);
> > > --
> > > 2.26.2
> >
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
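For illustration, that suggestion might look roughly like the sketch below. It is not the code that was merged; the name of the internal property list (sugar_props) and the final function signature are assumptions, while the GlobalProperty fields come from Greg's remark above.

#include "qemu/osdep.h"
#include "hw/qdev-core.h"   /* GlobalProperty, including its .optional field */

static GPtrArray *sugar_props;   /* assumed name of the internal list */

void object_register_sugar_prop(const char *driver, const char *prop,
                                const char *value, bool optional)
{
    GlobalProperty *g;

    if (!sugar_props) {
        sugar_props = g_ptr_array_new();
    }
    g = g_new0(GlobalProperty, 1);
    g->driver = g_strdup(driver);
    g->property = g_strdup(prop);
    g->value = g_strdup(value);
    /* optional properties are skipped on devices that don't have them */
    g->optional = optional;
    g_ptr_array_add(sugar_props, g);
}

The machine code would then pass true only where the property may be absent, e.g. object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on", true) for non-transitional virtio-pci devices, and false for "iommu_platform", which every virtio device has.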
On Fri, 24 Jul 2020 12:57:43 +1000 David Gibson <david@gibson.dropbear.id.au> wrote:

> The default behaviour for virtio devices is not to use the platform's normal DMA paths, but instead to use the fact that it's running in a hypervisor to directly access guest memory. That doesn't work if the guest's memory is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
>
> So, if a host trust limitation mechanism is enabled, then apply the iommu_platform=on option so it will go through normal DMA mechanisms. Those will presumably have some way of marking memory as shared with the hypervisor or hardware so that DMA will work.

Sorry for being this late. I had to do some high-priority debugging, which made me drop everything else, and after that I had some vacation.

I have some questions about the bigger picture. The promised benefit of this patch for users that invoke QEMU manually is relatively clear: it alters the default value of some virtio properties, so that using the defaults does not result in a bogus configuration.

This comes at a price. I used to think of device property default values like this: if I don't specify it and I use the default machine, I will effectively get the default value of the property (as reported by qemu -device dev-name,?). If I use a compat machine, then I will get the compatibility default value, i.e. what is reported as the default value if I invoke the binary whose default machine is my compat machine.

With this patch, that reasoning is not valid any more. Did we do something like this before, or is this the first time we introduce this complication?

In any case, I suppose, this change needs a documentation update, which I could not find in the series.

How are things supposed to pan out when QEMU is used with management software?

I was told that libvirt's policy is to be explicit and not let QEMU use defaults. But this policy does not seem to apply to iommu_platform -- at least not on s390x. Why is this? Is this likely to change in the future?

Furthermore, the libvirt documentation is IMHO not that great when it comes to iommu_platform. All I've found is

"""
Virtio-related options

QEMU's virtio devices have some attributes related to the virtio transport under the driver element: The iommu attribute enables the use of emulated IOMMU by the device.
"""

which:
* Is not explicit about the default, but suggests that default is off (because it needs to be enabled), which would reflect the current state of affairs (without this patch).
* Makes me wonder, to what extent does the libvirt concept correspond to the virtio semantics of _F_ACCESS_PLATFORM. I.e. we don't really do any IOMMU emulation with virtio-ccw.

I guess host trust limitation is something that is to be expressed in libvirt, or? Do we have a design for that?

I was also reflecting on how this patch compares to on/off/auto, but this email is already too long, so I decided to keep my thoughts to myself -- for now.

Regards,
Halil

>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  hw/core/machine.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/hw/core/machine.c b/hw/core/machine.c
> index b599b0ba65..2a723bf07b 100644
> --- a/hw/core/machine.c
> +++ b/hw/core/machine.c
> @@ -28,6 +28,8 @@
>  #include "hw/mem/nvdimm.h"
>  #include "migration/vmstate.h"
>  #include "exec/host-trust-limitation.h"
> +#include "hw/virtio/virtio.h"
> +#include "hw/virtio/virtio-pci.h"
>
>  GlobalProperty hw_compat_5_0[] = {
>      { "virtio-balloon-device", "page-poison", "false" },
> @@ -1161,6 +1163,15 @@ void machine_run_board_init(MachineState *machine)
>           * areas.
>           */
>          machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> +
> +        /*
> +         * Virtio devices can't count on directly accessing guest
> +         * memory, so they need iommu_platform=on to use normal DMA
> +         * mechanisms. That requires disabling legacy virtio support
> +         * for virtio pci devices
> +         */
> +        object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
> +        object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
>      }
>
>      machine_class->init(machine);
On Mon, Sep 07, 2020 at 05:10:46PM +0200, Halil Pasic wrote:
> On Fri, 24 Jul 2020 12:57:43 +1000 David Gibson <david@gibson.dropbear.id.au> wrote:
>
> > The default behaviour for virtio devices is not to use the platform's normal DMA paths, but instead to use the fact that it's running in a hypervisor to directly access guest memory. That doesn't work if the guest's memory is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
> >
> > So, if a host trust limitation mechanism is enabled, then apply the iommu_platform=on option so it will go through normal DMA mechanisms. Those will presumably have some way of marking memory as shared with the hypervisor or hardware so that DMA will work.
>
> Sorry for being this late. I had to do some high-priority debugging, which made me drop everything else, and after that I had some vacation.
>
> I have some questions about the bigger picture. The promised benefit of this patch for users that invoke QEMU manually is relatively clear: it alters the default value of some virtio properties, so that using the defaults does not result in a bogus configuration.

Right.

> This comes at a price. I used to think of device property default values like this: if I don't specify it and I use the default machine, I will effectively get the default value of the property (as reported by qemu -device dev-name,?). If I use a compat machine, then I will get the compatibility default value, i.e. what is reported as the default value if I invoke the binary whose default machine is my compat machine.

Hm, ok. My mental model has always been that defaults were essentially per-machine-type. Which, I grant you, isn't really consistent with the existence of the -device dev,? probing. On the other hand, it's possible for a machine/platform to impose restrictions on almost any property of almost any device, and it would suck for the user to have to know all of them just in order to start things up with default options.

Given that model, extending that to per-machine-variant, including machine options like htl, seemed natural.

> With this patch, that reasoning is not valid any more. Did we do something like this before, or is this the first time we introduce this complication?

I don't know offhand if we have per-machine differences for certain options in practice, or only in theory.

> In any case, I suppose, this change needs a documentation update, which I could not find in the series.

Uh.. fair enough.. I just need to figure out where.

> How are things supposed to pan out when QEMU is used with management software?
>
> I was told that libvirt's policy is to be explicit and not let QEMU use defaults. But this policy does not seem to apply to iommu_platform -- at least not on s390x. Why is this? Is this likely to change in the future?

Ugh.. so. That policy of libvirt's is very double-edged. It's there because it allows libvirt to create consistent machines that can be migrated properly and so forth. However, it basically locks libvirt into having to know about every option of qemu, ever. Unsurprisingly there are some gaps, hence things like this.

Unfortunately that can't be fixed without substantially redesigning libvirt in a way that can't maintain compatibility.

> Furthermore, the libvirt documentation is IMHO not that great when it comes to iommu_platform. All I've found is
>
> """
> Virtio-related options
>
> QEMU's virtio devices have some attributes related to the virtio transport under the driver element: The iommu attribute enables the use of emulated IOMMU by the device.
> """
>
> which:
> * Is not explicit about the default, but suggests that default is off (because it needs to be enabled), which would reflect the current state of affairs (without this patch).
> * Makes me wonder, to what extent does the libvirt concept correspond to the virtio semantics of _F_ACCESS_PLATFORM. I.e. we don't really do any IOMMU emulation with virtio-ccw.
>
> I guess host trust limitation is something that is to be expressed in libvirt, or? Do we have a design for that?

Yeah, I guess we'd need to. See "having to know about every option" above :/. No, I haven't thought about a design for that.

> I was also reflecting on how this patch compares to on/off/auto, but this email is already too long, so I decided to keep my thoughts to myself -- for now.

on/off/auto works for your case on s390, but I don't think it works for POWER, though I forget the details, so maybe I'm wrong about that.

> Regards,
> Halil
>
> >
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  hw/core/machine.c | 11 +++++++++++
> >  1 file changed, 11 insertions(+)
> >
> > diff --git a/hw/core/machine.c b/hw/core/machine.c
> > index b599b0ba65..2a723bf07b 100644
> > --- a/hw/core/machine.c
> > +++ b/hw/core/machine.c
> > @@ -28,6 +28,8 @@
> >  #include "hw/mem/nvdimm.h"
> >  #include "migration/vmstate.h"
> >  #include "exec/host-trust-limitation.h"
> > +#include "hw/virtio/virtio.h"
> > +#include "hw/virtio/virtio-pci.h"
> >
> >  GlobalProperty hw_compat_5_0[] = {
> >      { "virtio-balloon-device", "page-poison", "false" },
> > @@ -1161,6 +1163,15 @@ void machine_run_board_init(MachineState *machine)
> >           * areas.
> >           */
> >          machine_set_mem_merge(OBJECT(machine), false, &error_abort);
> > +
> > +        /*
> > +         * Virtio devices can't count on directly accessing guest
> > +         * memory, so they need iommu_platform=on to use normal DMA
> > +         * mechanisms. That requires disabling legacy virtio support
> > +         * for virtio pci devices
> > +         */
> > +        object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
> > +        object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
> >      }
> >
> >      machine_class->init(machine);
On Fri, 11 Sep 2020 12:04:42 +1000 David Gibson <david@gibson.dropbear.id.au> wrote:

> On Mon, Sep 07, 2020 at 05:10:46PM +0200, Halil Pasic wrote:
> > On Fri, 24 Jul 2020 12:57:43 +1000 David Gibson <david@gibson.dropbear.id.au> wrote:
> >
> > > The default behaviour for virtio devices is not to use the platform's normal DMA paths, but instead to use the fact that it's running in a hypervisor to directly access guest memory. That doesn't work if the guest's memory is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.
> > >
> > > So, if a host trust limitation mechanism is enabled, then apply the iommu_platform=on option so it will go through normal DMA mechanisms. Those will presumably have some way of marking memory as shared with the hypervisor or hardware so that DMA will work.
> >
> > Sorry for being this late. I had to do some high-priority debugging, which made me drop everything else, and after that I had some vacation.
> >
> > I have some questions about the bigger picture. The promised benefit of this patch for users that invoke QEMU manually is relatively clear: it alters the default value of some virtio properties, so that using the defaults does not result in a bogus configuration.
>
> Right.
>
> > This comes at a price. I used to think of device property default values like this: if I don't specify it and I use the default machine, I will effectively get the default value of the property (as reported by qemu -device dev-name,?). If I use a compat machine, then I will get the compatibility default value, i.e. what is reported as the default value if I invoke the binary whose default machine is my compat machine.
>
> Hm, ok. My mental model has always been that defaults were essentially per-machine-type. Which, I grant you, isn't really consistent with the existence of the -device dev,? probing. On the other hand, it's possible for a machine/platform to impose restrictions on almost any property of almost any device, and it would suck for the user to have to know all of them just in order to start things up with default options.
>
> Given that model, extending that to per-machine-variant, including machine options like htl, seemed natural.

Yesterday I got some more education on the matter of libvirt from Boris. I think we now both agree that this complicates the mental model, but that it simplifies usage and is worth it.

> > With this patch, that reasoning is not valid any more. Did we do something like this before, or is this the first time we introduce this complication?
>
> I don't know offhand if we have per-machine differences for certain options in practice, or only in theory.
>
> > In any case, I suppose, this change needs a documentation update, which I could not find in the series.
>
> Uh.. fair enough.. I just need to figure out where.
>
> > How are things supposed to pan out when QEMU is used with management software?
> >
> > I was told that libvirt's policy is to be explicit and not let QEMU use defaults. But this policy does not seem to apply to iommu_platform -- at least not on s390x. Why is this? Is this likely to change in the future?
>
> Ugh.. so. That policy of libvirt's is very double-edged. It's there because it allows libvirt to create consistent machines that can be migrated properly and so forth. However, it basically locks libvirt into having to know about every option of qemu, ever. Unsurprisingly there are some gaps, hence things like this.
>
> Unfortunately that can't be fixed without substantially redesigning libvirt in a way that can't maintain compatibility.

Boris told me that virtio properties are a notable exception from the aforementioned libvirt rule. He showed me the code, and libvirt seems to be intentionally non-explicit on iommu and other virtio properties, in the sense that if nothing is specified in the XML, nothing is specified on the generated command line, and we get the default. Unfortunately, we didn't manage to figure out the reason.

> > Furthermore, the libvirt documentation is IMHO not that great when it comes to iommu_platform. All I've found is
> >
> > """
> > Virtio-related options
> >
> > QEMU's virtio devices have some attributes related to the virtio transport under the driver element: The iommu attribute enables the use of emulated IOMMU by the device.
> > """
> >
> > which:
> > * Is not explicit about the default, but suggests that default is off (because it needs to be enabled), which would reflect the current state of affairs (without this patch).
> > * Makes me wonder, to what extent does the libvirt concept correspond to the virtio semantics of _F_ACCESS_PLATFORM. I.e. we don't really do any IOMMU emulation with virtio-ccw.
> >
> > I guess host trust limitation is something that is to be expressed in libvirt, or? Do we have a design for that?
>
> Yeah, I guess we'd need to. See "having to know about every option" above :/. No, I haven't thought about a design for that.
>
> > I was also reflecting on how this patch compares to on/off/auto, but this email is already too long, so I decided to keep my thoughts to myself -- for now.
>
> on/off/auto works for your case on s390, but I don't think it works for POWER, though I forget the details, so maybe I'm wrong about that.

If management software/libvirt will let us use the defaults, I see this patch as a big usability improvement. Thus

Reviewed-by: Halil Pasic <pasic@linux.ibm.com>

I believe on/off/auto could work for POWER as well, just not the way I intended it to work for s390, which is basically changing the value as we transition the plain VM to a PV VM.

I see a potential semantic benefit to on/off/auto, compared to some machine properties affecting the default value of some device properties. The semantics I would like to have for on/off/auto seem to be a bit different from the semantics the community seems to associate with auto right now. What I would like to have is a 'be as intelligent about it as you can' type of 'auto'. Let me bring some examples from outside QEMU. For instance, the intel_gpu_frequency tool lets one lock the GPU frequency to certain values (http://manpages.ubuntu.com/manpages/xenial/man1/intel_gpu_frequency.1.html). I guess there are scenarios where that can be useful. But for normal usage, I guess for both CPU and GPU frequency the user does not want a rigid value, but expects the system to balance the trade-offs and tweak the frequency according to the current situation. So in my book, auto would allow libvirt and the user to be explicit about not wanting to control that aspect of the VM, and allow QEMU to step beyond 'sane defaults'.

The current semantics of auto seem to be: if you specify auto you may end up with something different than if you use the default, possibly based on some environment considerations, but 'auto' seems to be resolved to 'on' or 'off' once, and (it seems preferably) at initialization time.

Given all that, I'm pretty happy with your current solution. Doing on/off/auto and resolving 'auto' based on host_trust_limitation_enabled() would IMHO not be much different from your current solution, so I see no big benefit to that. Maybe a tad cleaner UI, but more code, and a more complicated initialization interdependency (we would need to have host_trust_limitation_enabled() armed when we resolve iommu_platform; I don't know if that's natural).

Regards,
Halil
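As an illustration of the resolve-once-at-initialization on/off/auto variant discussed above, the logic might look roughly like the sketch below. The OnOffAuto-typed iommu_platform property and the exact host_trust_limitation_enabled() signature are assumptions based on this thread, not code from the series.

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qapi/qapi-types-common.h"        /* OnOffAuto */
#include "hw/boards.h"
#include "hw/qdev-core.h"
#include "exec/host-trust-limitation.h"    /* from this series */

/*
 * Sketch only: assumes a hypothetical OnOffAuto "iommu_platform" device
 * property instead of the current bool, resolved once at realize time.
 */
static void resolve_iommu_platform(OnOffAuto *iommu_platform, Error **errp)
{
    MachineState *ms = MACHINE(qdev_get_machine());
    bool htl = host_trust_limitation_enabled(ms);   /* assumed signature */

    if (*iommu_platform == ON_OFF_AUTO_AUTO) {
        /* "auto" follows the machine's host trust limitation setting */
        *iommu_platform = htl ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
    } else if (htl && *iommu_platform == ON_OFF_AUTO_OFF) {
        /* an explicit "off" cannot work when guest memory is protected */
        error_setg(errp, "iommu_platform=off is incompatible with "
                   "host trust limitation");
    }
}

As the thread notes, this would require host_trust_limitation_enabled() to already give the right answer by the time device properties are resolved, which is the initialization interdependency mentioned above.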
diff --git a/hw/core/machine.c b/hw/core/machine.c
index b599b0ba65..2a723bf07b 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -28,6 +28,8 @@
 #include "hw/mem/nvdimm.h"
 #include "migration/vmstate.h"
 #include "exec/host-trust-limitation.h"
+#include "hw/virtio/virtio.h"
+#include "hw/virtio/virtio-pci.h"
 
 GlobalProperty hw_compat_5_0[] = {
     { "virtio-balloon-device", "page-poison", "false" },
@@ -1161,6 +1163,15 @@ void machine_run_board_init(MachineState *machine)
          * areas.
          */
         machine_set_mem_merge(OBJECT(machine), false, &error_abort);
+
+        /*
+         * Virtio devices can't count on directly accessing guest
+         * memory, so they need iommu_platform=on to use normal DMA
+         * mechanisms. That requires disabling legacy virtio support
+         * for virtio pci devices
+         */
+        object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
+        object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
     }
 
     machine_class->init(machine);
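The hunk above does not show the condition it sits under. Based on the rest of the series and the mem-merge context lines, the surrounding code in machine_run_board_init() presumably looks roughly like the sketch below; the guard is an assumption, not part of this hunk.

void machine_run_board_init(MachineState *machine)
{
    MachineClass *machine_class = MACHINE_GET_CLASS(machine);

    /* ... */

    if (host_trust_limitation_enabled(machine)) {    /* assumed guard */
        /*
         * The host can't see protected guest memory, so merging
         * identical pages is pointless (the "areas" comment above).
         */
        machine_set_mem_merge(OBJECT(machine), false, &error_abort);

        /*
         * Virtio devices can't count on directly accessing guest
         * memory, so they need iommu_platform=on, which in turn
         * requires disabling legacy virtio for virtio-pci devices.
         */
        object_register_sugar_prop(TYPE_VIRTIO_PCI, "disable-legacy", "on");
        object_register_sugar_prop(TYPE_VIRTIO_DEVICE, "iommu_platform", "on");
    }

    machine_class->init(machine);
}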
The default behaviour for virtio devices is not to use the platform's normal DMA paths, but instead to use the fact that it's running in a hypervisor to directly access guest memory. That doesn't work if the guest's memory is protected from hypervisor access, such as with AMD's SEV or POWER's PEF.

So, if a host trust limitation mechanism is enabled, then apply the iommu_platform=on option so it will go through normal DMA mechanisms. Those will presumably have some way of marking memory as shared with the hypervisor or hardware so that DMA will work.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/core/machine.c | 11 +++++++++++
 1 file changed, 11 insertions(+)