Message ID | 20240614103924.1420121-1-berrange@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | i386: revert defaults to 'legacy-vm-type=true' for SEV(-ES) guests | expand |
Ping, any comments for this ? On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote: > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will > only have been released for a bit over a month when QEMU 9.1 is > released. > > The SEV(-ES) support in QEMU has been present since 2.12 dating back > to 2018. With this in mind, the overwhealming majority of users of > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the > forseeable future. > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU > machine types will be broken out of the box for most SEV(-ES) users. > Even if the kernel is new enough, it also affects the guest measurement, > which means that their existing tools for validating measurements will > also be broken by the new default. > > This is not a sensible default choice at this point in time. Revert to > the historical behaviour which is compatible with what most users are > currently running. > > This can be re-evaluated a few years down the line, though it is more > likely that all attention will be on SEV-SNP by this time. Distro > vendors may still choose to change this default downstream to align > with their new major releases where they can guarantee the kernel > will always provide the required functionality. > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> > --- > hw/i386/pc.c | 1 - > qapi/qom.json | 12 ++++++------ > target/i386/sev.c | 7 +++++++ > 3 files changed, 13 insertions(+), 7 deletions(-) > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > index 0469af00a7..b65843c559 100644 > --- a/hw/i386/pc.c > +++ b/hw/i386/pc.c > @@ -82,7 +82,6 @@ > GlobalProperty pc_compat_9_0[] = { > { TYPE_X86_CPU, "x-l1-cache-per-thread", "false" }, > { TYPE_X86_CPU, "guest-phys-bits", "0" }, > - { "sev-guest", "legacy-vm-type", "true" }, > { TYPE_X86_CPU, "legacy-multi-node", "on" }, > }; > const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0); > diff --git a/qapi/qom.json b/qapi/qom.json > index 8bd299265e..714ebeec8b 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -912,12 +912,12 @@ > # @handle: SEV firmware handle (default: 0) > # > # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM. > -# The newer KVM_SEV_INIT2 interface syncs additional vCPU > -# state when initializing the VMSA structures, which will > -# result in a different guest measurement. Set this to > -# maintain compatibility with older QEMU or kernel versions > -# that rely on legacy KVM_SEV_INIT behavior. > -# (default: false) (since 9.1) > +# The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs > +# additional vCPU state when initializing the VMSA structures, > +# which will result in a different guest measurement. Toggle > +# this to control compatibility with older QEMU or kernel > +# versions that rely on legacy KVM_SEV_INIT behavior. > +# (default: true) (since 9.1) > # > # Since: 2.12 > ## > diff --git a/target/i386/sev.c b/target/i386/sev.c > index 004c667ac1..16029282b7 100644 > --- a/target/i386/sev.c > +++ b/target/i386/sev.c > @@ -2086,6 +2086,13 @@ sev_guest_instance_init(Object *obj) > object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy, > OBJ_PROP_FLAG_READWRITE); > object_apply_compat_props(obj); > + > + /* > + * KVM_SEV_INIT2 was only introduced in Linux 6.10. Avoid > + * breaking existing users of SEV, since the overwhealming > + * majority won't have a new enough kernel for a long time > + */ > + sev_guest->legacy_vm_type = true; > } > > /* guest info specific sev/sev-es */ > -- > 2.45.1 > With regards, Daniel
On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote: > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will > only have been released for a bit over a month when QEMU 9.1 is > released. > > The SEV(-ES) support in QEMU has been present since 2.12 dating back > to 2018. With this in mind, the overwhealming majority of users of > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the > forseeable future. > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU > machine types will be broken out of the box for most SEV(-ES) users. > Even if the kernel is new enough, it also affects the guest measurement, > which means that their existing tools for validating measurements will > also be broken by the new default. > > This is not a sensible default choice at this point in time. Revert to > the historical behaviour which is compatible with what most users are > currently running. > > This can be re-evaluated a few years down the line, though it is more > likely that all attention will be on SEV-SNP by this time. Distro > vendors may still choose to change this default downstream to align > with their new major releases where they can guarantee the kernel > will always provide the required functionality. > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> This makes sense superficially, so Acked-by: Michael S. Tsirkin <mst@redhat.com> and I'll let kvm maintainers merge this. However I wonder, wouldn't it be better to refactor this: if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) { cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT; ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error); } else { struct kvm_sev_init args = { 0 }; ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error); } to something like: if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) != KVM_X86_DEFAULT_VM) { struct kvm_sev_init args = { 0 }; ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error); if (ret && errno == ENOTTY) { cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT; ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error); } } Yes I realize this means measurement will then depend on the host but it seems nicer than failing guest start, no? > --- > hw/i386/pc.c | 1 - > qapi/qom.json | 12 ++++++------ > target/i386/sev.c | 7 +++++++ > 3 files changed, 13 insertions(+), 7 deletions(-) > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > index 0469af00a7..b65843c559 100644 > --- a/hw/i386/pc.c > +++ b/hw/i386/pc.c > @@ -82,7 +82,6 @@ > GlobalProperty pc_compat_9_0[] = { > { TYPE_X86_CPU, "x-l1-cache-per-thread", "false" }, > { TYPE_X86_CPU, "guest-phys-bits", "0" }, > - { "sev-guest", "legacy-vm-type", "true" }, > { TYPE_X86_CPU, "legacy-multi-node", "on" }, > }; > const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0); > diff --git a/qapi/qom.json b/qapi/qom.json > index 8bd299265e..714ebeec8b 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -912,12 +912,12 @@ > # @handle: SEV firmware handle (default: 0) > # > # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM. > -# The newer KVM_SEV_INIT2 interface syncs additional vCPU > -# state when initializing the VMSA structures, which will > -# result in a different guest measurement. Set this to > -# maintain compatibility with older QEMU or kernel versions > -# that rely on legacy KVM_SEV_INIT behavior. > -# (default: false) (since 9.1) > +# The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs > +# additional vCPU state when initializing the VMSA structures, > +# which will result in a different guest measurement. Toggle > +# this to control compatibility with older QEMU or kernel > +# versions that rely on legacy KVM_SEV_INIT behavior. > +# (default: true) (since 9.1) > # > # Since: 2.12 > ## > diff --git a/target/i386/sev.c b/target/i386/sev.c > index 004c667ac1..16029282b7 100644 > --- a/target/i386/sev.c > +++ b/target/i386/sev.c > @@ -2086,6 +2086,13 @@ sev_guest_instance_init(Object *obj) > object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy, > OBJ_PROP_FLAG_READWRITE); > object_apply_compat_props(obj); > + > + /* > + * KVM_SEV_INIT2 was only introduced in Linux 6.10. Avoid > + * breaking existing users of SEV, since the overwhealming > + * majority won't have a new enough kernel for a long time > + */ > + sev_guest->legacy_vm_type = true; > } > > /* guest info specific sev/sev-es */ > -- > 2.45.1
On Mon, Jun 24, 2024 at 08:27:01AM -0400, Michael S. Tsirkin wrote: > On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote: > > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will > > only have been released for a bit over a month when QEMU 9.1 is > > released. > > > > The SEV(-ES) support in QEMU has been present since 2.12 dating back > > to 2018. With this in mind, the overwhealming majority of users of > > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the > > forseeable future. > > > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU > > machine types will be broken out of the box for most SEV(-ES) users. > > Even if the kernel is new enough, it also affects the guest measurement, > > which means that their existing tools for validating measurements will > > also be broken by the new default. > > > > This is not a sensible default choice at this point in time. Revert to > > the historical behaviour which is compatible with what most users are > > currently running. > > > > This can be re-evaluated a few years down the line, though it is more > > likely that all attention will be on SEV-SNP by this time. Distro > > vendors may still choose to change this default downstream to align > > with their new major releases where they can guarantee the kernel > > will always provide the required functionality. > > > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> > > This makes sense superficially, so > > Acked-by: Michael S. Tsirkin <mst@redhat.com> > > and I'll let kvm maintainers merge this. > > However I wonder, wouldn't it be better to refactor this: > > if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) { > cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT; > > ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error); > } else { > struct kvm_sev_init args = { 0 }; > > ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error); > } > > to something like: > > if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) != KVM_X86_DEFAULT_VM) { > struct kvm_sev_init args = { 0 }; > > ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error); > if (ret && errno == ENOTTY) { > cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT; > > ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error); > } > } > > > Yes I realize this means measurement will then depend on the host > but it seems nicer than failing guest start, no? IMHO having an invariant measurement for a given guest configuration is a critical guarantee. We should not be allowing guest attestation to break as a side-effect of upgrading a software component, while keeping the guest config unchanged. IOW, I'd view measurement as being "guest ABI", and versioned machine types are there to provide invariant guest ABI. Personally, if we want simplicitly then just not using KVM_SEV_INIT2 at all would be the easiest option. SEV/SEV-ES are legacy technology at this point, so we could be justified in leaving it unchanged and only focusing on SEV-SNP. Unless someone can say what the critical *must have* benefit of using KVM_SEV_INIT2 is ? With regards, Daniel
On Mon, Jun 24, 2024 at 01:38:56PM +0100, Daniel P. Berrangé wrote: > On Mon, Jun 24, 2024 at 08:27:01AM -0400, Michael S. Tsirkin wrote: > > On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote: > > > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will > > > only have been released for a bit over a month when QEMU 9.1 is > > > released. > > > > > > The SEV(-ES) support in QEMU has been present since 2.12 dating back > > > to 2018. With this in mind, the overwhealming majority of users of > > > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the > > > forseeable future. > > > > > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU > > > machine types will be broken out of the box for most SEV(-ES) users. > > > Even if the kernel is new enough, it also affects the guest measurement, > > > which means that their existing tools for validating measurements will > > > also be broken by the new default. > > > > > > This is not a sensible default choice at this point in time. Revert to > > > the historical behaviour which is compatible with what most users are > > > currently running. > > > > > > This can be re-evaluated a few years down the line, though it is more > > > likely that all attention will be on SEV-SNP by this time. Distro > > > vendors may still choose to change this default downstream to align > > > with their new major releases where they can guarantee the kernel > > > will always provide the required functionality. > > > > > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> > > > > This makes sense superficially, so > > > > Acked-by: Michael S. Tsirkin <mst@redhat.com> > > > > and I'll let kvm maintainers merge this. > > > > However I wonder, wouldn't it be better to refactor this: > > > > if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) { > > cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT; > > > > ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error); > > } else { > > struct kvm_sev_init args = { 0 }; > > > > ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error); > > } > > > > to something like: > > > > if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) != KVM_X86_DEFAULT_VM) { > > struct kvm_sev_init args = { 0 }; > > > > ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error); > > if (ret && errno == ENOTTY) { > > cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT; > > > > ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error); > > } > > } > > > > > > Yes I realize this means measurement will then depend on the host > > but it seems nicer than failing guest start, no? > > IMHO having an invariant measurement for a given guest configuration > is a critical guarantee. We should not be allowing guest attestation > to break as a side-effect of upgrading a software component, while > keeping the guest config unchanged. Well attenstation can change for a variety of reasons involving software upgrades: host or guest. It is up to user to either trust both old and new attestion, or pick one. Seems better than forcing policy host side. > IOW, I'd view measurement as being "guest ABI", and versioned machine > types are there to provide invariant guest ABI. In practice we can't always do this exactly: e.g. vhost has a rich feature mask and what we do is clear features not supported by a specific host kernel. Similarly for vhost-user where the ABI depends on an external component. So things can change if you move across host kernels. > Personally, if we want simplicitly then just not using KVM_SEV_INIT2 > at all would be the easiest option. SEV/SEV-ES are legacy technology > at this point, so we could be justified in leaving it unchanged and > only focusing on SEV-SNP. Unless someone can say what the critical > *must have* benefit of using KVM_SEV_INIT2 is ? No objection. > With regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote: > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will > only have been released for a bit over a month when QEMU 9.1 is > released. > > The SEV(-ES) support in QEMU has been present since 2.12 dating back > to 2018. With this in mind, the overwhealming majority of users of > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the > forseeable future. > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU > machine types will be broken out of the box for most SEV(-ES) users. > Even if the kernel is new enough, it also affects the guest measurement, > which means that their existing tools for validating measurements will > also be broken by the new default. > > This is not a sensible default choice at this point in time. Revert to > the historical behaviour which is compatible with what most users are > currently running. Part of the reason for the change is that SEV-ES measurements are already affected by some short-comings of the legacy KVM_SEV_ES_INIT API. Namely, if the kvm_amd.debug-swap module param is used to enable that SEV-ES feature, then that feature will get enabled on the KVM side and change the initial guest measurement (due to VMSA_FEATURES field of the vCPU's VMSA changing), and userspace has no way to control that on a per-VM basis, so measurement for any particular invocation will be somewhat random depending on the system configuration and kernel level. I think that's why users of newer QEMU machine types are highly encouraged to switch to the new KVM_SEV_INIT2 interface. I do see this causing issues for older QEMU machine types that previously relied on the legacy interface, since we do want to avoid measurement changing for an existing guest that was previously working on an older kernel, which is why this flag defaults to true for pre-9.1 machine types. But on newer kernels there is still potential for issues relating to debug-swap (and other VMSA features that get added to KVM in the future) and how they may cause measurement changes underneath the covers if we don't allow userspace the ability to control what is/isn't disabled. Because of that I think it's less headache for userspace to have to opt-in to legacy interface when using newer machine models. It should be a concious decision to keep using this deprecated interface with known limitations that could affect measurement in unexpected ways. I was actually planning to go the other direction on this because currently for 9.1+, QEMU will try to use KVM_SEV_INIT2 if KVM_CAP_VM_TYPES advertises its availability, but otherwise fall back to the above KVM_SEV_ES_INIT interface and potential inherit the issues noted above. So I was planning on getting rid of the fallback, and basically only allowing legacy KVM_SEV_ES_INIT for 9.1+ if the user manually sets sev_guest->legacy_vm_type via cmdline. -Mike > > This can be re-evaluated a few years down the line, though it is more > likely that all attention will be on SEV-SNP by this time. Distro > vendors may still choose to change this default downstream to align > with their new major releases where they can guarantee the kernel > will always provide the required functionality. > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> > --- > hw/i386/pc.c | 1 - > qapi/qom.json | 12 ++++++------ > target/i386/sev.c | 7 +++++++ > 3 files changed, 13 insertions(+), 7 deletions(-) > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c > index 0469af00a7..b65843c559 100644 > --- a/hw/i386/pc.c > +++ b/hw/i386/pc.c > @@ -82,7 +82,6 @@ > GlobalProperty pc_compat_9_0[] = { > { TYPE_X86_CPU, "x-l1-cache-per-thread", "false" }, > { TYPE_X86_CPU, "guest-phys-bits", "0" }, > - { "sev-guest", "legacy-vm-type", "true" }, > { TYPE_X86_CPU, "legacy-multi-node", "on" }, > }; > const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0); > diff --git a/qapi/qom.json b/qapi/qom.json > index 8bd299265e..714ebeec8b 100644 > --- a/qapi/qom.json > +++ b/qapi/qom.json > @@ -912,12 +912,12 @@ > # @handle: SEV firmware handle (default: 0) > # > # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM. > -# The newer KVM_SEV_INIT2 interface syncs additional vCPU > -# state when initializing the VMSA structures, which will > -# result in a different guest measurement. Set this to > -# maintain compatibility with older QEMU or kernel versions > -# that rely on legacy KVM_SEV_INIT behavior. > -# (default: false) (since 9.1) > +# The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs > +# additional vCPU state when initializing the VMSA structures, > +# which will result in a different guest measurement. Toggle > +# this to control compatibility with older QEMU or kernel > +# versions that rely on legacy KVM_SEV_INIT behavior. > +# (default: true) (since 9.1) > # > # Since: 2.12 > ## > diff --git a/target/i386/sev.c b/target/i386/sev.c > index 004c667ac1..16029282b7 100644 > --- a/target/i386/sev.c > +++ b/target/i386/sev.c > @@ -2086,6 +2086,13 @@ sev_guest_instance_init(Object *obj) > object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy, > OBJ_PROP_FLAG_READWRITE); > object_apply_compat_props(obj); > + > + /* > + * KVM_SEV_INIT2 was only introduced in Linux 6.10. Avoid > + * breaking existing users of SEV, since the overwhealming > + * majority won't have a new enough kernel for a long time > + */ > + sev_guest->legacy_vm_type = true; > } > > /* guest info specific sev/sev-es */ > -- > 2.45.1 >
On Mon, Jun 24, 2024 at 08:19:19PM -0500, Michael Roth wrote: > On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote: > > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will > > only have been released for a bit over a month when QEMU 9.1 is > > released. > > > > The SEV(-ES) support in QEMU has been present since 2.12 dating back > > to 2018. With this in mind, the overwhealming majority of users of > > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the > > forseeable future. > > > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU > > machine types will be broken out of the box for most SEV(-ES) users. > > Even if the kernel is new enough, it also affects the guest measurement, > > which means that their existing tools for validating measurements will > > also be broken by the new default. > > > > This is not a sensible default choice at this point in time. Revert to > > the historical behaviour which is compatible with what most users are > > currently running. > > Part of the reason for the change is that SEV-ES measurements are > already affected by some short-comings of the legacy KVM_SEV_ES_INIT > API. Namely, if the kvm_amd.debug-swap module param is used to enable > that SEV-ES feature, then that feature will get enabled on the KVM side > and change the initial guest measurement (due to VMSA_FEATURES field > of the vCPU's VMSA changing), and userspace has no way to control that > on a per-VM basis, so measurement for any particular invocation will > be somewhat random depending on the system configuration and kernel > level. The debug-swap feature was set to disabled by default. So that could be just a docs problem to say if you want to use that feature, then you must set the legacy-vm=false property. IOW an opt-in to incompatible behaviour. > I think that's why users of newer QEMU machine types are highly > encouraged to switch to the new KVM_SEV_INIT2 interface. I do see this > causing issues for older QEMU machine types that previously relied on > the legacy interface, since we do want to avoid measurement changing > for an existing guest that was previously working on an older kernel, > which is why this flag defaults to true for pre-9.1 machine types. This justification mis-understands how machine types are actually used in practice though. There is *zero* correlation between use of the new machine types, and availabilty of the new kernel interface. 99% of usage of QEMU, will just ask for the unversioned "q35" / "pc" machines. They will be expanded to the very latest machine type version, either internally by QEMU, or by libvirt prior to launching the VM. Either way, you can expect essentially everything to be running on the latest machine type versions, regardless of kernel version. So making the latest machine types dependent on a kernel version that is brand new is just not a sensible default. Latest QEMU machines types need to work on kernel releases years old, without expecting the user to set magic flags to avoid incompatibility. > I was actually planning to go the other direction on this because > currently for 9.1+, QEMU will try to use KVM_SEV_INIT2 if > KVM_CAP_VM_TYPES advertises its availability, but otherwise fall back to > the above KVM_SEV_ES_INIT interface and potential inherit the issues > noted above. So I was planning on getting rid of the fallback, and > basically only allowing legacy KVM_SEV_ES_INIT for 9.1+ if the user > manually sets sev_guest->legacy_vm_type via cmdline. Dynamic detection of SEV_INIT vs SEV-INIT2 is a bad idea as that breaks migration when someone is moving from a host with new kernel to an older kernel, while keeping the QEMU machine type unchanged. The behaviour of what kernel feature to use must be controllable with an explicit choice. With regards, Daniel
On Tue, Jun 25, 2024 at 10:51:43AM +0100, Daniel P. Berrangé wrote: > On Mon, Jun 24, 2024 at 08:19:19PM -0500, Michael Roth wrote: > > On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote: > > > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will > > > only have been released for a bit over a month when QEMU 9.1 is > > > released. > > > > > > The SEV(-ES) support in QEMU has been present since 2.12 dating back > > > to 2018. With this in mind, the overwhealming majority of users of > > > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the > > > forseeable future. > > > > > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU > > > machine types will be broken out of the box for most SEV(-ES) users. > > > Even if the kernel is new enough, it also affects the guest measurement, > > > which means that their existing tools for validating measurements will > > > also be broken by the new default. > > > > > > This is not a sensible default choice at this point in time. Revert to > > > the historical behaviour which is compatible with what most users are > > > currently running. > > > > Part of the reason for the change is that SEV-ES measurements are > > already affected by some short-comings of the legacy KVM_SEV_ES_INIT > > API. Namely, if the kvm_amd.debug-swap module param is used to enable > > that SEV-ES feature, then that feature will get enabled on the KVM side > > and change the initial guest measurement (due to VMSA_FEATURES field > > of the vCPU's VMSA changing), and userspace has no way to control that > > on a per-VM basis, so measurement for any particular invocation will > > be somewhat random depending on the system configuration and kernel > > level. > > The debug-swap feature was set to disabled by default. So that > could be just a docs problem to say if you want to use that > feature, then you must set the legacy-vm=false property. IOW > an opt-in to incompatible behaviour. debug-swap defaulted to true for KVM_SEV*_INIT guests unfortunately, so the ship sailed on preparing users for the change in advance and instead over time legacy guests users will gradually see the measurement change when they upgrade to new kernels and then need to take steps to either adjust their measurement calculation or disable debug-swap via module parameters. debug-swap is fairly recent as well however, so there's a fair chance users hitting the above issue will have the option of switching over to KVM_SEV_INIT2 where it's not much additional work to update measurements, and in turn they'll benefit from better control over what ends up in the VMSA as well. If they do plan to eventually switch to SNP, these steps will bring them closer toward that end as well since there's a lot of common handling/infrastructure in that regard. > > > > I think that's why users of newer QEMU machine types are highly > > encouraged to switch to the new KVM_SEV_INIT2 interface. I do see this > > causing issues for older QEMU machine types that previously relied on > > the legacy interface, since we do want to avoid measurement changing > > for an existing guest that was previously working on an older kernel, > > which is why this flag defaults to true for pre-9.1 machine types. > > This justification mis-understands how machine types are actually > used in practice though. There is *zero* correlation between use > of the new machine types, and availabilty of the new kernel > interface. > > 99% of usage of QEMU, will just ask for the unversioned "q35" > / "pc" machines. They will be expanded to the very latest machine > type version, either internally by QEMU, or by libvirt prior to > launching the VM. In my experience that's how many VMs start off until they start breaking on newer kernels/QEMUs, then everyone scrambles to revert to the old behavior. Quite often that ends up involving just tacking on an explicit machine-type to maintain migration/behavioral compatibility with what QEMU originally defaulted to when the VM was created. But when first creating the VM, there is less expectation about what should/shouldn't work. If they see failures because KVM_SEV_INIT2 isn't available, it seems worthwhile that they need to make a decision on whether to upgrade kernel or adopt the legacy behavior and be stuck on a reduced featureset for the life of the guest. "Just works" is nice, but "just working" in the case of KVM_SEV*_INIT comes with potential headaches down the road and ideally users would be aware of what they are signing up for. If failing is too heavy-handed, maybe some type of warning that gets printed by QEMU any time KVM_SEV*_INIT set? Then maybe down the road if we decide to finally default KVM_SEV_INIT2, there's a better chance that users have taken the hint and have already made the transiton? I'll also defer to the maintainers on this point though since there are clearly merits to both approaches. > > Either way, you can expect essentially everything to be running on > the latest machine type versions, regardless of kernel version. > > So making the latest machine types dependent on a kernel version > that is brand new is just not a sensible default. Latest QEMU > machines types need to work on kernel releases years old, without > expecting the user to set magic flags to avoid incompatibility. > > > I was actually planning to go the other direction on this because > > currently for 9.1+, QEMU will try to use KVM_SEV_INIT2 if > > KVM_CAP_VM_TYPES advertises its availability, but otherwise fall back to > > the above KVM_SEV_ES_INIT interface and potential inherit the issues > > noted above. So I was planning on getting rid of the fallback, and > > basically only allowing legacy KVM_SEV_ES_INIT for 9.1+ if the user > > manually sets sev_guest->legacy_vm_type via cmdline. > > Dynamic detection of SEV_INIT vs SEV-INIT2 is a bad idea as that > breaks migration when someone is moving from a host with new > kernel to an older kernel, while keeping the QEMU machine type > unchanged. The behaviour of what kernel feature to use must be > controllable with an explicit choice. Totally agreed on this point. I've sent a patch that does this, and adopted the QAPI wording you used in this patch so there is less churn if they are both applied: https://lore.kernel.org/kvm/20240704000019.3928862-1-michael.roth@amd.com/T/#u Thanks, Mike > > > With regards, > Daniel > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :| > >
diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 0469af00a7..b65843c559 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -82,7 +82,6 @@ GlobalProperty pc_compat_9_0[] = { { TYPE_X86_CPU, "x-l1-cache-per-thread", "false" }, { TYPE_X86_CPU, "guest-phys-bits", "0" }, - { "sev-guest", "legacy-vm-type", "true" }, { TYPE_X86_CPU, "legacy-multi-node", "on" }, }; const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0); diff --git a/qapi/qom.json b/qapi/qom.json index 8bd299265e..714ebeec8b 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -912,12 +912,12 @@ # @handle: SEV firmware handle (default: 0) # # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM. -# The newer KVM_SEV_INIT2 interface syncs additional vCPU -# state when initializing the VMSA structures, which will -# result in a different guest measurement. Set this to -# maintain compatibility with older QEMU or kernel versions -# that rely on legacy KVM_SEV_INIT behavior. -# (default: false) (since 9.1) +# The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs +# additional vCPU state when initializing the VMSA structures, +# which will result in a different guest measurement. Toggle +# this to control compatibility with older QEMU or kernel +# versions that rely on legacy KVM_SEV_INIT behavior. +# (default: true) (since 9.1) # # Since: 2.12 ## diff --git a/target/i386/sev.c b/target/i386/sev.c index 004c667ac1..16029282b7 100644 --- a/target/i386/sev.c +++ b/target/i386/sev.c @@ -2086,6 +2086,13 @@ sev_guest_instance_init(Object *obj) object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy, OBJ_PROP_FLAG_READWRITE); object_apply_compat_props(obj); + + /* + * KVM_SEV_INIT2 was only introduced in Linux 6.10. Avoid + * breaking existing users of SEV, since the overwhealming + * majority won't have a new enough kernel for a long time + */ + sev_guest->legacy_vm_type = true; } /* guest info specific sev/sev-es */
The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will only have been released for a bit over a month when QEMU 9.1 is released. The SEV(-ES) support in QEMU has been present since 2.12 dating back to 2018. With this in mind, the overwhealming majority of users of SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the forseeable future. IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU machine types will be broken out of the box for most SEV(-ES) users. Even if the kernel is new enough, it also affects the guest measurement, which means that their existing tools for validating measurements will also be broken by the new default. This is not a sensible default choice at this point in time. Revert to the historical behaviour which is compatible with what most users are currently running. This can be re-evaluated a few years down the line, though it is more likely that all attention will be on SEV-SNP by this time. Distro vendors may still choose to change this default downstream to align with their new major releases where they can guarantee the kernel will always provide the required functionality. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- hw/i386/pc.c | 1 - qapi/qom.json | 12 ++++++------ target/i386/sev.c | 7 +++++++ 3 files changed, 13 insertions(+), 7 deletions(-)