diff mbox series

i386: revert defaults to 'legacy-vm-type=true' for SEV(-ES) guests

Message ID 20240614103924.1420121-1-berrange@redhat.com (mailing list archive)
State New, archived
Headers show
Series i386: revert defaults to 'legacy-vm-type=true' for SEV(-ES) guests | expand

Commit Message

Daniel P. Berrangé June 14, 2024, 10:39 a.m. UTC
The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will
only have been released for a bit over a month when QEMU 9.1 is
released.

The SEV(-ES) support in QEMU has been present since 2.12 dating back
to 2018. With this in mind, the overwhealming majority of users of
SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the
forseeable future.

IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU
machine types will be broken out of the box for most SEV(-ES) users.
Even if the kernel is new enough, it also affects the guest measurement,
which means that their existing tools for validating measurements will
also be broken by the new default.

This is not a sensible default choice at this point in time. Revert to
the historical behaviour which is compatible with what most users are
currently running.

This can be re-evaluated a few years down the line, though it is more
likely that all attention will be on SEV-SNP by this time. Distro
vendors may still choose to change this default downstream to align
with their new major releases where they can guarantee the kernel
will always provide the required functionality.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 hw/i386/pc.c      |  1 -
 qapi/qom.json     | 12 ++++++------
 target/i386/sev.c |  7 +++++++
 3 files changed, 13 insertions(+), 7 deletions(-)

Comments

Daniel P. Berrangé June 24, 2024, 11:43 a.m. UTC | #1
Ping, any comments for this ?

On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote:
> The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will
> only have been released for a bit over a month when QEMU 9.1 is
> released.
> 
> The SEV(-ES) support in QEMU has been present since 2.12 dating back
> to 2018. With this in mind, the overwhealming majority of users of
> SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the
> forseeable future.
> 
> IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU
> machine types will be broken out of the box for most SEV(-ES) users.
> Even if the kernel is new enough, it also affects the guest measurement,
> which means that their existing tools for validating measurements will
> also be broken by the new default.
> 
> This is not a sensible default choice at this point in time. Revert to
> the historical behaviour which is compatible with what most users are
> currently running.
> 
> This can be re-evaluated a few years down the line, though it is more
> likely that all attention will be on SEV-SNP by this time. Distro
> vendors may still choose to change this default downstream to align
> with their new major releases where they can guarantee the kernel
> will always provide the required functionality.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  hw/i386/pc.c      |  1 -
>  qapi/qom.json     | 12 ++++++------
>  target/i386/sev.c |  7 +++++++
>  3 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 0469af00a7..b65843c559 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -82,7 +82,6 @@
>  GlobalProperty pc_compat_9_0[] = {
>      { TYPE_X86_CPU, "x-l1-cache-per-thread", "false" },
>      { TYPE_X86_CPU, "guest-phys-bits", "0" },
> -    { "sev-guest", "legacy-vm-type", "true" },
>      { TYPE_X86_CPU, "legacy-multi-node", "on" },
>  };
>  const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0);
> diff --git a/qapi/qom.json b/qapi/qom.json
> index 8bd299265e..714ebeec8b 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -912,12 +912,12 @@
>  # @handle: SEV firmware handle (default: 0)
>  #
>  # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM.
> -#                  The newer KVM_SEV_INIT2 interface syncs additional vCPU
> -#                  state when initializing the VMSA structures, which will
> -#                  result in a different guest measurement. Set this to
> -#                  maintain compatibility with older QEMU or kernel versions
> -#                  that rely on legacy KVM_SEV_INIT behavior.
> -#                  (default: false) (since 9.1)
> +#                  The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs
> +#                  additional vCPU state when initializing the VMSA structures,
> +#                  which will result in a different guest measurement. Toggle
> +#                  this to control compatibility with older QEMU or kernel
> +#                  versions that rely on legacy KVM_SEV_INIT behavior.
> +#                  (default: true) (since 9.1)
>  #
>  # Since: 2.12
>  ##
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 004c667ac1..16029282b7 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -2086,6 +2086,13 @@ sev_guest_instance_init(Object *obj)
>      object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy,
>                                     OBJ_PROP_FLAG_READWRITE);
>      object_apply_compat_props(obj);
> +
> +    /*
> +     * KVM_SEV_INIT2 was only introduced in Linux 6.10. Avoid
> +     * breaking existing users of SEV, since the overwhealming
> +     * majority won't have a new enough kernel for a long time
> +     */
> +    sev_guest->legacy_vm_type = true;
>  }
>  
>  /* guest info specific sev/sev-es */
> -- 
> 2.45.1
> 

With regards,
Daniel
Michael S. Tsirkin June 24, 2024, 12:27 p.m. UTC | #2
On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote:
> The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will
> only have been released for a bit over a month when QEMU 9.1 is
> released.
> 
> The SEV(-ES) support in QEMU has been present since 2.12 dating back
> to 2018. With this in mind, the overwhealming majority of users of
> SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the
> forseeable future.
> 
> IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU
> machine types will be broken out of the box for most SEV(-ES) users.
> Even if the kernel is new enough, it also affects the guest measurement,
> which means that their existing tools for validating measurements will
> also be broken by the new default.
> 
> This is not a sensible default choice at this point in time. Revert to
> the historical behaviour which is compatible with what most users are
> currently running.
> 
> This can be re-evaluated a few years down the line, though it is more
> likely that all attention will be on SEV-SNP by this time. Distro
> vendors may still choose to change this default downstream to align
> with their new major releases where they can guarantee the kernel
> will always provide the required functionality.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

This makes sense superficially, so

Acked-by: Michael S. Tsirkin <mst@redhat.com>

and I'll let kvm maintainers merge this.

However I wonder, wouldn't it be better to refactor this:

    if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) {
        cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;
        
        ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error);
    } else {
        struct kvm_sev_init args = { 0 };
                
        ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error);
    }   

to something like:

if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) != KVM_X86_DEFAULT_VM) {
        struct kvm_sev_init args = { 0 };
                
        ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error);
	if (ret && errno == ENOTTY) {
		cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;

		ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error);
	}
}


Yes I realize this means measurement will then depend on the host
but it seems nicer than failing guest start, no?




> ---
>  hw/i386/pc.c      |  1 -
>  qapi/qom.json     | 12 ++++++------
>  target/i386/sev.c |  7 +++++++
>  3 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 0469af00a7..b65843c559 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -82,7 +82,6 @@
>  GlobalProperty pc_compat_9_0[] = {
>      { TYPE_X86_CPU, "x-l1-cache-per-thread", "false" },
>      { TYPE_X86_CPU, "guest-phys-bits", "0" },
> -    { "sev-guest", "legacy-vm-type", "true" },
>      { TYPE_X86_CPU, "legacy-multi-node", "on" },
>  };
>  const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0);
> diff --git a/qapi/qom.json b/qapi/qom.json
> index 8bd299265e..714ebeec8b 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -912,12 +912,12 @@
>  # @handle: SEV firmware handle (default: 0)
>  #
>  # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM.
> -#                  The newer KVM_SEV_INIT2 interface syncs additional vCPU
> -#                  state when initializing the VMSA structures, which will
> -#                  result in a different guest measurement. Set this to
> -#                  maintain compatibility with older QEMU or kernel versions
> -#                  that rely on legacy KVM_SEV_INIT behavior.
> -#                  (default: false) (since 9.1)
> +#                  The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs
> +#                  additional vCPU state when initializing the VMSA structures,
> +#                  which will result in a different guest measurement. Toggle
> +#                  this to control compatibility with older QEMU or kernel
> +#                  versions that rely on legacy KVM_SEV_INIT behavior.
> +#                  (default: true) (since 9.1)
>  #
>  # Since: 2.12
>  ##
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 004c667ac1..16029282b7 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -2086,6 +2086,13 @@ sev_guest_instance_init(Object *obj)
>      object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy,
>                                     OBJ_PROP_FLAG_READWRITE);
>      object_apply_compat_props(obj);
> +
> +    /*
> +     * KVM_SEV_INIT2 was only introduced in Linux 6.10. Avoid
> +     * breaking existing users of SEV, since the overwhealming
> +     * majority won't have a new enough kernel for a long time
> +     */
> +    sev_guest->legacy_vm_type = true;
>  }
>  
>  /* guest info specific sev/sev-es */
> -- 
> 2.45.1
Daniel P. Berrangé June 24, 2024, 12:38 p.m. UTC | #3
On Mon, Jun 24, 2024 at 08:27:01AM -0400, Michael S. Tsirkin wrote:
> On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote:
> > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will
> > only have been released for a bit over a month when QEMU 9.1 is
> > released.
> > 
> > The SEV(-ES) support in QEMU has been present since 2.12 dating back
> > to 2018. With this in mind, the overwhealming majority of users of
> > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the
> > forseeable future.
> > 
> > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU
> > machine types will be broken out of the box for most SEV(-ES) users.
> > Even if the kernel is new enough, it also affects the guest measurement,
> > which means that their existing tools for validating measurements will
> > also be broken by the new default.
> > 
> > This is not a sensible default choice at this point in time. Revert to
> > the historical behaviour which is compatible with what most users are
> > currently running.
> > 
> > This can be re-evaluated a few years down the line, though it is more
> > likely that all attention will be on SEV-SNP by this time. Distro
> > vendors may still choose to change this default downstream to align
> > with their new major releases where they can guarantee the kernel
> > will always provide the required functionality.
> > 
> > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> 
> This makes sense superficially, so
> 
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> 
> and I'll let kvm maintainers merge this.
> 
> However I wonder, wouldn't it be better to refactor this:
> 
>     if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) {
>         cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;
>         
>         ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error);
>     } else {
>         struct kvm_sev_init args = { 0 };
>                 
>         ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error);
>     }   
> 
> to something like:
> 
> if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) != KVM_X86_DEFAULT_VM) {
>         struct kvm_sev_init args = { 0 };
>                 
>         ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error);
> 	if (ret && errno == ENOTTY) {
> 		cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;
> 
> 		ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error);
> 	}
> }
> 
> 
> Yes I realize this means measurement will then depend on the host
> but it seems nicer than failing guest start, no?

IMHO having an invariant measurement for a given guest configuration
is a critical guarantee. We should not be allowing guest attestation
to break as a side-effect of upgrading a software component, while
keeping the guest config unchanged.

IOW, I'd view measurement as being "guest ABI", and versioned machine
types are there to provide invariant guest ABI.

Personally, if we want simplicitly then just not using KVM_SEV_INIT2
at all would be the easiest option. SEV/SEV-ES are legacy technology
at this point, so we could be justified in leaving it unchanged and
only focusing on SEV-SNP. Unless someone can say what the critical
*must have* benefit of using KVM_SEV_INIT2 is ?

With regards,
Daniel
Michael S. Tsirkin June 24, 2024, 1:14 p.m. UTC | #4
On Mon, Jun 24, 2024 at 01:38:56PM +0100, Daniel P. Berrangé wrote:
> On Mon, Jun 24, 2024 at 08:27:01AM -0400, Michael S. Tsirkin wrote:
> > On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote:
> > > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will
> > > only have been released for a bit over a month when QEMU 9.1 is
> > > released.
> > > 
> > > The SEV(-ES) support in QEMU has been present since 2.12 dating back
> > > to 2018. With this in mind, the overwhealming majority of users of
> > > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the
> > > forseeable future.
> > > 
> > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU
> > > machine types will be broken out of the box for most SEV(-ES) users.
> > > Even if the kernel is new enough, it also affects the guest measurement,
> > > which means that their existing tools for validating measurements will
> > > also be broken by the new default.
> > > 
> > > This is not a sensible default choice at this point in time. Revert to
> > > the historical behaviour which is compatible with what most users are
> > > currently running.
> > > 
> > > This can be re-evaluated a few years down the line, though it is more
> > > likely that all attention will be on SEV-SNP by this time. Distro
> > > vendors may still choose to change this default downstream to align
> > > with their new major releases where they can guarantee the kernel
> > > will always provide the required functionality.
> > > 
> > > Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> > 
> > This makes sense superficially, so
> > 
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > 
> > and I'll let kvm maintainers merge this.
> > 
> > However I wonder, wouldn't it be better to refactor this:
> > 
> >     if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) == KVM_X86_DEFAULT_VM) {
> >         cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;
> >         
> >         ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error);
> >     } else {
> >         struct kvm_sev_init args = { 0 };
> >                 
> >         ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error);
> >     }   
> > 
> > to something like:
> > 
> > if (x86_klass->kvm_type(X86_CONFIDENTIAL_GUEST(sev_common)) != KVM_X86_DEFAULT_VM) {
> >         struct kvm_sev_init args = { 0 };
> >                 
> >         ret = sev_ioctl(sev_common->sev_fd, KVM_SEV_INIT2, &args, &fw_error);
> > 	if (ret && errno == ENOTTY) {
> > 		cmd = sev_es_enabled() ? KVM_SEV_ES_INIT : KVM_SEV_INIT;
> > 
> > 		ret = sev_ioctl(sev_common->sev_fd, cmd, NULL, &fw_error);
> > 	}
> > }
> > 
> > 
> > Yes I realize this means measurement will then depend on the host
> > but it seems nicer than failing guest start, no?
> 
> IMHO having an invariant measurement for a given guest configuration
> is a critical guarantee. We should not be allowing guest attestation
> to break as a side-effect of upgrading a software component, while
> keeping the guest config unchanged.

Well attenstation can change for a variety of reasons involving software
upgrades: host or guest. It is up to user to either trust both old and
new attestion, or pick one. Seems better than forcing policy host side.

> IOW, I'd view measurement as being "guest ABI", and versioned machine
> types are there to provide invariant guest ABI.

In practice we can't always do this exactly: e.g. vhost has
a rich feature mask and what we do is clear features not
supported by a specific host kernel.

Similarly for vhost-user where the ABI depends on an
external component.

So things can change if you move across host kernels.




> Personally, if we want simplicitly then just not using KVM_SEV_INIT2
> at all would be the easiest option. SEV/SEV-ES are legacy technology
> at this point, so we could be justified in leaving it unchanged and
> only focusing on SEV-SNP. Unless someone can say what the critical
> *must have* benefit of using KVM_SEV_INIT2 is ?


No objection.

> With regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Michael Roth June 25, 2024, 1:19 a.m. UTC | #5
On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote:
> The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will
> only have been released for a bit over a month when QEMU 9.1 is
> released.
> 
> The SEV(-ES) support in QEMU has been present since 2.12 dating back
> to 2018. With this in mind, the overwhealming majority of users of
> SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the
> forseeable future.
> 
> IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU
> machine types will be broken out of the box for most SEV(-ES) users.
> Even if the kernel is new enough, it also affects the guest measurement,
> which means that their existing tools for validating measurements will
> also be broken by the new default.
> 
> This is not a sensible default choice at this point in time. Revert to
> the historical behaviour which is compatible with what most users are
> currently running.

Part of the reason for the change is that SEV-ES measurements are
already affected by some short-comings of the legacy KVM_SEV_ES_INIT
API. Namely, if the kvm_amd.debug-swap module param is used to enable
that SEV-ES feature, then that feature will get enabled on the KVM side
and change the initial guest measurement (due to VMSA_FEATURES field
of the vCPU's VMSA changing), and userspace has no way to control that
on a per-VM basis, so measurement for any particular invocation will
be somewhat random depending on the system configuration and kernel
level.

I think that's why users of newer QEMU machine types are highly
encouraged to switch to the new KVM_SEV_INIT2 interface. I do see this
causing issues for older QEMU machine types that previously relied on
the legacy interface, since we do want to avoid measurement changing
for an existing guest that was previously working on an older kernel,
which is why this flag defaults to true for pre-9.1 machine types. But
on newer kernels there is still potential for issues relating to
debug-swap (and other VMSA features that get added to KVM in the future)
and how they may cause measurement changes underneath the covers if we
don't allow userspace the ability to control what is/isn't disabled.

Because of that I think it's less headache for userspace to have to
opt-in to legacy interface when using newer machine models. It should be
a concious decision to keep using this deprecated interface with known
limitations that could affect measurement in unexpected ways.

I was actually planning to go the other direction on this because
currently for 9.1+, QEMU will try to use KVM_SEV_INIT2 if
KVM_CAP_VM_TYPES advertises its availability, but otherwise fall back to
the above KVM_SEV_ES_INIT interface and potential inherit the issues
noted above. So I was planning on getting rid of the fallback, and
basically only allowing legacy KVM_SEV_ES_INIT for 9.1+ if the user
manually sets sev_guest->legacy_vm_type via cmdline.

-Mike

> 
> This can be re-evaluated a few years down the line, though it is more
> likely that all attention will be on SEV-SNP by this time. Distro
> vendors may still choose to change this default downstream to align
> with their new major releases where they can guarantee the kernel
> will always provide the required functionality.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>  hw/i386/pc.c      |  1 -
>  qapi/qom.json     | 12 ++++++------
>  target/i386/sev.c |  7 +++++++
>  3 files changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 0469af00a7..b65843c559 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -82,7 +82,6 @@
>  GlobalProperty pc_compat_9_0[] = {
>      { TYPE_X86_CPU, "x-l1-cache-per-thread", "false" },
>      { TYPE_X86_CPU, "guest-phys-bits", "0" },
> -    { "sev-guest", "legacy-vm-type", "true" },
>      { TYPE_X86_CPU, "legacy-multi-node", "on" },
>  };
>  const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0);
> diff --git a/qapi/qom.json b/qapi/qom.json
> index 8bd299265e..714ebeec8b 100644
> --- a/qapi/qom.json
> +++ b/qapi/qom.json
> @@ -912,12 +912,12 @@
>  # @handle: SEV firmware handle (default: 0)
>  #
>  # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM.
> -#                  The newer KVM_SEV_INIT2 interface syncs additional vCPU
> -#                  state when initializing the VMSA structures, which will
> -#                  result in a different guest measurement. Set this to
> -#                  maintain compatibility with older QEMU or kernel versions
> -#                  that rely on legacy KVM_SEV_INIT behavior.
> -#                  (default: false) (since 9.1)
> +#                  The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs
> +#                  additional vCPU state when initializing the VMSA structures,
> +#                  which will result in a different guest measurement. Toggle
> +#                  this to control compatibility with older QEMU or kernel
> +#                  versions that rely on legacy KVM_SEV_INIT behavior.
> +#                  (default: true) (since 9.1)
>  #
>  # Since: 2.12
>  ##
> diff --git a/target/i386/sev.c b/target/i386/sev.c
> index 004c667ac1..16029282b7 100644
> --- a/target/i386/sev.c
> +++ b/target/i386/sev.c
> @@ -2086,6 +2086,13 @@ sev_guest_instance_init(Object *obj)
>      object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy,
>                                     OBJ_PROP_FLAG_READWRITE);
>      object_apply_compat_props(obj);
> +
> +    /*
> +     * KVM_SEV_INIT2 was only introduced in Linux 6.10. Avoid
> +     * breaking existing users of SEV, since the overwhealming
> +     * majority won't have a new enough kernel for a long time
> +     */
> +    sev_guest->legacy_vm_type = true;
>  }
>  
>  /* guest info specific sev/sev-es */
> -- 
> 2.45.1
>
Daniel P. Berrangé June 25, 2024, 9:51 a.m. UTC | #6
On Mon, Jun 24, 2024 at 08:19:19PM -0500, Michael Roth wrote:
> On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote:
> > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will
> > only have been released for a bit over a month when QEMU 9.1 is
> > released.
> > 
> > The SEV(-ES) support in QEMU has been present since 2.12 dating back
> > to 2018. With this in mind, the overwhealming majority of users of
> > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the
> > forseeable future.
> > 
> > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU
> > machine types will be broken out of the box for most SEV(-ES) users.
> > Even if the kernel is new enough, it also affects the guest measurement,
> > which means that their existing tools for validating measurements will
> > also be broken by the new default.
> > 
> > This is not a sensible default choice at this point in time. Revert to
> > the historical behaviour which is compatible with what most users are
> > currently running.
> 
> Part of the reason for the change is that SEV-ES measurements are
> already affected by some short-comings of the legacy KVM_SEV_ES_INIT
> API. Namely, if the kvm_amd.debug-swap module param is used to enable
> that SEV-ES feature, then that feature will get enabled on the KVM side
> and change the initial guest measurement (due to VMSA_FEATURES field
> of the vCPU's VMSA changing), and userspace has no way to control that
> on a per-VM basis, so measurement for any particular invocation will
> be somewhat random depending on the system configuration and kernel
> level.

The debug-swap feature was set to disabled by default. So that
could be just a docs problem to say if you want to use that
feature, then you must set the legacy-vm=false property. IOW
an opt-in to incompatible behaviour.


> I think that's why users of newer QEMU machine types are highly
> encouraged to switch to the new KVM_SEV_INIT2 interface. I do see this
> causing issues for older QEMU machine types that previously relied on
> the legacy interface, since we do want to avoid measurement changing
> for an existing guest that was previously working on an older kernel,
> which is why this flag defaults to true for pre-9.1 machine types.

This justification mis-understands how machine types are actually
used in practice though. There is *zero* correlation between use
of the new machine types, and availabilty of the new kernel
interface. 

99% of usage of QEMU, will just ask for the unversioned "q35"
/ "pc" machines. They will be expanded to the very latest machine
type version, either internally by QEMU, or by libvirt prior to
launching the VM.

Either way, you can expect essentially everything to be running on
the latest machine type versions, regardless of kernel version.

So making the latest machine types dependent on a kernel version
that is brand new is just not a sensible default. Latest QEMU
machines types need to work on kernel releases years old, without
expecting the user to set magic flags to avoid incompatibility.

> I was actually planning to go the other direction on this because
> currently for 9.1+, QEMU will try to use KVM_SEV_INIT2 if
> KVM_CAP_VM_TYPES advertises its availability, but otherwise fall back to
> the above KVM_SEV_ES_INIT interface and potential inherit the issues
> noted above. So I was planning on getting rid of the fallback, and
> basically only allowing legacy KVM_SEV_ES_INIT for 9.1+ if the user
> manually sets sev_guest->legacy_vm_type via cmdline.

Dynamic detection of SEV_INIT vs SEV-INIT2 is a bad idea as that
breaks migration when someone is moving from a host with new
kernel to an older kernel, while keeping the QEMU machine type
unchanged. The behaviour of what kernel feature to use must be
controllable with an explicit choice.


With regards,
Daniel
Michael Roth July 4, 2024, 12:06 a.m. UTC | #7
On Tue, Jun 25, 2024 at 10:51:43AM +0100, Daniel P. Berrangé wrote:
> On Mon, Jun 24, 2024 at 08:19:19PM -0500, Michael Roth wrote:
> > On Fri, Jun 14, 2024 at 11:39:24AM +0100, Daniel P. Berrangé wrote:
> > > The KVM_SEV_INIT2 ioctl was only introduced in Linux 6.10, which will
> > > only have been released for a bit over a month when QEMU 9.1 is
> > > released.
> > > 
> > > The SEV(-ES) support in QEMU has been present since 2.12 dating back
> > > to 2018. With this in mind, the overwhealming majority of users of
> > > SEV(-ES) are unlikely to be running Linux >= 6.10, any time in the
> > > forseeable future.
> > > 
> > > IOW, defaulting new QEMU to 'legacy-vm-type=false' means latest QEMU
> > > machine types will be broken out of the box for most SEV(-ES) users.
> > > Even if the kernel is new enough, it also affects the guest measurement,
> > > which means that their existing tools for validating measurements will
> > > also be broken by the new default.
> > > 
> > > This is not a sensible default choice at this point in time. Revert to
> > > the historical behaviour which is compatible with what most users are
> > > currently running.
> > 
> > Part of the reason for the change is that SEV-ES measurements are
> > already affected by some short-comings of the legacy KVM_SEV_ES_INIT
> > API. Namely, if the kvm_amd.debug-swap module param is used to enable
> > that SEV-ES feature, then that feature will get enabled on the KVM side
> > and change the initial guest measurement (due to VMSA_FEATURES field
> > of the vCPU's VMSA changing), and userspace has no way to control that
> > on a per-VM basis, so measurement for any particular invocation will
> > be somewhat random depending on the system configuration and kernel
> > level.
> 
> The debug-swap feature was set to disabled by default. So that
> could be just a docs problem to say if you want to use that
> feature, then you must set the legacy-vm=false property. IOW
> an opt-in to incompatible behaviour.

debug-swap defaulted to true for KVM_SEV*_INIT guests unfortunately, so
the ship sailed on preparing users for the change in advance and instead
over time legacy guests users will gradually see the measurement change
when they upgrade to new kernels and then need to take steps to either
adjust their measurement calculation or disable debug-swap via module
parameters.

debug-swap is fairly recent as well however, so there's a fair chance
users hitting the above issue will have the option of switching over to
KVM_SEV_INIT2 where it's not much additional work to update
measurements, and in turn they'll benefit from better control over what
ends up in the VMSA as well. If they do plan to eventually switch to SNP,
these steps will bring them closer toward that end as well since there's
a lot of common handling/infrastructure in that regard.

> 
> 
> > I think that's why users of newer QEMU machine types are highly
> > encouraged to switch to the new KVM_SEV_INIT2 interface. I do see this
> > causing issues for older QEMU machine types that previously relied on
> > the legacy interface, since we do want to avoid measurement changing
> > for an existing guest that was previously working on an older kernel,
> > which is why this flag defaults to true for pre-9.1 machine types.
> 
> This justification mis-understands how machine types are actually
> used in practice though. There is *zero* correlation between use
> of the new machine types, and availabilty of the new kernel
> interface. 
> 
> 99% of usage of QEMU, will just ask for the unversioned "q35"
> / "pc" machines. They will be expanded to the very latest machine
> type version, either internally by QEMU, or by libvirt prior to
> launching the VM.

In my experience that's how many VMs start off until they start breaking
on newer kernels/QEMUs, then everyone scrambles to revert to the old
behavior. Quite often that ends up involving just tacking on an explicit
machine-type to maintain migration/behavioral compatibility with what
QEMU originally defaulted to when the VM was created.

But when first creating the VM, there is less expectation about what
should/shouldn't work. If they see failures because KVM_SEV_INIT2 isn't
available, it seems worthwhile that they need to make a decision on
whether to upgrade kernel or adopt the legacy behavior and be stuck on
a reduced featureset for the life of the guest. "Just works" is nice,
but "just working" in the case of KVM_SEV*_INIT comes with potential
headaches down the road and ideally users would be aware of what they
are signing up for.

If failing is too heavy-handed, maybe some type of warning that gets
printed by QEMU any time KVM_SEV*_INIT set? Then maybe down the road
if we decide to finally default KVM_SEV_INIT2, there's a better chance
that users have taken the hint and have already made the transiton?

I'll also defer to the maintainers on this point though since there are
clearly merits to both approaches.

> 
> Either way, you can expect essentially everything to be running on
> the latest machine type versions, regardless of kernel version.
> 
> So making the latest machine types dependent on a kernel version
> that is brand new is just not a sensible default. Latest QEMU
> machines types need to work on kernel releases years old, without
> expecting the user to set magic flags to avoid incompatibility.
> 
> > I was actually planning to go the other direction on this because
> > currently for 9.1+, QEMU will try to use KVM_SEV_INIT2 if
> > KVM_CAP_VM_TYPES advertises its availability, but otherwise fall back to
> > the above KVM_SEV_ES_INIT interface and potential inherit the issues
> > noted above. So I was planning on getting rid of the fallback, and
> > basically only allowing legacy KVM_SEV_ES_INIT for 9.1+ if the user
> > manually sets sev_guest->legacy_vm_type via cmdline.
> 
> Dynamic detection of SEV_INIT vs SEV-INIT2 is a bad idea as that
> breaks migration when someone is moving from a host with new
> kernel to an older kernel, while keeping the QEMU machine type
> unchanged. The behaviour of what kernel feature to use must be
> controllable with an explicit choice.

Totally agreed on this point. I've sent a patch that does this, and
adopted the QAPI wording you used in this patch so there is less churn
if they are both applied:

  https://lore.kernel.org/kvm/20240704000019.3928862-1-michael.roth@amd.com/T/#u

Thanks,

Mike

> 
> 
> With regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
> 
>
diff mbox series

Patch

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0469af00a7..b65843c559 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -82,7 +82,6 @@ 
 GlobalProperty pc_compat_9_0[] = {
     { TYPE_X86_CPU, "x-l1-cache-per-thread", "false" },
     { TYPE_X86_CPU, "guest-phys-bits", "0" },
-    { "sev-guest", "legacy-vm-type", "true" },
     { TYPE_X86_CPU, "legacy-multi-node", "on" },
 };
 const size_t pc_compat_9_0_len = G_N_ELEMENTS(pc_compat_9_0);
diff --git a/qapi/qom.json b/qapi/qom.json
index 8bd299265e..714ebeec8b 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -912,12 +912,12 @@ 
 # @handle: SEV firmware handle (default: 0)
 #
 # @legacy-vm-type: Use legacy KVM_SEV_INIT KVM interface for creating the VM.
-#                  The newer KVM_SEV_INIT2 interface syncs additional vCPU
-#                  state when initializing the VMSA structures, which will
-#                  result in a different guest measurement. Set this to
-#                  maintain compatibility with older QEMU or kernel versions
-#                  that rely on legacy KVM_SEV_INIT behavior.
-#                  (default: false) (since 9.1)
+#                  The newer KVM_SEV_INIT2 interface, from Linux >= 6.10, syncs
+#                  additional vCPU state when initializing the VMSA structures,
+#                  which will result in a different guest measurement. Toggle
+#                  this to control compatibility with older QEMU or kernel
+#                  versions that rely on legacy KVM_SEV_INIT behavior.
+#                  (default: true) (since 9.1)
 #
 # Since: 2.12
 ##
diff --git a/target/i386/sev.c b/target/i386/sev.c
index 004c667ac1..16029282b7 100644
--- a/target/i386/sev.c
+++ b/target/i386/sev.c
@@ -2086,6 +2086,13 @@  sev_guest_instance_init(Object *obj)
     object_property_add_uint32_ptr(obj, "policy", &sev_guest->policy,
                                    OBJ_PROP_FLAG_READWRITE);
     object_apply_compat_props(obj);
+
+    /*
+     * KVM_SEV_INIT2 was only introduced in Linux 6.10. Avoid
+     * breaking existing users of SEV, since the overwhealming
+     * majority won't have a new enough kernel for a long time
+     */
+    sev_guest->legacy_vm_type = true;
 }
 
 /* guest info specific sev/sev-es */