[0/2] i386: Fix interrupt based Async PF enablement

Message ID 20210401151957.408028-1-vkuznets@redhat.com (mailing list archive)

Message

Vitaly Kuznetsov April 1, 2021, 3:19 p.m. UTC
I noticed two issues with 'kvm-asyncpf-int' enablement:
1) We forgot to add it to kvm_default_props[] so it doesn't get enabled
 automatically (unless '-cpu host' is used or the feature is enabled
 manually on the command line)
2) We forgot to disable it for older machine types to preserve migration.
 This went unnoticed because of 1) I believe.

Vitaly Kuznetsov (2):
  i386: Add 'kvm-asyncpf-int' to kvm_default_props array
  i386: Disable 'kvm-asyncpf-int' feature for machine types <= 5.1

 hw/i386/pc.c      | 1 +
 target/i386/cpu.c | 1 +
 2 files changed, 2 insertions(+)
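
A rough model of how the two patches interact may help here. This is not
QEMU code: the table names and the helper are hypothetical, and only the
'kvm-asyncpf-int' name and the 5.1 cut-off come from this series. The idea is
that a property is switched on by default, and then overridden to off for
machine types at or below a given version.

```python
# Hypothetical model of default CPU properties plus per-machine-type
# compatibility overrides. These are not QEMU's actual data structures.

# Patch 1: the feature is on by default.
KVM_DEFAULT_PROPS = {"kvm-asyncpf-int": "on"}

# Patch 2: machine types <= 5.1 keep the old behaviour to preserve migration.
# Key: newest machine version the override applies to. Value: the overrides.
COMPAT_PROPS = {(5, 1): {"kvm-asyncpf-int": "off"}}


def effective_props(machine_version):
    """Return the property values seen by a machine type of this version."""
    props = dict(KVM_DEFAULT_PROPS)
    # Apply every override whose cut-off is at or above this machine version.
    for cutoff, overrides in COMPAT_PROPS.items():
        if machine_version <= cutoff:
            props.update(overrides)
    return props
```

With this model, effective_props((5, 1)) reports the feature as off while
effective_props((6, 0)) reports it as on, which is the behaviour the cover
letter describes.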

Comments

Paolo Bonzini April 1, 2021, 3:57 p.m. UTC | #1
On 01/04/21 17:19, Vitaly Kuznetsov wrote:
> I noticed two issues with 'kvm-asyncpf-int' enablement:
> 1) We forgot to add to to kvm_default_props[] so it doesn't get enabled
>   automatically (unless '-cpu host' is used or the feature is enabled
>   manually on the command line)
> 2) We forgot to disable it for older machine types to preserve migration.
>   This went unnoticed because of 1) I believe.
> 
> Vitaly Kuznetsov (2):
>    i386: Add 'kvm-asyncpf-int' to kvm_default_props array
>    i386: Disable 'kvm-asyncpf-int' feature for machine types <= 5.1
> 
>   hw/i386/pc.c      | 1 +
>   target/i386/cpu.c | 1 +
>   2 files changed, 2 insertions(+)
> 

Wasn't this intentional to avoid requiring a new kernel version?

Paolo
Vitaly Kuznetsov April 6, 2021, 11:42 a.m. UTC | #2
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 01/04/21 17:19, Vitaly Kuznetsov wrote:
>> I noticed two issues with 'kvm-asyncpf-int' enablement:
>> 1) We forgot to add to to kvm_default_props[] so it doesn't get enabled
>>   automatically (unless '-cpu host' is used or the feature is enabled
>>   manually on the command line)
>> 2) We forgot to disable it for older machine types to preserve migration.
>>   This went unnoticed because of 1) I believe.
>> 
>> Vitaly Kuznetsov (2):
>>    i386: Add 'kvm-asyncpf-int' to kvm_default_props array
>>    i386: Disable 'kvm-asyncpf-int' feature for machine types <= 5.1
>> 
>>   hw/i386/pc.c      | 1 +
>>   target/i386/cpu.c | 1 +
>>   2 files changed, 2 insertions(+)
>> 
>
> Wasn't this intentional to avoid requiring a new kernel version?

I think I forgot the initial plan :-( The problem is that after we
disabled the original APF (#PF based) almost nobody is using the feature
as it needs to be enabled explicitly on the command line.

Several considerations regarding the default: if your kernel doesn't
support the feature, the worst you get is a warning:

qemu-system-x86_64: warning: host doesn't support requested feature:
CPUID.40000001H:EAX.kvm-asyncpf-int [bit 14]

Older machine types are still available (I disable it for <= 5.1 but we
can consider disabling it for 5.2 too). The feature has been upstream since
Linux 5.8. I know that QEMU supports much older kernels, but this probably
doesn't mean that we can't enable new KVM PV features until all
supported kernels have them; we'd have to wait many years otherwise.
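
To make the warning concrete: it names CPUID leaf 0x40000001, register EAX,
bit 14. The sketch below only illustrates the bit arithmetic. The function
names are hypothetical, and the "supported" value is passed in as a plain
integer rather than queried from KVM.

```python
# Bit position taken from the warning text:
# CPUID.40000001H:EAX.kvm-asyncpf-int [bit 14]
KVM_ASYNCPF_INT_BIT = 14


def host_supports(supported_eax, bit):
    """True if `bit` is set in the EAX value the host reports as supported."""
    return bool(supported_eax & (1 << bit))


def filter_requested(requested_eax, supported_eax):
    """Split the requested bits into those granted and those dropped.

    A dropped bit is where a warning like the one above would be printed;
    the guest simply does not see that feature.
    """
    granted = requested_eax & supported_eax
    dropped = requested_eax & ~supported_eax
    return granted, dropped
```

On a host whose supported mask lacks bit 14, filter_requested() returns that
bit in the dropped half and the guest starts without the feature.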
Paolo Bonzini April 8, 2021, 12:46 p.m. UTC | #3
On 06/04/21 13:42, Vitaly Kuznetsov wrote:
> older machine types are still available (I disable it for <= 5.1 but we
> can consider disabling it for 5.2 too). The feature is upstream since
> Linux 5.8, I know that QEMU supports much older kernels but this doesn't
> probably mean that we can't enable new KVM PV features unless all
> supported kernels have it, we'd have to wait many years otherwise.

Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7, 
though that will go away in 6.1.

We should take the occasion of dropping RHEL7 to be clearer about which 
kernels are supported.

Paolo
Dr. David Alan Gilbert April 15, 2021, 7:14 p.m. UTC | #4
* Paolo Bonzini (pbonzini@redhat.com) wrote:
> On 06/04/21 13:42, Vitaly Kuznetsov wrote:
> > older machine types are still available (I disable it for <= 5.1 but we
> > can consider disabling it for 5.2 too). The feature is upstream since
> > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
> > probably mean that we can't enable new KVM PV features unless all
> > supported kernels have it, we'd have to wait many years otherwise.
> 
> Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
> though that will go away in 6.1.
> 
> We should take the occasion of dropping RHEL7 to be clearer about which
> kernels are supported.

It would be nice to be able to define sets of KVM functionality that we
can either start given machine types with, or provide a separate switch
to limit kvm functionality back to some defined point.  We do trip over
the same things pretty regularly when accidentally turning on new
features.

Dave

> Paolo
>
Eduardo Habkost April 20, 2021, 5:35 p.m. UTC | #5
On Thu, Apr 15, 2021 at 08:14:30PM +0100, Dr. David Alan Gilbert wrote:
> * Paolo Bonzini (pbonzini@redhat.com) wrote:
> > On 06/04/21 13:42, Vitaly Kuznetsov wrote:
> > > older machine types are still available (I disable it for <= 5.1 but we
> > > can consider disabling it for 5.2 too). The feature is upstream since
> > > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
> > > probably mean that we can't enable new KVM PV features unless all
> > > supported kernels have it, we'd have to wait many years otherwise.
> > 
> > Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
> > though that will go away in 6.1.
> > 
> > We should take the occasion of dropping RHEL7 to be clearer about which
> > kernels are supported.
> 
> It would be nice to be able to define sets of KVM functonality that we
> can either start given machine types with, or provide a separate switch
> to limit kvm functionality back to some defined point.  We do trip over
> the same things pretty regularly when accidentally turning on new
> features.

The same idea can apply to the hyperv=on stuff Vitaly is working
on.  Maybe we should consider making a generic version of the
s390x FeatGroup code, use it to define convenient sets of KVM and
hyperv features.
Vitaly Kuznetsov April 21, 2021, 8:38 a.m. UTC | #6
Eduardo Habkost <ehabkost@redhat.com> writes:

> On Thu, Apr 15, 2021 at 08:14:30PM +0100, Dr. David Alan Gilbert wrote:
>> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>> > On 06/04/21 13:42, Vitaly Kuznetsov wrote:
>> > > older machine types are still available (I disable it for <= 5.1 but we
>> > > can consider disabling it for 5.2 too). The feature is upstream since
>> > > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
>> > > probably mean that we can't enable new KVM PV features unless all
>> > > supported kernels have it, we'd have to wait many years otherwise.
>> > 
>> > Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
>> > though that will go away in 6.1.
>> > 
>> > We should take the occasion of dropping RHEL7 to be clearer about which
>> > kernels are supported.
>> 
>> It would be nice to be able to define sets of KVM functonality that we
>> can either start given machine types with, or provide a separate switch
>> to limit kvm functionality back to some defined point.  We do trip over
>> the same things pretty regularly when accidentally turning on new
>> features.
>
> The same idea can apply to the hyperv=on stuff Vitaly is working
> on.  Maybe we should consider making a generic version of the
> s390x FeatGroup code, use it to define convenient sets of KVM and
> hyperv features.

True, the more I look at PV features enablement, the more I think that
we're missing something important in the logic. All machine types we
have are generally supposed to work with the oldest supported kernel, so
we would have to wait many years before enabling some of the new PV features
(KVM or Hyper-V) by default.

This also links to our parallel discussion regarding migration
policies. Currently, we can't enable PV features by default based on
their availability on the host because of migration, the set may differ
on the destination host. What if we introduce (and maybe even switch to
it by default) something like

 -migratable opportunistic (stupid name, I know)

which would allow enabling all features supported by the source host
and then somehow check that the destination host has them all. This
would effectively mean that it is possible to migrate a VM to
same-or-newer software (both kernel and QEMU) but not the other way
around. This may be a reasonable choice.
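
The destination-side check could be as simple as a subset test. A minimal
sketch, assuming feature sets are available as plain collections of names
(the helper names are hypothetical, and how the sets would actually be
exchanged between hosts is left open):

```python
def can_migrate(source_enabled, destination_supported):
    """Allow migration only if the destination has every enabled feature."""
    return set(source_enabled) <= set(destination_supported)


def missing_on_destination(source_enabled, destination_supported):
    """Features the guest was started with that the destination lacks."""
    return sorted(set(source_enabled) - set(destination_supported))
```

Migrating to a host with a superset of features passes the check; migrating
back to an older host fails, and missing_on_destination() names the features
that are responsible.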
Daniel P. Berrangé April 21, 2021, 8:50 a.m. UTC | #7
On Wed, Apr 21, 2021 at 10:38:06AM +0200, Vitaly Kuznetsov wrote:
> Eduardo Habkost <ehabkost@redhat.com> writes:
> 
> > On Thu, Apr 15, 2021 at 08:14:30PM +0100, Dr. David Alan Gilbert wrote:
> >> * Paolo Bonzini (pbonzini@redhat.com) wrote:
> >> > On 06/04/21 13:42, Vitaly Kuznetsov wrote:
> >> > > older machine types are still available (I disable it for <= 5.1 but we
> >> > > can consider disabling it for 5.2 too). The feature is upstream since
> >> > > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
> >> > > probably mean that we can't enable new KVM PV features unless all
> >> > > supported kernels have it, we'd have to wait many years otherwise.
> >> > 
> >> > Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
> >> > though that will go away in 6.1.
> >> > 
> >> > We should take the occasion of dropping RHEL7 to be clearer about which
> >> > kernels are supported.
> >> 
> >> It would be nice to be able to define sets of KVM functonality that we
> >> can either start given machine types with, or provide a separate switch
> >> to limit kvm functionality back to some defined point.  We do trip over
> >> the same things pretty regularly when accidentally turning on new
> >> features.
> >
> > The same idea can apply to the hyperv=on stuff Vitaly is working
> > on.  Maybe we should consider making a generic version of the
> > s390x FeatGroup code, use it to define convenient sets of KVM and
> > hyperv features.
> 
> True, the more I look at PV features enablement, the more I think that
> we're missing something important in the logic. All machine types we
> have are generally suposed to work with the oldest supported kernel so
> we should wait many years before enabling some of the new PV features
> (KVM or Hyper-V) by default.
> 
> This also links to our parallel discussion regarding migration
> policies. Currently, we can't enable PV features by default based on
> their availability on the host because of migration, the set may differ
> on the destination host. What if we introduce (and maybe even switch to
> it by default) something like
> 
>  -migratable opportunistic (stupid name, I know)
> 
> which would allow to enable all features supported by the source host
> and then somehow checking that the destination host has them all. This
> would effectively mean that it is possible to migrate a VM to a
> same-or-newer software (both kernel an QEMU) but not the other way
> around. This may be a reasonable choice.

I don't think this is usable in practice. Any large cloud or data center
mgmt app using QEMU relies on migration, so can't opportunistically
use arbitrary new features. They can only use features in the oldest
kernel their deployment cares about. This can be newer than the oldest
that QEMU supports, but still older than the newest that exists.

i.e. we have a situation where:

 - QEMU upstream minimum host is version 7
 - Latest possible host is version 45
 - A particular deployment has a mixture of hosts at version 24 and 37

"-migratable opportunistic"  would let QEMU use features from version 37
despite the deployment needing compatibility with host version 24 still.


It is almost as if we need to have a way to explicitly express a minimum
required host version that VM requires compatibility with, so deployments
can set their own baseline that is newer than QEMU minimum.
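
The numbers in the example can be put into a small sketch. It treats a host
"version" as a single number, exactly as the example does, which is a
simplification. The helper names are hypothetical.

```python
# Numbers from the example above; the helper names are hypothetical.
QEMU_MINIMUM_HOST = 7        # oldest host QEMU upstream supports
LATEST_HOST = 45             # newest host that exists
DEPLOYMENT_HOSTS = [24, 37]  # what one particular deployment runs


def deployment_baseline(host_versions):
    """Newest feature level that every host in the deployment can run."""
    return min(host_versions)


def hosts_able_to_run(feature_level, host_versions):
    """Hosts new enough to run a VM using features up to feature_level."""
    return [v for v in host_versions if v >= feature_level]
```

A VM started opportunistically on the version 37 host can only run on that
host, whereas one started at the deployment baseline of 24 can run on both.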

Regards,
Daniel
Dr. David Alan Gilbert April 21, 2021, 9:23 a.m. UTC | #8
* Daniel P. Berrangé (berrange@redhat.com) wrote:
> On Wed, Apr 21, 2021 at 10:38:06AM +0200, Vitaly Kuznetsov wrote:
> > Eduardo Habkost <ehabkost@redhat.com> writes:
> > 
> > > On Thu, Apr 15, 2021 at 08:14:30PM +0100, Dr. David Alan Gilbert wrote:
> > >> * Paolo Bonzini (pbonzini@redhat.com) wrote:
> > >> > On 06/04/21 13:42, Vitaly Kuznetsov wrote:
> > >> > > older machine types are still available (I disable it for <= 5.1 but we
> > >> > > can consider disabling it for 5.2 too). The feature is upstream since
> > >> > > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
> > >> > > probably mean that we can't enable new KVM PV features unless all
> > >> > > supported kernels have it, we'd have to wait many years otherwise.
> > >> > 
> > >> > Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
> > >> > though that will go away in 6.1.
> > >> > 
> > >> > We should take the occasion of dropping RHEL7 to be clearer about which
> > >> > kernels are supported.
> > >> 
> > >> It would be nice to be able to define sets of KVM functonality that we
> > >> can either start given machine types with, or provide a separate switch
> > >> to limit kvm functionality back to some defined point.  We do trip over
> > >> the same things pretty regularly when accidentally turning on new
> > >> features.
> > >
> > > The same idea can apply to the hyperv=on stuff Vitaly is working
> > > on.  Maybe we should consider making a generic version of the
> > > s390x FeatGroup code, use it to define convenient sets of KVM and
> > > hyperv features.
> > 
> > True, the more I look at PV features enablement, the more I think that
> > we're missing something important in the logic. All machine types we
> > have are generally suposed to work with the oldest supported kernel so
> > we should wait many years before enabling some of the new PV features
> > (KVM or Hyper-V) by default.
> > 
> > This also links to our parallel discussion regarding migration
> > policies. Currently, we can't enable PV features by default based on
> > their availability on the host because of migration, the set may differ
> > on the destination host. What if we introduce (and maybe even switch to
> > it by default) something like
> > 
> >  -migratable opportunistic (stupid name, I know)
> > 
> > which would allow to enable all features supported by the source host
> > and then somehow checking that the destination host has them all. This
> > would effectively mean that it is possible to migrate a VM to a
> > same-or-newer software (both kernel an QEMU) but not the other way
> > around. This may be a reasonable choice.
> 
> I don't think this is usable in pratice. Any large cloud or data center
> mgmt app using QEMU relies on migration, so can't opportunistically
> use arbitrary new features. They can only use features in the oldest
> kernel their deployment cares about. This can be newer than the oldest
> that QEMU supports, but still older than the newest that exists.
> 
> ie we have situation where:
> 
>  - QEMU upstream minimum host is version 7
>  - Latest possible host is version 45
>  - A particular deployment has a mixture of hosts at version 24 and 37
> 
> "-migratable opportunistic"  would let QEMU use features from version 37
> despite the deployment needing compatibility with host version 24 still.
> 
> 
> It is almost as if we need to have a way to explicitly express a minimum
> required host version that VM requires compatibility with, so deployments
> can set their own baseline that is newer than QEMU minimum.

It's not a 'version' - it's just the set of capabilities, and QEMU
needs to check them at startup and fail if they're missing; I think
that's what the FeatGroup that was suggested is.

Just like we have machine type and CPU version we need a set of PV
features that we rely on the host kernel having, and we should only
expose those PV features to the guest.  It's possible that we might
define some machine types as relying on certain PV features, or that
some PV features wouldn't make sense on some machine types.

Dave

> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Vitaly Kuznetsov April 21, 2021, 9:29 a.m. UTC | #9
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Apr 21, 2021 at 10:38:06AM +0200, Vitaly Kuznetsov wrote:
>> Eduardo Habkost <ehabkost@redhat.com> writes:
>> 
>> > On Thu, Apr 15, 2021 at 08:14:30PM +0100, Dr. David Alan Gilbert wrote:
>> >> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>> >> > On 06/04/21 13:42, Vitaly Kuznetsov wrote:
>> >> > > older machine types are still available (I disable it for <= 5.1 but we
>> >> > > can consider disabling it for 5.2 too). The feature is upstream since
>> >> > > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
>> >> > > probably mean that we can't enable new KVM PV features unless all
>> >> > > supported kernels have it, we'd have to wait many years otherwise.
>> >> > 
>> >> > Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
>> >> > though that will go away in 6.1.
>> >> > 
>> >> > We should take the occasion of dropping RHEL7 to be clearer about which
>> >> > kernels are supported.
>> >> 
>> >> It would be nice to be able to define sets of KVM functonality that we
>> >> can either start given machine types with, or provide a separate switch
>> >> to limit kvm functionality back to some defined point.  We do trip over
>> >> the same things pretty regularly when accidentally turning on new
>> >> features.
>> >
>> > The same idea can apply to the hyperv=on stuff Vitaly is working
>> > on.  Maybe we should consider making a generic version of the
>> > s390x FeatGroup code, use it to define convenient sets of KVM and
>> > hyperv features.
>> 
>> True, the more I look at PV features enablement, the more I think that
>> we're missing something important in the logic. All machine types we
>> have are generally suposed to work with the oldest supported kernel so
>> we should wait many years before enabling some of the new PV features
>> (KVM or Hyper-V) by default.
>> 
>> This also links to our parallel discussion regarding migration
>> policies. Currently, we can't enable PV features by default based on
>> their availability on the host because of migration, the set may differ
>> on the destination host. What if we introduce (and maybe even switch to
>> it by default) something like
>> 
>>  -migratable opportunistic (stupid name, I know)
>> 
>> which would allow to enable all features supported by the source host
>> and then somehow checking that the destination host has them all. This
>> would effectively mean that it is possible to migrate a VM to a
>> same-or-newer software (both kernel an QEMU) but not the other way
>> around. This may be a reasonable choice.
>
> I don't think this is usable in pratice. Any large cloud or data center
> mgmt app using QEMU relies on migration, so can't opportunistically
> use arbitrary new features. They can only use features in the oldest
> kernel their deployment cares about. This can be newer than the oldest
> that QEMU supports, but still older than the newest that exists.
>
> ie we have situation where:
>
>  - QEMU upstream minimum host is version 7
>  - Latest possible host is version 45
>  - A particular deployment has a mixture of hosts at version 24 and 37
>
> "-migratable opportunistic"  would let QEMU use features from version 37
> despite the deployment needing compatibility with host version 24 still.
>

True; I was not really thinking about 'big' clouds/data centers, these
should have enough resources to carefully set all the required features
and not rely on the 'default'. My thoughts were around using migration
for host upgrade on smaller (several hosts) deployments and in this case
it's probably fairly reasonable to require starting with the oldest host
and upgrading them all if getting new features is one of the upgrade goals.

>
> It is almost as if we need to have a way to explicitly express a minimum
> required host version that VM requires compatibility with, so deployments
> can set their own baseline that is newer than QEMU minimum.

Yes, maybe, but setting the baseline is also a non-trivial task:
e.g. how would users know which PV features they can enable without
going through Linux kernel logs or just trying them on the oldest kernel
they need? This should probably be solved by some upper layer management
app which would collect feature sets from all hosts and come up with a
common subset. I'm not sure if this is done by some tools already.
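
For what it's worth, the "common subset" step itself is trivial once the
per-host feature sets have been collected. A minimal sketch, with a
hypothetical helper name and made-up host names, and with no claim about how
an existing tool does it:

```python
def common_features(per_host_features):
    """Intersect the feature sets reported by every host.

    per_host_features maps a host name to an iterable of feature names.
    """
    sets = [set(features) for features in per_host_features.values()]
    if not sets:
        return set()
    return set.intersection(*sets)
```

The hard part is collecting the per-host sets in the first place, not the
intersection.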
Dr. David Alan Gilbert April 21, 2021, 9:34 a.m. UTC | #10
* Vitaly Kuznetsov (vkuznets@redhat.com) wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Wed, Apr 21, 2021 at 10:38:06AM +0200, Vitaly Kuznetsov wrote:
> >> Eduardo Habkost <ehabkost@redhat.com> writes:
> >> 
> >> > On Thu, Apr 15, 2021 at 08:14:30PM +0100, Dr. David Alan Gilbert wrote:
> >> >> * Paolo Bonzini (pbonzini@redhat.com) wrote:
> >> >> > On 06/04/21 13:42, Vitaly Kuznetsov wrote:
> >> >> > > older machine types are still available (I disable it for <= 5.1 but we
> >> >> > > can consider disabling it for 5.2 too). The feature is upstream since
> >> >> > > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
> >> >> > > probably mean that we can't enable new KVM PV features unless all
> >> >> > > supported kernels have it, we'd have to wait many years otherwise.
> >> >> > 
> >> >> > Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
> >> >> > though that will go away in 6.1.
> >> >> > 
> >> >> > We should take the occasion of dropping RHEL7 to be clearer about which
> >> >> > kernels are supported.
> >> >> 
> >> >> It would be nice to be able to define sets of KVM functonality that we
> >> >> can either start given machine types with, or provide a separate switch
> >> >> to limit kvm functionality back to some defined point.  We do trip over
> >> >> the same things pretty regularly when accidentally turning on new
> >> >> features.
> >> >
> >> > The same idea can apply to the hyperv=on stuff Vitaly is working
> >> > on.  Maybe we should consider making a generic version of the
> >> > s390x FeatGroup code, use it to define convenient sets of KVM and
> >> > hyperv features.
> >> 
> >> True, the more I look at PV features enablement, the more I think that
> >> we're missing something important in the logic. All machine types we
> >> have are generally suposed to work with the oldest supported kernel so
> >> we should wait many years before enabling some of the new PV features
> >> (KVM or Hyper-V) by default.
> >> 
> >> This also links to our parallel discussion regarding migration
> >> policies. Currently, we can't enable PV features by default based on
> >> their availability on the host because of migration, the set may differ
> >> on the destination host. What if we introduce (and maybe even switch to
> >> it by default) something like
> >> 
> >>  -migratable opportunistic (stupid name, I know)
> >> 
> >> which would allow to enable all features supported by the source host
> >> and then somehow checking that the destination host has them all. This
> >> would effectively mean that it is possible to migrate a VM to a
> >> same-or-newer software (both kernel an QEMU) but not the other way
> >> around. This may be a reasonable choice.
> >
> > I don't think this is usable in pratice. Any large cloud or data center
> > mgmt app using QEMU relies on migration, so can't opportunistically
> > use arbitrary new features. They can only use features in the oldest
> > kernel their deployment cares about. This can be newer than the oldest
> > that QEMU supports, but still older than the newest that exists.
> >
> > ie we have situation where:
> >
> >  - QEMU upstream minimum host is version 7
> >  - Latest possible host is version 45
> >  - A particular deployment has a mixture of hosts at version 24 and 37
> >
> > "-migratable opportunistic"  would let QEMU use features from version 37
> > despite the deployment needing compatibility with host version 24 still.
> >
> 
> True; I was not really thinking about 'big' clouds/data centers, these
> should have enough resources to carefully set all the required features
> and not rely on the 'default'. My thoughts were around using migration
> for host upgrade on smaller (several hosts) deployments and in this case
> it's probably fairly reasonable to require to start with the oldest host
> and upgrade them all if getting new features is one of the upgrade goals.

It's not actually that simple.
Small installations tend to have less spare hardware available and/or
flexibility; if you've got say a 3 or 5 host cluster, once you start
upgrading one node you've now got nowhere to go if you hit a problem.

Dave

> >
> > It is almost as if we need to have a way to explicitly express a minimum
> > required host version that VM requires compatibility with, so deployments
> > can set their own baseline that is newer than QEMU minimum.
> 
> Yes, maybe, but setting the baseline is also a non-trivial task:
> e.g. how would users know which PV features they can enable without
> going through Linux kernel logs or just trying them on the oldest kernel
> they need? This should probably be solved by some upper layer management
> app which would collect feature sets from all hosts and come up with a
> common subset. I'm not sure if this is done by some tools already.
> 
> -- 
> Vitaly
>
Daniel P. Berrangé April 21, 2021, 9:37 a.m. UTC | #11
On Wed, Apr 21, 2021 at 11:29:45AM +0200, Vitaly Kuznetsov wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Wed, Apr 21, 2021 at 10:38:06AM +0200, Vitaly Kuznetsov wrote:
> >> Eduardo Habkost <ehabkost@redhat.com> writes:
> >> 
> >> > On Thu, Apr 15, 2021 at 08:14:30PM +0100, Dr. David Alan Gilbert wrote:
> >> >> * Paolo Bonzini (pbonzini@redhat.com) wrote:
> >> >> > On 06/04/21 13:42, Vitaly Kuznetsov wrote:
> >> >> > > older machine types are still available (I disable it for <= 5.1 but we
> >> >> > > can consider disabling it for 5.2 too). The feature is upstream since
> >> >> > > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
> >> >> > > probably mean that we can't enable new KVM PV features unless all
> >> >> > > supported kernels have it, we'd have to wait many years otherwise.
> >> >> > 
> >> >> > Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
> >> >> > though that will go away in 6.1.
> >> >> > 
> >> >> > We should take the occasion of dropping RHEL7 to be clearer about which
> >> >> > kernels are supported.
> >> >> 
> >> >> It would be nice to be able to define sets of KVM functonality that we
> >> >> can either start given machine types with, or provide a separate switch
> >> >> to limit kvm functionality back to some defined point.  We do trip over
> >> >> the same things pretty regularly when accidentally turning on new
> >> >> features.
> >> >
> >> > The same idea can apply to the hyperv=on stuff Vitaly is working
> >> > on.  Maybe we should consider making a generic version of the
> >> > s390x FeatGroup code, use it to define convenient sets of KVM and
> >> > hyperv features.
> >> 
> >> True, the more I look at PV features enablement, the more I think that
> >> we're missing something important in the logic. All machine types we
> >> have are generally suposed to work with the oldest supported kernel so
> >> we should wait many years before enabling some of the new PV features
> >> (KVM or Hyper-V) by default.
> >> 
> >> This also links to our parallel discussion regarding migration
> >> policies. Currently, we can't enable PV features by default based on
> >> their availability on the host because of migration, the set may differ
> >> on the destination host. What if we introduce (and maybe even switch to
> >> it by default) something like
> >> 
> >>  -migratable opportunistic (stupid name, I know)
> >> 
> >> which would allow to enable all features supported by the source host
> >> and then somehow checking that the destination host has them all. This
> >> would effectively mean that it is possible to migrate a VM to a
> >> same-or-newer software (both kernel an QEMU) but not the other way
> >> around. This may be a reasonable choice.
> >
> > I don't think this is usable in pratice. Any large cloud or data center
> > mgmt app using QEMU relies on migration, so can't opportunistically
> > use arbitrary new features. They can only use features in the oldest
> > kernel their deployment cares about. This can be newer than the oldest
> > that QEMU supports, but still older than the newest that exists.
> >
> > ie we have situation where:
> >
> >  - QEMU upstream minimum host is version 7
> >  - Latest possible host is version 45
> >  - A particular deployment has a mixture of hosts at version 24 and 37
> >
> > "-migratable opportunistic"  would let QEMU use features from version 37
> > despite the deployment needing compatibility with host version 24 still.
> >
> 
> True; I was not really thinking about 'big' clouds/data centers, these
> should have enough resources to carefully set all the required features
> and not rely on the 'default'. My thoughts were around using migration
> for host upgrade on smaller (several hosts) deployments and in this case
> it's probably fairly reasonable to require to start with the oldest host
> and upgrade them all if getting new features is one of the upgrade goals.


> > It is almost as if we need to have a way to explicitly express a minimum
> > required host version that VM requires compatibility with, so deployments
> > can set their own baseline that is newer than QEMU minimum.
> 
> Yes, maybe, but setting the baseline is also a non-trivial task:
> e.g. how would users know which PV features they can enable without
> going through Linux kernel logs or just trying them on the oldest kernel
> they need? This should probably be solved by some upper layer management
> app which would collect feature sets from all hosts and come up with a
> common subset. I'm not sure if this is done by some tools already.

I specifically didn't talk in terms of features, because the problem you
describe is unreasonable to push onto applications.

Rather, QEMU could express host baselines:

   - "host-v1"  - features A and B
   - "host-v2"  - features A, B and C
   - "host-v3"  - features A, B, C, D, E and F

The mgmt app / admin only has to know which QEMU host baselines their
hosts support.

Essentially this could be viewed as separating the host kernel dependent
bits out of the machine type, into a separate configuration axis.
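
A sketch of how such baselines might be consumed, purely as an illustration:
the table mirrors the three baselines listed above, while the helper name and
the idea of picking the newest baseline that every host satisfies are
assumptions, not an existing QEMU interface.

```python
# Baselines as listed above, oldest first. Feature names are placeholders.
HOST_BASELINES = {
    "host-v1": {"A", "B"},
    "host-v2": {"A", "B", "C"},
    "host-v3": {"A", "B", "C", "D", "E", "F"},
}


def pick_baseline(host_feature_sets):
    """Return the newest baseline that every host satisfies, or None."""
    chosen = None
    for name, required in HOST_BASELINES.items():
        # A baseline is usable only if all hosts provide all of its features.
        if all(required <= set(host) for host in host_feature_sets):
            chosen = name
    return chosen
```

For a deployment where one host offers A, B and C and another offers all six
features, this picks "host-v2"; the admin never has to reason about
individual features.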

Regards,
Daniel
Vitaly Kuznetsov April 21, 2021, 9:48 a.m. UTC | #12
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Apr 21, 2021 at 11:29:45AM +0200, Vitaly Kuznetsov wrote:
>> Daniel P. Berrangé <berrange@redhat.com> writes:
>> 
>> > On Wed, Apr 21, 2021 at 10:38:06AM +0200, Vitaly Kuznetsov wrote:
>> >> Eduardo Habkost <ehabkost@redhat.com> writes:
>> >> 
>> >> > On Thu, Apr 15, 2021 at 08:14:30PM +0100, Dr. David Alan Gilbert wrote:
>> >> >> * Paolo Bonzini (pbonzini@redhat.com) wrote:
>> >> >> > On 06/04/21 13:42, Vitaly Kuznetsov wrote:
>> >> >> > > older machine types are still available (I disable it for <= 5.1 but we
>> >> >> > > can consider disabling it for 5.2 too). The feature is upstream since
>> >> >> > > Linux 5.8, I know that QEMU supports much older kernels but this doesn't
>> >> >> > > probably mean that we can't enable new KVM PV features unless all
>> >> >> > > supported kernels have it, we'd have to wait many years otherwise.
>> >> >> > 
>> >> >> > Yes, this is a known problem in fact. :(  In 6.0 we even support RHEL 7,
>> >> >> > though that will go away in 6.1.
>> >> >> > 
>> >> >> > We should take the occasion of dropping RHEL7 to be clearer about which
>> >> >> > kernels are supported.
>> >> >> 
>> >> >> It would be nice to be able to define sets of KVM functionality that we
>> >> >> can either start given machine types with, or provide a separate switch
>> >> >> to limit kvm functionality back to some defined point.  We do trip over
>> >> >> the same things pretty regularly when accidentally turning on new
>> >> >> features.
>> >> >
>> >> > The same idea can apply to the hyperv=on stuff Vitaly is working
>> >> > on.  Maybe we should consider making a generic version of the
>> >> > s390x FeatGroup code, use it to define convenient sets of KVM and
>> >> > hyperv features.
>> >> 
>> >> True, the more I look at PV features enablement, the more I think that
>> >> we're missing something important in the logic. All machine types we
>> >> have are generally supposed to work with the oldest supported kernel so
>> >> we should wait many years before enabling some of the new PV features
>> >> (KVM or Hyper-V) by default.
>> >> 
>> >> This also links to our parallel discussion regarding migration
>> >> policies. Currently, we can't enable PV features by default based on
>> >> their availability on the host because of migration, the set may differ
>> >> on the destination host. What if we introduce (and maybe even switch to
>> >> it by default) something like
>> >> 
>> >>  -migratable opportunistic (stupid name, I know)
>> >> 
>> >> which would allow to enable all features supported by the source host
>> >> and then somehow checking that the destination host has them all. This
>> >> would effectively mean that it is possible to migrate a VM to a
>> >> same-or-newer software (both kernel and QEMU) but not the other way
>> >> around. This may be a reasonable choice.
>> >
>> > I don't think this is usable in practice. Any large cloud or data center
>> > mgmt app using QEMU relies on migration, so can't opportunistically
>> > use arbitrary new features. They can only use features in the oldest
>> > kernel their deployment cares about. This can be newer than the oldest
>> > that QEMU supports, but still older than the newest that exists.
>> >
>> > ie we have situation where:
>> >
>> >  - QEMU upstream minimum host is version 7
>> >  - Latest possible host is version 45
>> >  - A particular deployment has a mixture of hosts at version 24 and 37
>> >
>> > "-migratable opportunistic"  would let QEMU use features from version 37
>> > despite the deployment needing compatibility with host version 24 still.
>> >
>> 
>> True; I was not really thinking about 'big' clouds/data centers, these
>> should have enough resources to carefully set all the required features
>> and not rely on the 'default'. My thoughts were around using migration
>> for host upgrade on smaller (several hosts) deployments and in this case
>> it's probably fairly reasonable to require to start with the oldest host
>> and upgrade them all if getting new features is one of the upgrade goals.
>
>
>> > It is almost as if we need to have a way to explicitly express a minimum
>> > required host version that VM requires compatibility with, so deployments
>> > can set their own baseline that is newer than QEMU minimum.
>> 
>> Yes, maybe, but setting the baseline is also a non-trivial task:
>> e.g. how would users know which PV features they can enable without
>> going through Linux kernel logs or just trying them on the oldest kernel
>> they need? This should probably be solved by some upper layer management
>> app which would collect feature sets from all hosts and come up with a
>> common subset. I'm not sure if this is done by some tools already.
>
> I specifically didn't talk in terms of features, because the problem you
> describe is unreasonable to push onto applications.
>
> Rather, QEMU could express host baselines:
>
>    - "host-v1"  - features A and B
>    - "host-v2"  - features A, B and C
>    - "host-v3"  - features A, B, C, D, E and F
>
> The mgmt app / admin only has to know which QEMU host baselines their
> hosts support.
>
> Essentially this could be viewed as separating the host kernel dependent
> bits out of the machine type, into a separate configuration axis.

If we only think about upstream kernels and assume PV features never go
away, that could work. Distro kernels, however, exist too and feature
backports are common, so which version should I declare when my kernel
has e.g. features A, B and E? (There used to be a KVM_GET_API_VERSION
ioctl, but then we switched to CAPs, and this happened for a reason.)

Personally, I'd vote for having individual PV features in the config if
it ever gets introduced.
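
To make the backport case concrete, here is a rough sketch under the same
assumptions as the "host-v1".."host-v3" example (cumulative baselines,
placeholder features A..F). BASELINES, highest_baseline and common_features
are hypothetical names and not existing QEMU or KVM interfaces.

```python
# Hypothetical illustration only: not an existing QEMU or KVM interface.
# Baselines are cumulative and listed oldest first.
BASELINES = [
    ("host-v1", {"A", "B"}),
    ("host-v2", {"A", "B", "C"}),
    ("host-v3", {"A", "B", "C", "D", "E", "F"}),
]


def highest_baseline(host_features):
    """Return the newest baseline fully covered by host_features, or None."""
    best = None
    for name, required in BASELINES:
        if required <= host_features:  # every required feature is present
            best = name
    return best


def common_features(hosts):
    """Per-feature alternative: intersect the feature sets of all hosts.

    Expects at least one host.
    """
    return set.intersection(*hosts)


# A distro kernel with E backported but without C can only declare host-v1,
# so the backported feature E is never used under the baseline scheme.
backported = {"A", "B", "E"}
print(highest_baseline(backported))

# With individual features, E stays usable if every host has it.
print(sorted(common_features([backported, {"A", "B", "C", "E"}])))
```

The baseline scheme is simpler to configure but loses E here, while the
per-feature approach keeps it at the cost of tracking each feature per host.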