[RFC] x86: kvm: remove KVM_SOFT_MAX_VCPUS

Message ID	1379161011-28239-1-git-send-email-drjones@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <kvm-owner@kernel.org> From: Andrew Jones <drjones@redhat.com> To: kvm@vger.kernel.org Cc: gleb@redhat.com, pbonzini@redhat.com, linux-kernel@vger.kernel.org Subject: [PATCH] [RFC] x86: kvm: remove KVM_SOFT_MAX_VCPUS Date: Sat, 14 Sep 2013 14:16:51 +0200 Message-Id: <1379161011-28239-1-git-send-email-drjones@redhat.com> Sender: kvm-owner@vger.kernel.org Precedence: bulk

Andrew Jones Sept. 14, 2013, 12:16 p.m. UTC

This patch removes KVM_SOFT_MAX_VCPUS and uses num_online_cpus() for
KVM_CAP_NR_VCPUS instead, as ARM does. While the API doc simply says
KVM_CAP_NR_VCPUS should return the recommended maximum number of vcpus,
it has been returning KVM_SOFT_MAX_VCPUS, which was defined as the
maximum tested number of vcpus. As that concept could be
distro-specific, this patch uses the other recommended maximum, the
number of physical cpus, as we never recommend configuring a guest that
has more vcpus than the host has pcpus. Of course a guest can always
still be configured with up to KVM_CAP_MAX_VCPUS though anyway.

I've put RFC on this patch because I'm not sure if there are any gotchas
lurking with this change. The change now means hosts no longer return
the same number for KVM_CAP_NR_VCPUS, and that number is likely going to
generally be quite a bit less than what KVM_SOFT_MAX_VCPUS was (160). I
can't think of anything other than generating more warnings[1] from qemu
with guests that configure more vcpus than pcpus though.

[1] Actually, until 972fc544b6034a in uq/master is merged there won't be
    any warnings either.

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 1 -
 arch/x86/kvm/x86.c              | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

Gleb Natapov Sept. 15, 2013, 9:03 a.m. UTC | #1

On Sat, Sep 14, 2013 at 02:16:51PM +0200, Andrew Jones wrote:
> This patch removes KVM_SOFT_MAX_VCPUS and uses num_online_cpus() for
> KVM_CAP_NR_VCPUS instead, as ARM does. While the API doc simply says
> KVM_CAP_NR_VCPUS should return the recommended maximum number of vcpus,
> it has been returning KVM_SOFT_MAX_VCPUS, which was defined as the
> maximum tested number of vcpus. As that concept could be
> distro-specific, this patch uses the other recommended maximum, the
> number of physical cpus, as we never recommend configuring a guest that
> has more vcpus than the host has pcpus. Of course a guest can always
> still be configured with up to KVM_CAP_MAX_VCPUS though anyway.
> 
> I've put RFC on this patch because I'm not sure if there are any gotchas
> lurking with this change. The change now means hosts no longer return
> the same number for KVM_CAP_NR_VCPUS, and that number is likely going to
> generally be quite a bit less than what KVM_SOFT_MAX_VCPUS was (160). I
> can't think of anything other than generating more warnings[1] from qemu
> with guests that configure more vcpus than pcpus though.
> 
Another gotcha is that on a host with more then 160 cpus recommended
value will grow which is not a good idea without appropriate testing.

> [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
>     any warnings either.
> 
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  arch/x86/include/asm/kvm_host.h | 1 -
>  arch/x86/kvm/x86.c              | 2 +-
>  2 files changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index c76ff74a98f2e..9236c63315a9b 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -32,7 +32,6 @@
>  #include <asm/asm.h>
>  
>  #define KVM_MAX_VCPUS 255
> -#define KVM_SOFT_MAX_VCPUS 160
>  #define KVM_USER_MEM_SLOTS 125
>  /* memory slots that are not exposed to userspace */
>  #define KVM_PRIVATE_MEM_SLOTS 3
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
>  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
>  		break;
>  	case KVM_CAP_NR_VCPUS:
> -		r = KVM_SOFT_MAX_VCPUS;
> +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?

>  		break;
>  	case KVM_CAP_MAX_VCPUS:
>  		r = KVM_MAX_VCPUS;
> -- 
> 1.8.1.4

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andrew Jones Sept. 16, 2013, 8:22 a.m. UTC | #2

On Sun, Sep 15, 2013 at 12:03:22PM +0300, Gleb Natapov wrote:
> On Sat, Sep 14, 2013 at 02:16:51PM +0200, Andrew Jones wrote:
> > This patch removes KVM_SOFT_MAX_VCPUS and uses num_online_cpus() for
> > KVM_CAP_NR_VCPUS instead, as ARM does. While the API doc simply says
> > KVM_CAP_NR_VCPUS should return the recommended maximum number of vcpus,
> > it has been returning KVM_SOFT_MAX_VCPUS, which was defined as the
> > maximum tested number of vcpus. As that concept could be
> > distro-specific, this patch uses the other recommended maximum, the
> > number of physical cpus, as we never recommend configuring a guest that
> > has more vcpus than the host has pcpus. Of course a guest can always
> > still be configured with up to KVM_CAP_MAX_VCPUS though anyway.
> > 
> > I've put RFC on this patch because I'm not sure if there are any gotchas
> > lurking with this change. The change now means hosts no longer return
> > the same number for KVM_CAP_NR_VCPUS, and that number is likely going to
> > generally be quite a bit less than what KVM_SOFT_MAX_VCPUS was (160). I
> > can't think of anything other than generating more warnings[1] from qemu
> > with guests that configure more vcpus than pcpus though.
> > 
> Another gotcha is that on a host with more then 160 cpus recommended
> value will grow which is not a good idea without appropriate testing.

Good point. Of course the objective could be to test a guest with
vcpus > 160 on that host, and then the potential warning messages would
need to be ignored. Probably the best place to set the cap on the number
of vcpus used in a stable environment would be in KVM_MAX_VCPUS. That said,
then at least until KVM_SOFT_MAX_VCPUS catches up to KVM_MAX_VCPUS, I guess
we should keep them both to avoid breaking anything.

> 
> > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
> >     any warnings either.
> > 
> > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > ---
> >  arch/x86/include/asm/kvm_host.h | 1 -
> >  arch/x86/kvm/x86.c              | 2 +-
> >  2 files changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index c76ff74a98f2e..9236c63315a9b 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -32,7 +32,6 @@
> >  #include <asm/asm.h>
> >  
> >  #define KVM_MAX_VCPUS 255
> > -#define KVM_SOFT_MAX_VCPUS 160
> >  #define KVM_USER_MEM_SLOTS 125
> >  /* memory slots that are not exposed to userspace */
> >  #define KVM_PRIVATE_MEM_SLOTS 3
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> >  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
> >  		break;
> >  	case KVM_CAP_NR_VCPUS:
> > -		r = KVM_SOFT_MAX_VCPUS;
> > +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
> s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?

I'll send a v2 with this change.

I thought a bit about hotplug, and thus using num_possible_cpus()
instead, but then decided it made more sense to stick to what's online now
for the recommended number. It's just a recommendation anyway. So as long
as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure
more vcpus to count for all hotplugable cpus, if they wish.

drew
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Gleb Natapov Sept. 16, 2013, 8:55 a.m. UTC | #3

On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote:
> > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
> > >     any warnings either.
> > > 
> > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > ---
> > >  arch/x86/include/asm/kvm_host.h | 1 -
> > >  arch/x86/kvm/x86.c              | 2 +-
> > >  2 files changed, 1 insertion(+), 2 deletions(-)
> > > 
> > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > index c76ff74a98f2e..9236c63315a9b 100644
> > > --- a/arch/x86/include/asm/kvm_host.h
> > > +++ b/arch/x86/include/asm/kvm_host.h
> > > @@ -32,7 +32,6 @@
> > >  #include <asm/asm.h>
> > >  
> > >  #define KVM_MAX_VCPUS 255
> > > -#define KVM_SOFT_MAX_VCPUS 160
> > >  #define KVM_USER_MEM_SLOTS 125
> > >  /* memory slots that are not exposed to userspace */
> > >  #define KVM_PRIVATE_MEM_SLOTS 3
> > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> > > --- a/arch/x86/kvm/x86.c
> > > +++ b/arch/x86/kvm/x86.c
> > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > >  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
> > >  		break;
> > >  	case KVM_CAP_NR_VCPUS:
> > > -		r = KVM_SOFT_MAX_VCPUS;
> > > +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
> > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?
> 
> I'll send a v2 with this change.
> 
> I thought a bit about hotplug, and thus using num_possible_cpus()
> instead, but then decided it made more sense to stick to what's online now
> for the recommended number. It's just a recommendation anyway. So as long
> as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure
> more vcpus to count for all hotplugable cpus, if they wish.
> 
It is just recommended, but we do warn about it, so it is user visible.
Well, the whole point of it existence is to be user visible ;). If user
creates a guest with max cpus greater than current number if online
cpus, taking into account feature grows, he will get a warning, but we
should not warn about it.
 
--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andrew Jones Sept. 16, 2013, 11:47 a.m. UTC | #4

On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote:
> On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote:
> > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
> > > >     any warnings either.
> > > > 
> > > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > > ---
> > > >  arch/x86/include/asm/kvm_host.h | 1 -
> > > >  arch/x86/kvm/x86.c              | 2 +-
> > > >  2 files changed, 1 insertion(+), 2 deletions(-)
> > > > 
> > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > index c76ff74a98f2e..9236c63315a9b 100644
> > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > @@ -32,7 +32,6 @@
> > > >  #include <asm/asm.h>
> > > >  
> > > >  #define KVM_MAX_VCPUS 255
> > > > -#define KVM_SOFT_MAX_VCPUS 160
> > > >  #define KVM_USER_MEM_SLOTS 125
> > > >  /* memory slots that are not exposed to userspace */
> > > >  #define KVM_PRIVATE_MEM_SLOTS 3
> > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> > > > --- a/arch/x86/kvm/x86.c
> > > > +++ b/arch/x86/kvm/x86.c
> > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > >  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
> > > >  		break;
> > > >  	case KVM_CAP_NR_VCPUS:
> > > > -		r = KVM_SOFT_MAX_VCPUS;
> > > > +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
> > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?
> > 
> > I'll send a v2 with this change.
> > 
> > I thought a bit about hotplug, and thus using num_possible_cpus()
> > instead, but then decided it made more sense to stick to what's online now
> > for the recommended number. It's just a recommendation anyway. So as long
> > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure
> > more vcpus to count for all hotplugable cpus, if they wish.
> > 
> It is just recommended, but we do warn about it, so it is user visible.
> Well, the whole point of it existence is to be user visible ;). If user
> creates a guest with max cpus greater than current number if online
> cpus, taking into account feature grows, he will get a warning, but we
> should not warn about it.

Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus
indefinitely? I'd rather warn about it, which could remind them to offline
32 vcpus for the time being. Although, as we're just discussing when or
when not to output a warning, then I'm not really stressed about it either
way. I can certainly change this to num_possible_cpus(), if all are in
agreement that that is a better recommendation.

drew
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Gleb Natapov Sept. 16, 2013, 2:41 p.m. UTC | #5

On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote:
> On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote:
> > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote:
> > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
> > > > >     any warnings either.
> > > > > 
> > > > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > > > ---
> > > > >  arch/x86/include/asm/kvm_host.h | 1 -
> > > > >  arch/x86/kvm/x86.c              | 2 +-
> > > > >  2 files changed, 1 insertion(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > > index c76ff74a98f2e..9236c63315a9b 100644
> > > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > > @@ -32,7 +32,6 @@
> > > > >  #include <asm/asm.h>
> > > > >  
> > > > >  #define KVM_MAX_VCPUS 255
> > > > > -#define KVM_SOFT_MAX_VCPUS 160
> > > > >  #define KVM_USER_MEM_SLOTS 125
> > > > >  /* memory slots that are not exposed to userspace */
> > > > >  #define KVM_PRIVATE_MEM_SLOTS 3
> > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> > > > > --- a/arch/x86/kvm/x86.c
> > > > > +++ b/arch/x86/kvm/x86.c
> > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > > >  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
> > > > >  		break;
> > > > >  	case KVM_CAP_NR_VCPUS:
> > > > > -		r = KVM_SOFT_MAX_VCPUS;
> > > > > +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
> > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?
> > > 
> > > I'll send a v2 with this change.
> > > 
> > > I thought a bit about hotplug, and thus using num_possible_cpus()
> > > instead, but then decided it made more sense to stick to what's online now
> > > for the recommended number. It's just a recommendation anyway. So as long
> > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure
> > > more vcpus to count for all hotplugable cpus, if they wish.
> > > 
> > It is just recommended, but we do warn about it, so it is user visible.
> > Well, the whole point of it existence is to be user visible ;). If user
> > creates a guest with max cpus greater than current number if online
> > cpus, taking into account feature grows, he will get a warning, but we
> > should not warn about it.
> 
> Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus
> indefinitely? I'd rather warn about it, which could remind them to offline
> 32 vcpus for the time being.
But there are other means to detect number of online cpus:
sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of
possible cpus too with _SC_NPROCESSORS_CONF, so returning those values
as KVM_CAP_NR_VCPUS does not provide any additional information. What
if QEMU process is bound to two cores on 64 core host, do you want to
warn if qemu is created with more then 2 vcpus in such case? You can do
that too with pthread_setaffinity_np(). 

>                               Although, as we're just discussing when or
> when not to output a warning, then I'm not really stressed about it either
> way. I can certainly change this to num_possible_cpus(), if all are in
> agreement that that is a better recommendation.
> 
With this patch we only reduce information available to userspace. QEMU
can already obtain all the information it needs to produce meaningful
warning.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andrew Jones Sept. 16, 2013, 3:22 p.m. UTC | #6

On Mon, Sep 16, 2013 at 05:41:18PM +0300, Gleb Natapov wrote:
> On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote:
> > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote:
> > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote:
> > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
> > > > > >     any warnings either.
> > > > > > 
> > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > > > > ---
> > > > > >  arch/x86/include/asm/kvm_host.h | 1 -
> > > > > >  arch/x86/kvm/x86.c              | 2 +-
> > > > > >  2 files changed, 1 insertion(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > > > index c76ff74a98f2e..9236c63315a9b 100644
> > > > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > > > @@ -32,7 +32,6 @@
> > > > > >  #include <asm/asm.h>
> > > > > >  
> > > > > >  #define KVM_MAX_VCPUS 255
> > > > > > -#define KVM_SOFT_MAX_VCPUS 160
> > > > > >  #define KVM_USER_MEM_SLOTS 125
> > > > > >  /* memory slots that are not exposed to userspace */
> > > > > >  #define KVM_PRIVATE_MEM_SLOTS 3
> > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> > > > > > --- a/arch/x86/kvm/x86.c
> > > > > > +++ b/arch/x86/kvm/x86.c
> > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > > > >  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
> > > > > >  		break;
> > > > > >  	case KVM_CAP_NR_VCPUS:
> > > > > > -		r = KVM_SOFT_MAX_VCPUS;
> > > > > > +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
> > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?
> > > > 
> > > > I'll send a v2 with this change.
> > > > 
> > > > I thought a bit about hotplug, and thus using num_possible_cpus()
> > > > instead, but then decided it made more sense to stick to what's online now
> > > > for the recommended number. It's just a recommendation anyway. So as long
> > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure
> > > > more vcpus to count for all hotplugable cpus, if they wish.
> > > > 
> > > It is just recommended, but we do warn about it, so it is user visible.
> > > Well, the whole point of it existence is to be user visible ;). If user
> > > creates a guest with max cpus greater than current number if online
> > > cpus, taking into account feature grows, he will get a warning, but we
> > > should not warn about it.
> > 
> > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus
> > indefinitely? I'd rather warn about it, which could remind them to offline
> > 32 vcpus for the time being.
> But there are other means to detect number of online cpus:
> sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of
> possible cpus too with _SC_NPROCESSORS_CONF, so returning those values
> as KVM_CAP_NR_VCPUS does not provide any additional information. What
> if QEMU process is bound to two cores on 64 core host, do you want to
> warn if qemu is created with more then 2 vcpus in such case? You can do
> that too with pthread_setaffinity_np(). 
> 
> >                               Although, as we're just discussing when or
> > when not to output a warning, then I'm not really stressed about it either
> > way. I can certainly change this to num_possible_cpus(), if all are in
> > agreement that that is a better recommendation.
> > 
> With this patch we only reduce information available to userspace. QEMU
> can already obtain all the information it needs to produce meaningful
> warning.

All good points. We're still left with the fact that KVM_CAP_NR_VCPU
currently returns a distro-specific number though, which can only be
modified by changing a constant embedded in the source. So I still believe
that a config option is in order, but now you're convincing me that the
option should adjust KVM_SOFT_MAX_VCPUS instead. The default should also
remain distro-neutral, so I vote 255. We'd then change the defines to be

#define KVM_SOFT_MAX_VCPUS CONFIG_KVM_SOFT_MAX_VCPUS
#define KVM_MAX_VCPUS KVM_SOFT_MAX_VCPUS

distros can then configure something lower than 255 (160), and developers
can configure anything they want. Neither will create a warning gap unless
the developer manually changes the KVM_MAX_VCPUS define to create one.

drew
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Gleb Natapov Sept. 17, 2013, 9:36 a.m. UTC | #7

On Mon, Sep 16, 2013 at 05:22:26PM +0200, Andrew Jones wrote:
> On Mon, Sep 16, 2013 at 05:41:18PM +0300, Gleb Natapov wrote:
> > On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote:
> > > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote:
> > > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote:
> > > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
> > > > > > >     any warnings either.
> > > > > > > 
> > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > > > > > ---
> > > > > > >  arch/x86/include/asm/kvm_host.h | 1 -
> > > > > > >  arch/x86/kvm/x86.c              | 2 +-
> > > > > > >  2 files changed, 1 insertion(+), 2 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > > > > index c76ff74a98f2e..9236c63315a9b 100644
> > > > > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > > > > @@ -32,7 +32,6 @@
> > > > > > >  #include <asm/asm.h>
> > > > > > >  
> > > > > > >  #define KVM_MAX_VCPUS 255
> > > > > > > -#define KVM_SOFT_MAX_VCPUS 160
> > > > > > >  #define KVM_USER_MEM_SLOTS 125
> > > > > > >  /* memory slots that are not exposed to userspace */
> > > > > > >  #define KVM_PRIVATE_MEM_SLOTS 3
> > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> > > > > > > --- a/arch/x86/kvm/x86.c
> > > > > > > +++ b/arch/x86/kvm/x86.c
> > > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > > > > >  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
> > > > > > >  		break;
> > > > > > >  	case KVM_CAP_NR_VCPUS:
> > > > > > > -		r = KVM_SOFT_MAX_VCPUS;
> > > > > > > +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
> > > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?
> > > > > 
> > > > > I'll send a v2 with this change.
> > > > > 
> > > > > I thought a bit about hotplug, and thus using num_possible_cpus()
> > > > > instead, but then decided it made more sense to stick to what's online now
> > > > > for the recommended number. It's just a recommendation anyway. So as long
> > > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure
> > > > > more vcpus to count for all hotplugable cpus, if they wish.
> > > > > 
> > > > It is just recommended, but we do warn about it, so it is user visible.
> > > > Well, the whole point of it existence is to be user visible ;). If user
> > > > creates a guest with max cpus greater than current number if online
> > > > cpus, taking into account feature grows, he will get a warning, but we
> > > > should not warn about it.
> > > 
> > > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus
> > > indefinitely? I'd rather warn about it, which could remind them to offline
> > > 32 vcpus for the time being.
> > But there are other means to detect number of online cpus:
> > sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of
> > possible cpus too with _SC_NPROCESSORS_CONF, so returning those values
> > as KVM_CAP_NR_VCPUS does not provide any additional information. What
> > if QEMU process is bound to two cores on 64 core host, do you want to
> > warn if qemu is created with more then 2 vcpus in such case? You can do
> > that too with pthread_setaffinity_np(). 
> > 
> > >                               Although, as we're just discussing when or
> > > when not to output a warning, then I'm not really stressed about it either
> > > way. I can certainly change this to num_possible_cpus(), if all are in
> > > agreement that that is a better recommendation.
> > > 
> > With this patch we only reduce information available to userspace. QEMU
> > can already obtain all the information it needs to produce meaningful
> > warning.
> 
> All good points. We're still left with the fact that KVM_CAP_NR_VCPU
> currently returns a distro-specific number though, which can only be
> modified by changing a constant embedded in the source. So I still believe
> that a config option is in order, but now you're convincing me that the
> option should adjust KVM_SOFT_MAX_VCPUS instead. The default should also
> remain distro-neutral, so I vote 255. We'd then change the defines to be
> 
> #define KVM_SOFT_MAX_VCPUS CONFIG_KVM_SOFT_MAX_VCPUS
> #define KVM_MAX_VCPUS KVM_SOFT_MAX_VCPUS
> 
So you make KVM_MAX_VCPUS same as KVM_SOFT_MAX_VCPUS, what's the point
to have both then? KVM_MAX_VCPUS is max number of cpu that KVM supports
because of architectural and/or implementation reasons. Current maximum
is 255 because this is what X2APIC supports without interrupt remapping
and we cannot grow this number without additional coding.
KVM_SOFT_MAX_VCPUS is the number we (upstream) feel single VM can
safely scale to, the feeling is backed up by some amount of testing of
course. It may make sense for downstream to change this value, but if
they want to lower it I'd rather get bug reports from them that we are
not as scalable as we claim, and if they want to make it large I want to
hear about their successful testing too.

> distros can then configure something lower than 255 (160), and developers
> can configure anything they want. Neither will create a warning gap unless
> the developer manually changes the KVM_MAX_VCPUS define to create one.
> 
As I said in other email KVM_SOFT_MAX_VCPUS/KVM_MAX_VCPUS distinction
was introduce precisely to avoid recompilation requirements. Not everyone
who is interested in running VM with 255 vcpus want to recompile the
kernel or have permission to do so.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Andrew Jones Sept. 17, 2013, 10:03 a.m. UTC | #8

On Tue, Sep 17, 2013 at 12:36:19PM +0300, Gleb Natapov wrote:
> On Mon, Sep 16, 2013 at 05:22:26PM +0200, Andrew Jones wrote:
> > On Mon, Sep 16, 2013 at 05:41:18PM +0300, Gleb Natapov wrote:
> > > On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote:
> > > > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote:
> > > > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote:
> > > > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
> > > > > > > >     any warnings either.
> > > > > > > > 
> > > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > > > > > > ---
> > > > > > > >  arch/x86/include/asm/kvm_host.h | 1 -
> > > > > > > >  arch/x86/kvm/x86.c              | 2 +-
> > > > > > > >  2 files changed, 1 insertion(+), 2 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > > > > > index c76ff74a98f2e..9236c63315a9b 100644
> > > > > > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > > > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > > > > > @@ -32,7 +32,6 @@
> > > > > > > >  #include <asm/asm.h>
> > > > > > > >  
> > > > > > > >  #define KVM_MAX_VCPUS 255
> > > > > > > > -#define KVM_SOFT_MAX_VCPUS 160
> > > > > > > >  #define KVM_USER_MEM_SLOTS 125
> > > > > > > >  /* memory slots that are not exposed to userspace */
> > > > > > > >  #define KVM_PRIVATE_MEM_SLOTS 3
> > > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> > > > > > > > --- a/arch/x86/kvm/x86.c
> > > > > > > > +++ b/arch/x86/kvm/x86.c
> > > > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > > > > > >  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
> > > > > > > >  		break;
> > > > > > > >  	case KVM_CAP_NR_VCPUS:
> > > > > > > > -		r = KVM_SOFT_MAX_VCPUS;
> > > > > > > > +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
> > > > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?
> > > > > > 
> > > > > > I'll send a v2 with this change.
> > > > > > 
> > > > > > I thought a bit about hotplug, and thus using num_possible_cpus()
> > > > > > instead, but then decided it made more sense to stick to what's online now
> > > > > > for the recommended number. It's just a recommendation anyway. So as long
> > > > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure
> > > > > > more vcpus to count for all hotplugable cpus, if they wish.
> > > > > > 
> > > > > It is just recommended, but we do warn about it, so it is user visible.
> > > > > Well, the whole point of it existence is to be user visible ;). If user
> > > > > creates a guest with max cpus greater than current number if online
> > > > > cpus, taking into account feature grows, he will get a warning, but we
> > > > > should not warn about it.
> > > > 
> > > > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus
> > > > indefinitely? I'd rather warn about it, which could remind them to offline
> > > > 32 vcpus for the time being.
> > > But there are other means to detect number of online cpus:
> > > sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of
> > > possible cpus too with _SC_NPROCESSORS_CONF, so returning those values
> > > as KVM_CAP_NR_VCPUS does not provide any additional information. What
> > > if QEMU process is bound to two cores on 64 core host, do you want to
> > > warn if qemu is created with more then 2 vcpus in such case? You can do
> > > that too with pthread_setaffinity_np(). 
> > > 
> > > >                               Although, as we're just discussing when or
> > > > when not to output a warning, then I'm not really stressed about it either
> > > > way. I can certainly change this to num_possible_cpus(), if all are in
> > > > agreement that that is a better recommendation.
> > > > 
> > > With this patch we only reduce information available to userspace. QEMU
> > > can already obtain all the information it needs to produce meaningful
> > > warning.
> > 
> > All good points. We're still left with the fact that KVM_CAP_NR_VCPU
> > currently returns a distro-specific number though, which can only be
> > modified by changing a constant embedded in the source. So I still believe
> > that a config option is in order, but now you're convincing me that the
> > option should adjust KVM_SOFT_MAX_VCPUS instead. The default should also
> > remain distro-neutral, so I vote 255. We'd then change the defines to be
> > 
> > #define KVM_SOFT_MAX_VCPUS CONFIG_KVM_SOFT_MAX_VCPUS
> > #define KVM_MAX_VCPUS KVM_SOFT_MAX_VCPUS
> > 
> So you make KVM_MAX_VCPUS same as KVM_SOFT_MAX_VCPUS, what's the point
> to have both then? KVM_MAX_VCPUS is max number of cpu that KVM supports

I actually didn't believe there was a good point, until...

> because of architectural and/or implementation reasons. Current maximum
> is 255 because this is what X2APIC supports without interrupt remapping
> and we cannot grow this number without additional coding.
> KVM_SOFT_MAX_VCPUS is the number we (upstream) feel single VM can

...learning this. I didn't know that KVM_SOFT_MAX_VCPUS was an upstream
agreement. I thought that number came out of distro-specific testing (indeed
that's what the commit message says), and thus I wanted to move it into
distro-configurable territory. I also didn't know that the 255 limit
serves to document the maximum the x2apic supports. Should we add a
comment for that define stating that?

Anyway, now that'd you've clarified all this for me, please disregard both
this and the other patch.

drew
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Gleb Natapov Sept. 17, 2013, 1:46 p.m. UTC | #9

On Tue, Sep 17, 2013 at 12:03:09PM +0200, Andrew Jones wrote:
> On Tue, Sep 17, 2013 at 12:36:19PM +0300, Gleb Natapov wrote:
> > On Mon, Sep 16, 2013 at 05:22:26PM +0200, Andrew Jones wrote:
> > > On Mon, Sep 16, 2013 at 05:41:18PM +0300, Gleb Natapov wrote:
> > > > On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote:
> > > > > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote:
> > > > > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote:
> > > > > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be
> > > > > > > > >     any warnings either.
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > > > > > > > > ---
> > > > > > > > >  arch/x86/include/asm/kvm_host.h | 1 -
> > > > > > > > >  arch/x86/kvm/x86.c              | 2 +-
> > > > > > > > >  2 files changed, 1 insertion(+), 2 deletions(-)
> > > > > > > > > 
> > > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > > > > > > > > index c76ff74a98f2e..9236c63315a9b 100644
> > > > > > > > > --- a/arch/x86/include/asm/kvm_host.h
> > > > > > > > > +++ b/arch/x86/include/asm/kvm_host.h
> > > > > > > > > @@ -32,7 +32,6 @@
> > > > > > > > >  #include <asm/asm.h>
> > > > > > > > >  
> > > > > > > > >  #define KVM_MAX_VCPUS 255
> > > > > > > > > -#define KVM_SOFT_MAX_VCPUS 160
> > > > > > > > >  #define KVM_USER_MEM_SLOTS 125
> > > > > > > > >  /* memory slots that are not exposed to userspace */
> > > > > > > > >  #define KVM_PRIVATE_MEM_SLOTS 3
> > > > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > > > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644
> > > > > > > > > --- a/arch/x86/kvm/x86.c
> > > > > > > > > +++ b/arch/x86/kvm/x86.c
> > > > > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext)
> > > > > > > > >  		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
> > > > > > > > >  		break;
> > > > > > > > >  	case KVM_CAP_NR_VCPUS:
> > > > > > > > > -		r = KVM_SOFT_MAX_VCPUS;
> > > > > > > > > +		r = min(num_online_cpus(), KVM_MAX_VCPUS);
> > > > > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/.  Also what about hotplug cpus?
> > > > > > > 
> > > > > > > I'll send a v2 with this change.
> > > > > > > 
> > > > > > > I thought a bit about hotplug, and thus using num_possible_cpus()
> > > > > > > instead, but then decided it made more sense to stick to what's online now
> > > > > > > for the recommended number. It's just a recommendation anyway. So as long
> > > > > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure
> > > > > > > more vcpus to count for all hotplugable cpus, if they wish.
> > > > > > > 
> > > > > > It is just recommended, but we do warn about it, so it is user visible.
> > > > > > Well, the whole point of it existence is to be user visible ;). If user
> > > > > > creates a guest with max cpus greater than current number if online
> > > > > > cpus, taking into account feature grows, he will get a warning, but we
> > > > > > should not warn about it.
> > > > > 
> > > > > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus
> > > > > indefinitely? I'd rather warn about it, which could remind them to offline
> > > > > 32 vcpus for the time being.
> > > > But there are other means to detect number of online cpus:
> > > > sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of
> > > > possible cpus too with _SC_NPROCESSORS_CONF, so returning those values
> > > > as KVM_CAP_NR_VCPUS does not provide any additional information. What
> > > > if QEMU process is bound to two cores on 64 core host, do you want to
> > > > warn if qemu is created with more then 2 vcpus in such case? You can do
> > > > that too with pthread_setaffinity_np(). 
> > > > 
> > > > >                               Although, as we're just discussing when or
> > > > > when not to output a warning, then I'm not really stressed about it either
> > > > > way. I can certainly change this to num_possible_cpus(), if all are in
> > > > > agreement that that is a better recommendation.
> > > > > 
> > > > With this patch we only reduce information available to userspace. QEMU
> > > > can already obtain all the information it needs to produce meaningful
> > > > warning.
> > > 
> > > All good points. We're still left with the fact that KVM_CAP_NR_VCPU
> > > currently returns a distro-specific number though, which can only be
> > > modified by changing a constant embedded in the source. So I still believe
> > > that a config option is in order, but now you're convincing me that the
> > > option should adjust KVM_SOFT_MAX_VCPUS instead. The default should also
> > > remain distro-neutral, so I vote 255. We'd then change the defines to be
> > > 
> > > #define KVM_SOFT_MAX_VCPUS CONFIG_KVM_SOFT_MAX_VCPUS
> > > #define KVM_MAX_VCPUS KVM_SOFT_MAX_VCPUS
> > > 
> > So you make KVM_MAX_VCPUS same as KVM_SOFT_MAX_VCPUS, what's the point
> > to have both then? KVM_MAX_VCPUS is max number of cpu that KVM supports
> 
> I actually didn't believe there was a good point, until...
> 
> > because of architectural and/or implementation reasons. Current maximum
> > is 255 because this is what X2APIC supports without interrupt remapping
> > and we cannot grow this number without additional coding.
> > KVM_SOFT_MAX_VCPUS is the number we (upstream) feel single VM can
> 
> ...learning this. I didn't know that KVM_SOFT_MAX_VCPUS was an upstream
> agreement. I thought that number came out of distro-specific testing (indeed
> that's what the commit message says), and thus I wanted to move it into
The testing was done with dist kernel, yes, but it was actually slightly
behind upstream and upstream should be better, not worse then dist
kernel. If upstream perform works than older dist kernel we have a
regression that should be investigated.

> distro-configurable territory. I also didn't know that the 255 limit
> serves to document the maximum the x2apic supports. Should we add a
> comment for that define stating that?
Yes, the comment would be useful.

--
			Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[RFC] x86: kvm: remove KVM_SOFT_MAX_VCPUS

Commit Message

Comments

Patch