Message ID | 1379161011-28239-1-git-send-email-drjones@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sat, Sep 14, 2013 at 02:16:51PM +0200, Andrew Jones wrote: > This patch removes KVM_SOFT_MAX_VCPUS and uses num_online_cpus() for > KVM_CAP_NR_VCPUS instead, as ARM does. While the API doc simply says > KVM_CAP_NR_VCPUS should return the recommended maximum number of vcpus, > it has been returning KVM_SOFT_MAX_VCPUS, which was defined as the > maximum tested number of vcpus. As that concept could be > distro-specific, this patch uses the other recommended maximum, the > number of physical cpus, as we never recommend configuring a guest that > has more vcpus than the host has pcpus. Of course a guest can always > still be configured with up to KVM_CAP_MAX_VCPUS though anyway. > > I've put RFC on this patch because I'm not sure if there are any gotchas > lurking with this change. The change now means hosts no longer return > the same number for KVM_CAP_NR_VCPUS, and that number is likely going to > generally be quite a bit less than what KVM_SOFT_MAX_VCPUS was (160). I > can't think of anything other than generating more warnings[1] from qemu > with guests that configure more vcpus than pcpus though. > Another gotcha is that on a host with more then 160 cpus recommended value will grow which is not a good idea without appropriate testing. > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > any warnings either. > > Signed-off-by: Andrew Jones <drjones@redhat.com> > --- > arch/x86/include/asm/kvm_host.h | 1 - > arch/x86/kvm/x86.c | 2 +- > 2 files changed, 1 insertion(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > index c76ff74a98f2e..9236c63315a9b 100644 > --- a/arch/x86/include/asm/kvm_host.h > +++ b/arch/x86/include/asm/kvm_host.h > @@ -32,7 +32,6 @@ > #include <asm/asm.h> > > #define KVM_MAX_VCPUS 255 > -#define KVM_SOFT_MAX_VCPUS 160 > #define KVM_USER_MEM_SLOTS 125 > /* memory slots that are not exposed to userspace */ > #define KVM_PRIVATE_MEM_SLOTS 3 > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > break; > case KVM_CAP_NR_VCPUS: > - r = KVM_SOFT_MAX_VCPUS; > + r = min(num_online_cpus(), KVM_MAX_VCPUS); s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? > break; > case KVM_CAP_MAX_VCPUS: > r = KVM_MAX_VCPUS; > -- > 1.8.1.4 -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Sep 15, 2013 at 12:03:22PM +0300, Gleb Natapov wrote: > On Sat, Sep 14, 2013 at 02:16:51PM +0200, Andrew Jones wrote: > > This patch removes KVM_SOFT_MAX_VCPUS and uses num_online_cpus() for > > KVM_CAP_NR_VCPUS instead, as ARM does. While the API doc simply says > > KVM_CAP_NR_VCPUS should return the recommended maximum number of vcpus, > > it has been returning KVM_SOFT_MAX_VCPUS, which was defined as the > > maximum tested number of vcpus. As that concept could be > > distro-specific, this patch uses the other recommended maximum, the > > number of physical cpus, as we never recommend configuring a guest that > > has more vcpus than the host has pcpus. Of course a guest can always > > still be configured with up to KVM_CAP_MAX_VCPUS though anyway. > > > > I've put RFC on this patch because I'm not sure if there are any gotchas > > lurking with this change. The change now means hosts no longer return > > the same number for KVM_CAP_NR_VCPUS, and that number is likely going to > > generally be quite a bit less than what KVM_SOFT_MAX_VCPUS was (160). I > > can't think of anything other than generating more warnings[1] from qemu > > with guests that configure more vcpus than pcpus though. > > > Another gotcha is that on a host with more then 160 cpus recommended > value will grow which is not a good idea without appropriate testing. Good point. Of course the objective could be to test a guest with vcpus > 160 on that host, and then the potential warning messages would need to be ignored. Probably the best place to set the cap on the number of vcpus used in a stable environment would be in KVM_MAX_VCPUS. That said, then at least until KVM_SOFT_MAX_VCPUS catches up to KVM_MAX_VCPUS, I guess we should keep them both to avoid breaking anything. > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > > any warnings either. > > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > --- > > arch/x86/include/asm/kvm_host.h | 1 - > > arch/x86/kvm/x86.c | 2 +- > > 2 files changed, 1 insertion(+), 2 deletions(-) > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > index c76ff74a98f2e..9236c63315a9b 100644 > > --- a/arch/x86/include/asm/kvm_host.h > > +++ b/arch/x86/include/asm/kvm_host.h > > @@ -32,7 +32,6 @@ > > #include <asm/asm.h> > > > > #define KVM_MAX_VCPUS 255 > > -#define KVM_SOFT_MAX_VCPUS 160 > > #define KVM_USER_MEM_SLOTS 125 > > /* memory slots that are not exposed to userspace */ > > #define KVM_PRIVATE_MEM_SLOTS 3 > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > > --- a/arch/x86/kvm/x86.c > > +++ b/arch/x86/kvm/x86.c > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > > break; > > case KVM_CAP_NR_VCPUS: > > - r = KVM_SOFT_MAX_VCPUS; > > + r = min(num_online_cpus(), KVM_MAX_VCPUS); > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? I'll send a v2 with this change. I thought a bit about hotplug, and thus using num_possible_cpus() instead, but then decided it made more sense to stick to what's online now for the recommended number. It's just a recommendation anyway. So as long as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure more vcpus to count for all hotplugable cpus, if they wish. drew -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote: > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > > > any warnings either. > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > > --- > > > arch/x86/include/asm/kvm_host.h | 1 - > > > arch/x86/kvm/x86.c | 2 +- > > > 2 files changed, 1 insertion(+), 2 deletions(-) > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > index c76ff74a98f2e..9236c63315a9b 100644 > > > --- a/arch/x86/include/asm/kvm_host.h > > > +++ b/arch/x86/include/asm/kvm_host.h > > > @@ -32,7 +32,6 @@ > > > #include <asm/asm.h> > > > > > > #define KVM_MAX_VCPUS 255 > > > -#define KVM_SOFT_MAX_VCPUS 160 > > > #define KVM_USER_MEM_SLOTS 125 > > > /* memory slots that are not exposed to userspace */ > > > #define KVM_PRIVATE_MEM_SLOTS 3 > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > > > --- a/arch/x86/kvm/x86.c > > > +++ b/arch/x86/kvm/x86.c > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > > > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > > > break; > > > case KVM_CAP_NR_VCPUS: > > > - r = KVM_SOFT_MAX_VCPUS; > > > + r = min(num_online_cpus(), KVM_MAX_VCPUS); > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? > > I'll send a v2 with this change. > > I thought a bit about hotplug, and thus using num_possible_cpus() > instead, but then decided it made more sense to stick to what's online now > for the recommended number. It's just a recommendation anyway. So as long > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure > more vcpus to count for all hotplugable cpus, if they wish. > It is just recommended, but we do warn about it, so it is user visible. Well, the whole point of it existence is to be user visible ;). If user creates a guest with max cpus greater than current number if online cpus, taking into account feature grows, he will get a warning, but we should not warn about it. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote: > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote: > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > > > > any warnings either. > > > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > > > --- > > > > arch/x86/include/asm/kvm_host.h | 1 - > > > > arch/x86/kvm/x86.c | 2 +- > > > > 2 files changed, 1 insertion(+), 2 deletions(-) > > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > > index c76ff74a98f2e..9236c63315a9b 100644 > > > > --- a/arch/x86/include/asm/kvm_host.h > > > > +++ b/arch/x86/include/asm/kvm_host.h > > > > @@ -32,7 +32,6 @@ > > > > #include <asm/asm.h> > > > > > > > > #define KVM_MAX_VCPUS 255 > > > > -#define KVM_SOFT_MAX_VCPUS 160 > > > > #define KVM_USER_MEM_SLOTS 125 > > > > /* memory slots that are not exposed to userspace */ > > > > #define KVM_PRIVATE_MEM_SLOTS 3 > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > > > > --- a/arch/x86/kvm/x86.c > > > > +++ b/arch/x86/kvm/x86.c > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > > > > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > > > > break; > > > > case KVM_CAP_NR_VCPUS: > > > > - r = KVM_SOFT_MAX_VCPUS; > > > > + r = min(num_online_cpus(), KVM_MAX_VCPUS); > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? > > > > I'll send a v2 with this change. > > > > I thought a bit about hotplug, and thus using num_possible_cpus() > > instead, but then decided it made more sense to stick to what's online now > > for the recommended number. It's just a recommendation anyway. So as long > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure > > more vcpus to count for all hotplugable cpus, if they wish. > > > It is just recommended, but we do warn about it, so it is user visible. > Well, the whole point of it existence is to be user visible ;). If user > creates a guest with max cpus greater than current number if online > cpus, taking into account feature grows, he will get a warning, but we > should not warn about it. Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus indefinitely? I'd rather warn about it, which could remind them to offline 32 vcpus for the time being. Although, as we're just discussing when or when not to output a warning, then I'm not really stressed about it either way. I can certainly change this to num_possible_cpus(), if all are in agreement that that is a better recommendation. drew -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote: > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote: > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote: > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > > > > > any warnings either. > > > > > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > > > > --- > > > > > arch/x86/include/asm/kvm_host.h | 1 - > > > > > arch/x86/kvm/x86.c | 2 +- > > > > > 2 files changed, 1 insertion(+), 2 deletions(-) > > > > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > > > index c76ff74a98f2e..9236c63315a9b 100644 > > > > > --- a/arch/x86/include/asm/kvm_host.h > > > > > +++ b/arch/x86/include/asm/kvm_host.h > > > > > @@ -32,7 +32,6 @@ > > > > > #include <asm/asm.h> > > > > > > > > > > #define KVM_MAX_VCPUS 255 > > > > > -#define KVM_SOFT_MAX_VCPUS 160 > > > > > #define KVM_USER_MEM_SLOTS 125 > > > > > /* memory slots that are not exposed to userspace */ > > > > > #define KVM_PRIVATE_MEM_SLOTS 3 > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > > > > > --- a/arch/x86/kvm/x86.c > > > > > +++ b/arch/x86/kvm/x86.c > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > > > > > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > > > > > break; > > > > > case KVM_CAP_NR_VCPUS: > > > > > - r = KVM_SOFT_MAX_VCPUS; > > > > > + r = min(num_online_cpus(), KVM_MAX_VCPUS); > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? > > > > > > I'll send a v2 with this change. > > > > > > I thought a bit about hotplug, and thus using num_possible_cpus() > > > instead, but then decided it made more sense to stick to what's online now > > > for the recommended number. It's just a recommendation anyway. So as long > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure > > > more vcpus to count for all hotplugable cpus, if they wish. > > > > > It is just recommended, but we do warn about it, so it is user visible. > > Well, the whole point of it existence is to be user visible ;). If user > > creates a guest with max cpus greater than current number if online > > cpus, taking into account feature grows, he will get a warning, but we > > should not warn about it. > > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus > indefinitely? I'd rather warn about it, which could remind them to offline > 32 vcpus for the time being. But there are other means to detect number of online cpus: sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of possible cpus too with _SC_NPROCESSORS_CONF, so returning those values as KVM_CAP_NR_VCPUS does not provide any additional information. What if QEMU process is bound to two cores on 64 core host, do you want to warn if qemu is created with more then 2 vcpus in such case? You can do that too with pthread_setaffinity_np(). > Although, as we're just discussing when or > when not to output a warning, then I'm not really stressed about it either > way. I can certainly change this to num_possible_cpus(), if all are in > agreement that that is a better recommendation. > With this patch we only reduce information available to userspace. QEMU can already obtain all the information it needs to produce meaningful warning. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 16, 2013 at 05:41:18PM +0300, Gleb Natapov wrote: > On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote: > > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote: > > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote: > > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > > > > > > any warnings either. > > > > > > > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > > > > > --- > > > > > > arch/x86/include/asm/kvm_host.h | 1 - > > > > > > arch/x86/kvm/x86.c | 2 +- > > > > > > 2 files changed, 1 insertion(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > > > > index c76ff74a98f2e..9236c63315a9b 100644 > > > > > > --- a/arch/x86/include/asm/kvm_host.h > > > > > > +++ b/arch/x86/include/asm/kvm_host.h > > > > > > @@ -32,7 +32,6 @@ > > > > > > #include <asm/asm.h> > > > > > > > > > > > > #define KVM_MAX_VCPUS 255 > > > > > > -#define KVM_SOFT_MAX_VCPUS 160 > > > > > > #define KVM_USER_MEM_SLOTS 125 > > > > > > /* memory slots that are not exposed to userspace */ > > > > > > #define KVM_PRIVATE_MEM_SLOTS 3 > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > > > > > > --- a/arch/x86/kvm/x86.c > > > > > > +++ b/arch/x86/kvm/x86.c > > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > > > > > > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > > > > > > break; > > > > > > case KVM_CAP_NR_VCPUS: > > > > > > - r = KVM_SOFT_MAX_VCPUS; > > > > > > + r = min(num_online_cpus(), KVM_MAX_VCPUS); > > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? > > > > > > > > I'll send a v2 with this change. > > > > > > > > I thought a bit about hotplug, and thus using num_possible_cpus() > > > > instead, but then decided it made more sense to stick to what's online now > > > > for the recommended number. It's just a recommendation anyway. So as long > > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure > > > > more vcpus to count for all hotplugable cpus, if they wish. > > > > > > > It is just recommended, but we do warn about it, so it is user visible. > > > Well, the whole point of it existence is to be user visible ;). If user > > > creates a guest with max cpus greater than current number if online > > > cpus, taking into account feature grows, he will get a warning, but we > > > should not warn about it. > > > > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus > > indefinitely? I'd rather warn about it, which could remind them to offline > > 32 vcpus for the time being. > But there are other means to detect number of online cpus: > sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of > possible cpus too with _SC_NPROCESSORS_CONF, so returning those values > as KVM_CAP_NR_VCPUS does not provide any additional information. What > if QEMU process is bound to two cores on 64 core host, do you want to > warn if qemu is created with more then 2 vcpus in such case? You can do > that too with pthread_setaffinity_np(). > > > Although, as we're just discussing when or > > when not to output a warning, then I'm not really stressed about it either > > way. I can certainly change this to num_possible_cpus(), if all are in > > agreement that that is a better recommendation. > > > With this patch we only reduce information available to userspace. QEMU > can already obtain all the information it needs to produce meaningful > warning. All good points. We're still left with the fact that KVM_CAP_NR_VCPU currently returns a distro-specific number though, which can only be modified by changing a constant embedded in the source. So I still believe that a config option is in order, but now you're convincing me that the option should adjust KVM_SOFT_MAX_VCPUS instead. The default should also remain distro-neutral, so I vote 255. We'd then change the defines to be #define KVM_SOFT_MAX_VCPUS CONFIG_KVM_SOFT_MAX_VCPUS #define KVM_MAX_VCPUS KVM_SOFT_MAX_VCPUS distros can then configure something lower than 255 (160), and developers can configure anything they want. Neither will create a warning gap unless the developer manually changes the KVM_MAX_VCPUS define to create one. drew -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Sep 16, 2013 at 05:22:26PM +0200, Andrew Jones wrote: > On Mon, Sep 16, 2013 at 05:41:18PM +0300, Gleb Natapov wrote: > > On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote: > > > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote: > > > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote: > > > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > > > > > > > any warnings either. > > > > > > > > > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > > > > > > --- > > > > > > > arch/x86/include/asm/kvm_host.h | 1 - > > > > > > > arch/x86/kvm/x86.c | 2 +- > > > > > > > 2 files changed, 1 insertion(+), 2 deletions(-) > > > > > > > > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > > > > > index c76ff74a98f2e..9236c63315a9b 100644 > > > > > > > --- a/arch/x86/include/asm/kvm_host.h > > > > > > > +++ b/arch/x86/include/asm/kvm_host.h > > > > > > > @@ -32,7 +32,6 @@ > > > > > > > #include <asm/asm.h> > > > > > > > > > > > > > > #define KVM_MAX_VCPUS 255 > > > > > > > -#define KVM_SOFT_MAX_VCPUS 160 > > > > > > > #define KVM_USER_MEM_SLOTS 125 > > > > > > > /* memory slots that are not exposed to userspace */ > > > > > > > #define KVM_PRIVATE_MEM_SLOTS 3 > > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > > > > > > > --- a/arch/x86/kvm/x86.c > > > > > > > +++ b/arch/x86/kvm/x86.c > > > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > > > > > > > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > > > > > > > break; > > > > > > > case KVM_CAP_NR_VCPUS: > > > > > > > - r = KVM_SOFT_MAX_VCPUS; > > > > > > > + r = min(num_online_cpus(), KVM_MAX_VCPUS); > > > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? > > > > > > > > > > I'll send a v2 with this change. > > > > > > > > > > I thought a bit about hotplug, and thus using num_possible_cpus() > > > > > instead, but then decided it made more sense to stick to what's online now > > > > > for the recommended number. It's just a recommendation anyway. So as long > > > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure > > > > > more vcpus to count for all hotplugable cpus, if they wish. > > > > > > > > > It is just recommended, but we do warn about it, so it is user visible. > > > > Well, the whole point of it existence is to be user visible ;). If user > > > > creates a guest with max cpus greater than current number if online > > > > cpus, taking into account feature grows, he will get a warning, but we > > > > should not warn about it. > > > > > > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus > > > indefinitely? I'd rather warn about it, which could remind them to offline > > > 32 vcpus for the time being. > > But there are other means to detect number of online cpus: > > sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of > > possible cpus too with _SC_NPROCESSORS_CONF, so returning those values > > as KVM_CAP_NR_VCPUS does not provide any additional information. What > > if QEMU process is bound to two cores on 64 core host, do you want to > > warn if qemu is created with more then 2 vcpus in such case? You can do > > that too with pthread_setaffinity_np(). > > > > > Although, as we're just discussing when or > > > when not to output a warning, then I'm not really stressed about it either > > > way. I can certainly change this to num_possible_cpus(), if all are in > > > agreement that that is a better recommendation. > > > > > With this patch we only reduce information available to userspace. QEMU > > can already obtain all the information it needs to produce meaningful > > warning. > > All good points. We're still left with the fact that KVM_CAP_NR_VCPU > currently returns a distro-specific number though, which can only be > modified by changing a constant embedded in the source. So I still believe > that a config option is in order, but now you're convincing me that the > option should adjust KVM_SOFT_MAX_VCPUS instead. The default should also > remain distro-neutral, so I vote 255. We'd then change the defines to be > > #define KVM_SOFT_MAX_VCPUS CONFIG_KVM_SOFT_MAX_VCPUS > #define KVM_MAX_VCPUS KVM_SOFT_MAX_VCPUS > So you make KVM_MAX_VCPUS same as KVM_SOFT_MAX_VCPUS, what's the point to have both then? KVM_MAX_VCPUS is max number of cpu that KVM supports because of architectural and/or implementation reasons. Current maximum is 255 because this is what X2APIC supports without interrupt remapping and we cannot grow this number without additional coding. KVM_SOFT_MAX_VCPUS is the number we (upstream) feel single VM can safely scale to, the feeling is backed up by some amount of testing of course. It may make sense for downstream to change this value, but if they want to lower it I'd rather get bug reports from them that we are not as scalable as we claim, and if they want to make it large I want to hear about their successful testing too. > distros can then configure something lower than 255 (160), and developers > can configure anything they want. Neither will create a warning gap unless > the developer manually changes the KVM_MAX_VCPUS define to create one. > As I said in other email KVM_SOFT_MAX_VCPUS/KVM_MAX_VCPUS distinction was introduce precisely to avoid recompilation requirements. Not everyone who is interested in running VM with 255 vcpus want to recompile the kernel or have permission to do so. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 17, 2013 at 12:36:19PM +0300, Gleb Natapov wrote: > On Mon, Sep 16, 2013 at 05:22:26PM +0200, Andrew Jones wrote: > > On Mon, Sep 16, 2013 at 05:41:18PM +0300, Gleb Natapov wrote: > > > On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote: > > > > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote: > > > > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote: > > > > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > > > > > > > > any warnings either. > > > > > > > > > > > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > > > > > > > --- > > > > > > > > arch/x86/include/asm/kvm_host.h | 1 - > > > > > > > > arch/x86/kvm/x86.c | 2 +- > > > > > > > > 2 files changed, 1 insertion(+), 2 deletions(-) > > > > > > > > > > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > > > > > > index c76ff74a98f2e..9236c63315a9b 100644 > > > > > > > > --- a/arch/x86/include/asm/kvm_host.h > > > > > > > > +++ b/arch/x86/include/asm/kvm_host.h > > > > > > > > @@ -32,7 +32,6 @@ > > > > > > > > #include <asm/asm.h> > > > > > > > > > > > > > > > > #define KVM_MAX_VCPUS 255 > > > > > > > > -#define KVM_SOFT_MAX_VCPUS 160 > > > > > > > > #define KVM_USER_MEM_SLOTS 125 > > > > > > > > /* memory slots that are not exposed to userspace */ > > > > > > > > #define KVM_PRIVATE_MEM_SLOTS 3 > > > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > > > > > > > > --- a/arch/x86/kvm/x86.c > > > > > > > > +++ b/arch/x86/kvm/x86.c > > > > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > > > > > > > > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > > > > > > > > break; > > > > > > > > case KVM_CAP_NR_VCPUS: > > > > > > > > - r = KVM_SOFT_MAX_VCPUS; > > > > > > > > + r = min(num_online_cpus(), KVM_MAX_VCPUS); > > > > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? > > > > > > > > > > > > I'll send a v2 with this change. > > > > > > > > > > > > I thought a bit about hotplug, and thus using num_possible_cpus() > > > > > > instead, but then decided it made more sense to stick to what's online now > > > > > > for the recommended number. It's just a recommendation anyway. So as long > > > > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure > > > > > > more vcpus to count for all hotplugable cpus, if they wish. > > > > > > > > > > > It is just recommended, but we do warn about it, so it is user visible. > > > > > Well, the whole point of it existence is to be user visible ;). If user > > > > > creates a guest with max cpus greater than current number if online > > > > > cpus, taking into account feature grows, he will get a warning, but we > > > > > should not warn about it. > > > > > > > > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus > > > > indefinitely? I'd rather warn about it, which could remind them to offline > > > > 32 vcpus for the time being. > > > But there are other means to detect number of online cpus: > > > sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of > > > possible cpus too with _SC_NPROCESSORS_CONF, so returning those values > > > as KVM_CAP_NR_VCPUS does not provide any additional information. What > > > if QEMU process is bound to two cores on 64 core host, do you want to > > > warn if qemu is created with more then 2 vcpus in such case? You can do > > > that too with pthread_setaffinity_np(). > > > > > > > Although, as we're just discussing when or > > > > when not to output a warning, then I'm not really stressed about it either > > > > way. I can certainly change this to num_possible_cpus(), if all are in > > > > agreement that that is a better recommendation. > > > > > > > With this patch we only reduce information available to userspace. QEMU > > > can already obtain all the information it needs to produce meaningful > > > warning. > > > > All good points. We're still left with the fact that KVM_CAP_NR_VCPU > > currently returns a distro-specific number though, which can only be > > modified by changing a constant embedded in the source. So I still believe > > that a config option is in order, but now you're convincing me that the > > option should adjust KVM_SOFT_MAX_VCPUS instead. The default should also > > remain distro-neutral, so I vote 255. We'd then change the defines to be > > > > #define KVM_SOFT_MAX_VCPUS CONFIG_KVM_SOFT_MAX_VCPUS > > #define KVM_MAX_VCPUS KVM_SOFT_MAX_VCPUS > > > So you make KVM_MAX_VCPUS same as KVM_SOFT_MAX_VCPUS, what's the point > to have both then? KVM_MAX_VCPUS is max number of cpu that KVM supports I actually didn't believe there was a good point, until... > because of architectural and/or implementation reasons. Current maximum > is 255 because this is what X2APIC supports without interrupt remapping > and we cannot grow this number without additional coding. > KVM_SOFT_MAX_VCPUS is the number we (upstream) feel single VM can ...learning this. I didn't know that KVM_SOFT_MAX_VCPUS was an upstream agreement. I thought that number came out of distro-specific testing (indeed that's what the commit message says), and thus I wanted to move it into distro-configurable territory. I also didn't know that the 255 limit serves to document the maximum the x2apic supports. Should we add a comment for that define stating that? Anyway, now that'd you've clarified all this for me, please disregard both this and the other patch. drew -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 17, 2013 at 12:03:09PM +0200, Andrew Jones wrote: > On Tue, Sep 17, 2013 at 12:36:19PM +0300, Gleb Natapov wrote: > > On Mon, Sep 16, 2013 at 05:22:26PM +0200, Andrew Jones wrote: > > > On Mon, Sep 16, 2013 at 05:41:18PM +0300, Gleb Natapov wrote: > > > > On Mon, Sep 16, 2013 at 01:47:26PM +0200, Andrew Jones wrote: > > > > > On Mon, Sep 16, 2013 at 11:55:17AM +0300, Gleb Natapov wrote: > > > > > > On Mon, Sep 16, 2013 at 10:22:09AM +0200, Andrew Jones wrote: > > > > > > > > > [1] Actually, until 972fc544b6034a in uq/master is merged there won't be > > > > > > > > > any warnings either. > > > > > > > > > > > > > > > > > > Signed-off-by: Andrew Jones <drjones@redhat.com> > > > > > > > > > --- > > > > > > > > > arch/x86/include/asm/kvm_host.h | 1 - > > > > > > > > > arch/x86/kvm/x86.c | 2 +- > > > > > > > > > 2 files changed, 1 insertion(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h > > > > > > > > > index c76ff74a98f2e..9236c63315a9b 100644 > > > > > > > > > --- a/arch/x86/include/asm/kvm_host.h > > > > > > > > > +++ b/arch/x86/include/asm/kvm_host.h > > > > > > > > > @@ -32,7 +32,6 @@ > > > > > > > > > #include <asm/asm.h> > > > > > > > > > > > > > > > > > > #define KVM_MAX_VCPUS 255 > > > > > > > > > -#define KVM_SOFT_MAX_VCPUS 160 > > > > > > > > > #define KVM_USER_MEM_SLOTS 125 > > > > > > > > > /* memory slots that are not exposed to userspace */ > > > > > > > > > #define KVM_PRIVATE_MEM_SLOTS 3 > > > > > > > > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > > > > > > > > > index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 > > > > > > > > > --- a/arch/x86/kvm/x86.c > > > > > > > > > +++ b/arch/x86/kvm/x86.c > > > > > > > > > @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) > > > > > > > > > r = !kvm_x86_ops->cpu_has_accelerated_tpr(); > > > > > > > > > break; > > > > > > > > > case KVM_CAP_NR_VCPUS: > > > > > > > > > - r = KVM_SOFT_MAX_VCPUS; > > > > > > > > > + r = min(num_online_cpus(), KVM_MAX_VCPUS); > > > > > > > > s/KVM_MAX_VCPUS/KVM_SOFT_MAX_VCPUS/. Also what about hotplug cpus? > > > > > > > > > > > > > > I'll send a v2 with this change. > > > > > > > > > > > > > > I thought a bit about hotplug, and thus using num_possible_cpus() > > > > > > > instead, but then decided it made more sense to stick to what's online now > > > > > > > for the recommended number. It's just a recommendation anyway. So as long > > > > > > > as KVM_MAX_VCPUS is >= num_possible_cpus(), then one can still configure > > > > > > > more vcpus to count for all hotplugable cpus, if they wish. > > > > > > > > > > > > > It is just recommended, but we do warn about it, so it is user visible. > > > > > > Well, the whole point of it existence is to be user visible ;). If user > > > > > > creates a guest with max cpus greater than current number if online > > > > > > cpus, taking into account feature grows, he will get a warning, but we > > > > > > should not warn about it. > > > > > > > > > > Even it if means the user may end up running, e.g. 128 vcpus on 96 pcpus > > > > > indefinitely? I'd rather warn about it, which could remind them to offline > > > > > 32 vcpus for the time being. > > > > But there are other means to detect number of online cpus: > > > > sysconf(_SC_NPROCESSORS_ONLN). Actually you can determine number of > > > > possible cpus too with _SC_NPROCESSORS_CONF, so returning those values > > > > as KVM_CAP_NR_VCPUS does not provide any additional information. What > > > > if QEMU process is bound to two cores on 64 core host, do you want to > > > > warn if qemu is created with more then 2 vcpus in such case? You can do > > > > that too with pthread_setaffinity_np(). > > > > > > > > > Although, as we're just discussing when or > > > > > when not to output a warning, then I'm not really stressed about it either > > > > > way. I can certainly change this to num_possible_cpus(), if all are in > > > > > agreement that that is a better recommendation. > > > > > > > > > With this patch we only reduce information available to userspace. QEMU > > > > can already obtain all the information it needs to produce meaningful > > > > warning. > > > > > > All good points. We're still left with the fact that KVM_CAP_NR_VCPU > > > currently returns a distro-specific number though, which can only be > > > modified by changing a constant embedded in the source. So I still believe > > > that a config option is in order, but now you're convincing me that the > > > option should adjust KVM_SOFT_MAX_VCPUS instead. The default should also > > > remain distro-neutral, so I vote 255. We'd then change the defines to be > > > > > > #define KVM_SOFT_MAX_VCPUS CONFIG_KVM_SOFT_MAX_VCPUS > > > #define KVM_MAX_VCPUS KVM_SOFT_MAX_VCPUS > > > > > So you make KVM_MAX_VCPUS same as KVM_SOFT_MAX_VCPUS, what's the point > > to have both then? KVM_MAX_VCPUS is max number of cpu that KVM supports > > I actually didn't believe there was a good point, until... > > > because of architectural and/or implementation reasons. Current maximum > > is 255 because this is what X2APIC supports without interrupt remapping > > and we cannot grow this number without additional coding. > > KVM_SOFT_MAX_VCPUS is the number we (upstream) feel single VM can > > ...learning this. I didn't know that KVM_SOFT_MAX_VCPUS was an upstream > agreement. I thought that number came out of distro-specific testing (indeed > that's what the commit message says), and thus I wanted to move it into The testing was done with dist kernel, yes, but it was actually slightly behind upstream and upstream should be better, not worse then dist kernel. If upstream perform works than older dist kernel we have a regression that should be investigated. > distro-configurable territory. I also didn't know that the 255 limit > serves to document the maximum the x2apic supports. Should we add a > comment for that define stating that? Yes, the comment would be useful. -- Gleb. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c76ff74a98f2e..9236c63315a9b 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -32,7 +32,6 @@ #include <asm/asm.h> #define KVM_MAX_VCPUS 255 -#define KVM_SOFT_MAX_VCPUS 160 #define KVM_USER_MEM_SLOTS 125 /* memory slots that are not exposed to userspace */ #define KVM_PRIVATE_MEM_SLOTS 3 diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index e5ca72a5cdb6d..d9d3e2ed68ee9 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -2604,7 +2604,7 @@ int kvm_dev_ioctl_check_extension(long ext) r = !kvm_x86_ops->cpu_has_accelerated_tpr(); break; case KVM_CAP_NR_VCPUS: - r = KVM_SOFT_MAX_VCPUS; + r = min(num_online_cpus(), KVM_MAX_VCPUS); break; case KVM_CAP_MAX_VCPUS: r = KVM_MAX_VCPUS;
This patch removes KVM_SOFT_MAX_VCPUS and uses num_online_cpus() for KVM_CAP_NR_VCPUS instead, as ARM does. While the API doc simply says KVM_CAP_NR_VCPUS should return the recommended maximum number of vcpus, it has been returning KVM_SOFT_MAX_VCPUS, which was defined as the maximum tested number of vcpus. As that concept could be distro-specific, this patch uses the other recommended maximum, the number of physical cpus, as we never recommend configuring a guest that has more vcpus than the host has pcpus. Of course a guest can always still be configured with up to KVM_CAP_MAX_VCPUS though anyway. I've put RFC on this patch because I'm not sure if there are any gotchas lurking with this change. The change now means hosts no longer return the same number for KVM_CAP_NR_VCPUS, and that number is likely going to generally be quite a bit less than what KVM_SOFT_MAX_VCPUS was (160). I can't think of anything other than generating more warnings[1] from qemu with guests that configure more vcpus than pcpus though. [1] Actually, until 972fc544b6034a in uq/master is merged there won't be any warnings either. Signed-off-by: Andrew Jones <drjones@redhat.com> --- arch/x86/include/asm/kvm_host.h | 1 - arch/x86/kvm/x86.c | 2 +- 2 files changed, 1 insertion(+), 2 deletions(-)