Message ID | 1416931449-24585-2-git-send-email-dahi@linux.vnet.ibm.com (mailing list archive)
---|---
State | New, archived
On 25.11.2014 17:04, David Hildenbrand wrote:
> As some architectures (e.g. s390) can't disable preemption while
> entering/leaving the guest, they won't receive the yield in all
> situations.
>
> kvm_enter_guest() has to be called with preemption disabled and will
> set PF_VCPU. After that point e.g. s390 reenables preemption and starts
> to execute the guest. The thread might therefore be scheduled out
> between kvm_enter_guest() and kvm_exit_guest(), resulting in PF_VCPU
> being set but not being run.
>
> Please note that preemption has to stay enabled in order to correctly
> process page faults on s390.
>
> Current code takes PF_VCPU as a hint that the VCPU thread is running
> and therefore needs no yield. yield_to() checks whether the target
> thread is running, so let's use the inbuilt functionality to make it
> independent of PF_VCPU and preemption.

This change is a trade-off.

PRO: This patch would improve the case of preemption on s390. This is
probably a corner case, as most distros have preemption off anyway.

CON: The downside is that kvm_vcpu_yield_to() is also called from
kvm_vcpu_on_spin(). Here we want to avoid the scheduler overhead for a
wrong decision.

So I think this patch is probably not what we want in most cases.

> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
> ---
>  virt/kvm/kvm_main.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 5b45330..184f52e 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1782,10 +1782,6 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target)
>  	rcu_read_unlock();
>  	if (!task)
>  		return ret;
> -	if (task->flags & PF_VCPU) {
> -		put_task_struct(task);
> -		return ret;
> -	}
>  	ret = yield_to(task, 1);
>  	put_task_struct(task);

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
> This change is a trade-off.
> PRO: This patch would improve the case of preemption on s390. This is
> probably a corner case, as most distros have preemption off anyway.
> CON: The downside is that kvm_vcpu_yield_to() is also called from
> kvm_vcpu_on_spin(). Here we want to avoid the scheduler overhead for a
> wrong decision.

Won't most of that part be covered by:

	if (!ACCESS_ONCE(vcpu->preempted))

vcpu->preempted is only set when scheduled out involuntarily. It is
cleared when scheduled in. s390 sets it manually, to speed up waking up
a vcpu.

So when our task is scheduled in (and PF_VCPU is set), this check will
already avoid scheduler overhead in kvm_vcpu_on_spin(), or am I missing
something?
On 26.11.2014 10:23, David Hildenbrand wrote:
>> This change is a trade-off.
>> PRO: This patch would improve the case of preemption on s390. This is
>> probably a corner case, as most distros have preemption off anyway.
>> CON: The downside is that kvm_vcpu_yield_to() is also called from
>> kvm_vcpu_on_spin(). Here we want to avoid the scheduler overhead for a
>> wrong decision.
>
> Won't most of that part be covered by:
>
> 	if (!ACCESS_ONCE(vcpu->preempted))

Hmm, right. Checking vcpu->preempted and PF_VCPU might boil down to the
same. Would be good to have a performance regression test, though.

> vcpu->preempted is only set when scheduled out involuntarily. It is
> cleared when scheduled in. s390 sets it manually, to speed up waking up
> a vcpu.
>
> So when our task is scheduled in (and PF_VCPU is set), this check will
> already avoid scheduler overhead in kvm_vcpu_on_spin(), or am I missing
> something?

CC Raghavendra KT. Could you rerun your kernbench/sysbench/ebizzy setup
on x86 to see if the patch in this thread causes any regression? I think
your commit 7bc7ae25b143 ("kvm: Iterate over only vcpus that are
preempted") might have really made the PF_VCPU check unnecessary.

CC Michael Mueller, do we still have our yield performance setup handy
to check if this patch causes any regression?

Christian
Was able to test the patch, here is the result: I have not tested with
bigger VMs though. Results make it difficult to talk about any side
effect of the patch, if any.

System: 16 cores, 32 CPUs (+HT), Sandy Bridge,
with 4 guests of 16 vCPUs each

+-----------+-----------+-----------+------------+-----------+
            kernbench (time taken, lower is better)
+-----------+-----------+-----------+------------+-----------+
        base      %stdev    patched      %stdev  %improvement
+-----------+-----------+-----------+------------+-----------+
1x      53.1421    2.3086    54.6671     2.9673    -2.86966
2x      89.6858    6.4540    94.0626     6.8317    -4.88015
+-----------+-----------+-----------+------------+-----------+

+-----------+-----------+-----------+------------+-----------+
            ebizzy (records/sec, higher is better)
+-----------+-----------+-----------+------------+-----------+
        base      %stdev    patched      %stdev  %improvement
+-----------+-----------+-----------+------------+-----------+
1x   14523.2500    8.4388 14928.8750     3.0478     2.79294
2x    3338.8750    1.4592  3270.8750     2.3980    -2.03661
+-----------+-----------+-----------+------------+-----------+

+-----------+-----------+-----------+------------+-----------+
            dbench (throughput, higher is better)
+-----------+-----------+-----------+------------+-----------+
        base      %stdev    patched      %stdev  %improvement
+-----------+-----------+-----------+------------+-----------+
1x    6386.4737    1.0487  6703.9113     1.2298     4.97047
2x    2571.4712    1.3733  2571.8175     1.6919     0.01347
+-----------+-----------+-----------+------------+-----------+

Raghu

On Wed, Nov 26, 2014 at 3:01 PM, Christian Borntraeger
<borntraeger@de.ibm.com> wrote:
> On 26.11.2014 10:23, David Hildenbrand wrote:
>>> This change is a trade-off.
>>> PRO: This patch would improve the case of preemption on s390. This is
>>> probably a corner case, as most distros have preemption off anyway.
>>> CON: The downside is that kvm_vcpu_yield_to() is also called from
>>> kvm_vcpu_on_spin(). Here we want to avoid the scheduler overhead for
>>> a wrong decision.
>>
>> Won't most of that part be covered by:
>>
>> 	if (!ACCESS_ONCE(vcpu->preempted))
>
> Hmm, right. Checking vcpu->preempted and PF_VCPU might boil down to the
> same. Would be good to have a performance regression test, though.
>
>> vcpu->preempted is only set when scheduled out involuntarily. It is
>> cleared when scheduled in. s390 sets it manually, to speed up waking
>> up a vcpu.
>>
>> So when our task is scheduled in (and PF_VCPU is set), this check will
>> already avoid scheduler overhead in kvm_vcpu_on_spin(), or am I
>> missing something?
>
> CC Raghavendra KT. Could you rerun your kernbench/sysbench/ebizzy setup
> on x86 to see if the patch in this thread causes any regression? I
> think your commit 7bc7ae25b143 ("kvm: Iterate over only vcpus that are
> preempted") might have really made the PF_VCPU check unnecessary.
>
> CC Michael Mueller, do we still have our yield performance setup handy
> to check if this patch causes any regression?
>
> Christian
On 28.11.2014 11:08, Raghavendra KT wrote:
> Was able to test the patch, here is the result: I have not tested with
> bigger VMs though. Results make it difficult to talk about any side
> effect of the patch, if any.

Thanks a lot.

If our assumption is correct, then this patch should have no side effect
on x86. Do you have a confidence guess whether the numbers below mean
no change vs. regression vs. improvement?

Christian

> System: 16 cores, 32 CPUs (+HT), Sandy Bridge,
> with 4 guests of 16 vCPUs each
>
> +-----------+-----------+-----------+------------+-----------+
>             kernbench (time taken, lower is better)
> +-----------+-----------+-----------+------------+-----------+
>         base      %stdev    patched      %stdev  %improvement
> +-----------+-----------+-----------+------------+-----------+
> 1x      53.1421    2.3086    54.6671     2.9673    -2.86966
> 2x      89.6858    6.4540    94.0626     6.8317    -4.88015
> +-----------+-----------+-----------+------------+-----------+
>
> +-----------+-----------+-----------+------------+-----------+
>             ebizzy (records/sec, higher is better)
> +-----------+-----------+-----------+------------+-----------+
>         base      %stdev    patched      %stdev  %improvement
> +-----------+-----------+-----------+------------+-----------+
> 1x   14523.2500    8.4388 14928.8750     3.0478     2.79294
> 2x    3338.8750    1.4592  3270.8750     2.3980    -2.03661
> +-----------+-----------+-----------+------------+-----------+
>
> +-----------+-----------+-----------+------------+-----------+
>             dbench (throughput, higher is better)
> +-----------+-----------+-----------+------------+-----------+
>         base      %stdev    patched      %stdev  %improvement
> +-----------+-----------+-----------+------------+-----------+
> 1x    6386.4737    1.0487  6703.9113     1.2298     4.97047
> 2x    2571.4712    1.3733  2571.8175     1.6919     0.01347
> +-----------+-----------+-----------+------------+-----------+
>
> Raghu
On 11/28/2014 04:28 PM, Christian Borntraeger wrote:
> On 28.11.2014 11:08, Raghavendra KT wrote:
>> Was able to test the patch, here is the result: I have not tested with
>> bigger VMs though. Results make it difficult to talk about any side
>> effect of the patch, if any.
>
> Thanks a lot.
>
> If our assumption is correct, then this patch should have no side
> effect on x86. Do you have a confidence guess whether the numbers below
> mean no change vs. regression vs. improvement?

I am seeing a very small improvement in <= 1x overcommit cases and, for
>1x overcommit, a very slight regression. But considering the test
environment noise, I do not see much effect from the patch.

But I admit, I have not explored deeply:
1. the assumption that preempted approximately equals the PF_VCPU case
   logic,
2. whether it helps for any future usages of yield_to beyond the current
   sole usage for virtualization.
> On 11/28/2014 04:28 PM, Christian Borntraeger wrote:
>> On 28.11.2014 11:08, Raghavendra KT wrote:
>>> Was able to test the patch, here is the result: I have not tested
>>> with bigger VMs though. Results make it difficult to talk about any
>>> side effect of the patch, if any.
>>
>> Thanks a lot.
>>
>> If our assumption is correct, then this patch should have no side
>> effect on x86. Do you have a confidence guess whether the numbers
>> below mean no change vs. regression vs. improvement?
>
> I am seeing a very small improvement in <= 1x overcommit cases and, for
> >1x overcommit, a very slight regression. But considering the test
> environment noise, I do not see much effect from the patch.
>
> But I admit, I have not explored deeply:
> 1. the assumption that preempted approximately equals the PF_VCPU case
>    logic,

PF_VCPU is only a hint whether the target vcpu is executing the guest.
If preemption is off, or on architectures other than s390, PF_VCPU means
that the target vcpu is running and can't be preempted. With preemption
on and on s390, however, this statement is false. Therefore this check
is not always right.

> 2. whether it helps for any future usages of yield_to beyond the
>    current sole usage for virtualization.

Thanks for your test!
On 28/11/2014 12:40, Raghavendra K T wrote:
> I am seeing a very small improvement in <= 1x overcommit cases and, for
> >1x overcommit, a very slight regression. But considering the test
> environment noise, I do not see much effect from the patch.

I think these results are the only ones that could be statistically
significant:

                   base    %stdev    patched    %stdev  %improvement
kernbench 1x    53.1421    2.3086    54.6671    2.9673      -2.86966
dbench 1x     6386.4737    1.0487  6703.9113    1.2298       4.97047

and, of course :), one of them says things get worse and the other says
things get better.

Paolo

> But I admit, I have not explored deeply:
> 1. the assumption that preempted approximately equals the PF_VCPU case
>    logic,
> 2. whether it helps for any future usages of yield_to beyond the
>    current sole usage for virtualization.
On 25/11/2014 17:04, David Hildenbrand wrote:
> As some architectures (e.g. s390) can't disable preemption while
> entering/leaving the guest, they won't receive the yield in all
> situations.
>
> kvm_enter_guest() has to be called with preemption disabled and will
> set PF_VCPU. After that point e.g. s390 reenables preemption and starts
> to execute the guest. The thread might therefore be scheduled out
> between kvm_enter_guest() and kvm_exit_guest(), resulting in PF_VCPU
> being set but not being run.
>
> Please note that preemption has to stay enabled in order to correctly
> process page faults on s390.
>
> Current code takes PF_VCPU as a hint that the VCPU thread is running
> and therefore needs no yield. yield_to() checks whether the target
> thread is running, so let's use the inbuilt functionality to make it
> independent of PF_VCPU and preemption.
>
> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
> ---
>  virt/kvm/kvm_main.c | 4 ----
>  1 file changed, 4 deletions(-)
>
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 5b45330..184f52e 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -1782,10 +1782,6 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target)
>  	rcu_read_unlock();
>  	if (!task)
>  		return ret;
> -	if (task->flags & PF_VCPU) {
> -		put_task_struct(task);
> -		return ret;
> -	}
>  	ret = yield_to(task, 1);
>  	put_task_struct(task);

Applied with a rewritten commit message:

KVM: don't check for PF_VCPU when yielding

kvm_enter_guest() has to be called with preemption disabled and will
set PF_VCPU. Current code takes PF_VCPU as a hint that the VCPU thread
is running and therefore needs no yield.

However, the check on PF_VCPU is wrong on s390, where preemption has to
stay enabled in order to correctly process page faults. Thus, s390
reenables preemption and starts to execute the guest. The thread might
be scheduled out between kvm_enter_guest() and kvm_exit_guest(),
resulting in PF_VCPU being set but not being run. When this happens,
the opportunity for directed yield is missed.

However, this check is done already in kvm_vcpu_on_spin() before
calling kvm_vcpu_yield_to():

    if (!ACCESS_ONCE(vcpu->preempted))
            continue;

so the check on PF_VCPU is superfluous in general, and this patch
removes it.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> Applied with a rewritten commit message:
>
> KVM: don't check for PF_VCPU when yielding
>
> kvm_enter_guest() has to be called with preemption disabled and will
> set PF_VCPU. Current code takes PF_VCPU as a hint that the VCPU thread
> is running and therefore needs no yield.
>
> However, the check on PF_VCPU is wrong on s390, where preemption has to
> stay enabled in order to correctly process page faults. Thus, s390
> reenables preemption and starts to execute the guest. The thread might
> be scheduled out between kvm_enter_guest() and kvm_exit_guest(),
> resulting in PF_VCPU being set but not being run. When this happens,
> the opportunity for directed yield is missed.
>
> However, this check is done already in kvm_vcpu_on_spin() before
> calling kvm_vcpu_yield_to():
>
>     if (!ACCESS_ONCE(vcpu->preempted))
>             continue;
>
> so the check on PF_VCPU is superfluous in general, and this patch
> removes it.
>
> Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Perfect, thanks!

David
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 5b45330..184f52e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1782,10 +1782,6 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target)
 	rcu_read_unlock();
 	if (!task)
 		return ret;
-	if (task->flags & PF_VCPU) {
-		put_task_struct(task);
-		return ret;
-	}
 	ret = yield_to(task, 1);
 	put_task_struct(task);
As some architectures (e.g. s390) can't disable preemption while
entering/leaving the guest, they won't receive the yield in all
situations.

kvm_enter_guest() has to be called with preemption disabled and will set
PF_VCPU. After that point e.g. s390 reenables preemption and starts to
execute the guest. The thread might therefore be scheduled out between
kvm_enter_guest() and kvm_exit_guest(), resulting in PF_VCPU being set
but not being run.

Please note that preemption has to stay enabled in order to correctly
process page faults on s390.

Current code takes PF_VCPU as a hint that the VCPU thread is running and
therefore needs no yield. yield_to() checks whether the target thread is
running, so let's use the inbuilt functionality to make it independent
of PF_VCPU and preemption.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
---
 virt/kvm/kvm_main.c | 4 ----
 1 file changed, 4 deletions(-)