Message ID | BLU436-SMTP2016CD0BF5AFCCDAD29C2C780570@phx.gbl (mailing list archive)
---|---
State | New, archived
Hi Paolo,

On 9/3/15 10:07 PM, Wanpeng Li wrote:
> [...]
>  static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
>  {
>      if (kvm_arch_vcpu_runnable(vcpu)) {
> @@ -1928,7 +1962,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>  {
>      ktime_t start, cur;
>      DEFINE_WAIT(wait);
> -    bool waited = false;
> +    bool polled = false, waited = false;
> +    u64 poll_ns = 0, wait_ns = 0, block_ns = 0;
>
>      start = cur = ktime_get();
>      if (vcpu->halt_poll_ns) {
> @@ -1940,11 +1975,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>               * arrives.
>               */
>              if (kvm_vcpu_check_block(vcpu) < 0) {
> +                polled = true;
>                  ++vcpu->stat.halt_successful_poll;
> -                goto out;
> +                break;
>              }
>              cur = ktime_get();
>          } while (single_task_running() && ktime_before(cur, stop));
> +
> +        poll_ns = ktime_to_ns(cur) - ktime_to_ns(start);
> +        if (polled)
> +            goto out;
>

Please move the poll_ns calculation under the if() when you apply this, as I
explained in my reply to v6.

Regards,
Wanpeng Li
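Read literally, the request is to move the poll_ns assignment inside the if (polled) test, so that poll_ns is only charged when polling actually succeeded. The fragment below is an illustrative sketch of that reading only; it is not the change that was ultimately applied (Paolo's follow-up below supersedes it):

```c
		} while (single_task_running() && ktime_before(cur, stop));

		/*
		 * Sketch only, not the applied patch: account poll_ns solely
		 * when polling succeeded; an unsuccessful poll window is then
		 * covered by the later wait-time measurement, which also
		 * starts from the original "start" timestamp.
		 */
		if (polled) {
			poll_ns = ktime_to_ns(cur) - ktime_to_ns(start);
			goto out;
		}
```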
On 05/09/2015 00:38, Wanpeng Li wrote:
>>
>> @@ -1940,11 +1975,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>               * arrives.
>>               */
>>              if (kvm_vcpu_check_block(vcpu) < 0) {
>> +                polled = true;
>>                  ++vcpu->stat.halt_successful_poll;
>> -                goto out;
>> +                break;
>>              }
>>              cur = ktime_get();
>>          } while (single_task_running() && ktime_before(cur, stop));
>> +
>> +        poll_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>> +        if (polled)
>> +            goto out;
>>
>
> Please move the poll_ns calculation under the if() when you apply this, as I
> explained in my reply to v6.

You can do much more than just that; the patch reduces to this:

@@ -1929,6 +1963,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
     ktime_t start, cur;
     DEFINE_WAIT(wait);
     bool waited = false;
+    u64 block_ns;

     start = cur = ktime_get();
     if (vcpu->halt_poll_ns) {
@@ -1961,7 +1996,21 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
     cur = ktime_get();

 out:
-    trace_kvm_vcpu_wakeup(ktime_to_ns(cur) - ktime_to_ns(start), waited);
+    block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
+
+    if (halt_poll_ns) {
+        if (block_ns <= vcpu->halt_poll_ns)
+            ;
+        /* we had a long block, shrink polling */
+        else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns)
+            shrink_halt_poll_ns(vcpu);
+        /* we had a short halt and our poll time is too small */
+        else if (vcpu->halt_poll_ns < halt_poll_ns &&
+            block_ns < halt_poll_ns)
+            grow_halt_poll_ns(vcpu);
+    }
+
+    trace_kvm_vcpu_wakeup(block_ns, waited);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_block);

Paolo
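To make the behaviour of this consolidated logic concrete, here is a small userspace simulation of it. It is a sketch, not kernel code: the module-parameter defaults (500us ceiling, grow factor 2, shrink factor 0) and the grow/shrink arithmetic, including the 10us growth base, are copied from the full patch lower on this page, the adjust() function mirrors the decision chain at the out: label above, and the halt durations fed to it in main() are made-up sample values.

```c
#include <stdio.h>

/* Defaults copied from the patch below (values in nanoseconds). */
static unsigned int halt_poll_ns = 500000;   /* module default: 500us ceiling */
static unsigned int halt_poll_ns_grow = 2;   /* multiply window on short halts */
static unsigned int halt_poll_ns_shrink;     /* 0: reset window on long halts */

struct vcpu {
	unsigned int halt_poll_ns;           /* per-vCPU poll window, ns */
};

/* Same arithmetic as grow_halt_poll_ns() in the patch: 10us base, then multiply. */
static void grow_halt_poll(struct vcpu *vcpu)
{
	unsigned int val = vcpu->halt_poll_ns;

	if (val == 0 && halt_poll_ns_grow)
		val = 10000;
	else
		val *= halt_poll_ns_grow;

	vcpu->halt_poll_ns = val;
}

/* Same arithmetic as shrink_halt_poll_ns(): a shrink factor of 0 resets the window. */
static void shrink_halt_poll(struct vcpu *vcpu)
{
	unsigned int val = vcpu->halt_poll_ns;

	if (halt_poll_ns_shrink == 0)
		val = 0;
	else
		val /= halt_poll_ns_shrink;

	vcpu->halt_poll_ns = val;
}

/* Mirrors the decision chain at the out: label in the hunk above. */
static void adjust(struct vcpu *vcpu, unsigned long long block_ns)
{
	if (!halt_poll_ns)
		return;

	if (block_ns <= vcpu->halt_poll_ns)
		;	/* current poll window already covers this halt */
	else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns)
		shrink_halt_poll(vcpu);	/* long block: polling was wasted */
	else if (vcpu->halt_poll_ns < halt_poll_ns && block_ns < halt_poll_ns)
		grow_halt_poll(vcpu);	/* short halt, window too small */
}

int main(void)
{
	struct vcpu vcpu = { 0 };
	/* Made-up halt durations in ns: a run of short halts, then one long one. */
	unsigned long long halts[] = { 20000, 20000, 20000, 20000, 2000000 };

	for (unsigned int i = 0; i < sizeof(halts) / sizeof(halts[0]); i++) {
		adjust(&vcpu, halts[i]);
		printf("halt of %7lluns -> halt_poll_ns = %u\n",
		       halts[i], vcpu.halt_poll_ns);
	}
	return 0;
}
```

With these sample inputs the window ramps 0 -> 10000 -> 20000 on the first two short halts, then holds at 20000 while halts fit inside it, and resets to 0 after the 2ms block, which is the behaviour the short-halt/long-halt wording in the cover letter describes.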
On 9/6/15 10:32 PM, Paolo Bonzini wrote:
>
> On 05/09/2015 00:38, Wanpeng Li wrote:
>>> @@ -1940,11 +1975,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
>>>               * arrives.
>>>               */
>>>              if (kvm_vcpu_check_block(vcpu) < 0) {
>>> +                polled = true;
>>>                  ++vcpu->stat.halt_successful_poll;
>>> -                goto out;
>>> +                break;
>>>              }
>>>              cur = ktime_get();
>>>          } while (single_task_running() && ktime_before(cur, stop));
>>> +
>>> +        poll_ns = ktime_to_ns(cur) - ktime_to_ns(start);
>>> +        if (polled)
>>> +            goto out;
>>>
>> Please move the poll_ns calculation under the if() when you apply this, as I
>> explained in my reply to v6.
> You can do much more than just that; the patch reduces to this:
>
Cool, many thanks for your help, Paolo! :)

Regards,
Wanpeng Li
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index c06e57c..d5e07e9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -66,9 +66,18 @@ MODULE_AUTHOR("Qumranet");
 MODULE_LICENSE("GPL");

-static unsigned int halt_poll_ns;
+/* halt polling only reduces halt latency by 5-7 us, 500us is enough */
+static unsigned int halt_poll_ns = 500000;
 module_param(halt_poll_ns, uint, S_IRUGO | S_IWUSR);

+/* Default doubles per-vcpu halt_poll_ns. */
+static unsigned int halt_poll_ns_grow = 2;
+module_param(halt_poll_ns_grow, int, S_IRUGO);
+
+/* Default resets per-vcpu halt_poll_ns. */
+static unsigned int halt_poll_ns_shrink;
+module_param(halt_poll_ns_shrink, int, S_IRUGO);
+
 /*
  * Ordering of locks:
  *
@@ -1907,6 +1916,31 @@ void kvm_vcpu_mark_page_dirty(struct kvm_vcpu *vcpu, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_mark_page_dirty);

+static void grow_halt_poll_ns(struct kvm_vcpu *vcpu)
+{
+    int val = vcpu->halt_poll_ns;
+
+    /* 10us base */
+    if (val == 0 && halt_poll_ns_grow)
+        val = 10000;
+    else
+        val *= halt_poll_ns_grow;
+
+    vcpu->halt_poll_ns = val;
+}
+
+static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)
+{
+    int val = vcpu->halt_poll_ns;
+
+    if (halt_poll_ns_shrink == 0)
+        val = 0;
+    else
+        val /= halt_poll_ns_shrink;
+
+    vcpu->halt_poll_ns = val;
+}
+
 static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
 {
     if (kvm_arch_vcpu_runnable(vcpu)) {
@@ -1928,7 +1962,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 {
     ktime_t start, cur;
     DEFINE_WAIT(wait);
-    bool waited = false;
+    bool polled = false, waited = false;
+    u64 poll_ns = 0, wait_ns = 0, block_ns = 0;

     start = cur = ktime_get();
     if (vcpu->halt_poll_ns) {
@@ -1940,11 +1975,16 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
              * arrives.
              */
             if (kvm_vcpu_check_block(vcpu) < 0) {
+                polled = true;
                 ++vcpu->stat.halt_successful_poll;
-                goto out;
+                break;
             }
             cur = ktime_get();
         } while (single_task_running() && ktime_before(cur, stop));
+
+        poll_ns = ktime_to_ns(cur) - ktime_to_ns(start);
+        if (polled)
+            goto out;
     }

     for (;;) {
@@ -1959,9 +1999,24 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
     finish_wait(&vcpu->wq, &wait);
     cur = ktime_get();
+    wait_ns = ktime_to_ns(cur) - ktime_to_ns(start);

 out:
-    trace_kvm_vcpu_wakeup(ktime_to_ns(cur) - ktime_to_ns(start), waited);
+    block_ns = poll_ns + wait_ns;
+
+    if (halt_poll_ns) {
+        if (block_ns <= vcpu->halt_poll_ns)
+            ;
+        /* we had a long block, shrink polling */
+        else if (vcpu->halt_poll_ns && block_ns > halt_poll_ns)
+            shrink_halt_poll_ns(vcpu);
+        /* we had a short halt and our poll time is too small */
+        else if (vcpu->halt_poll_ns < halt_poll_ns &&
+            block_ns < halt_poll_ns)
+            grow_halt_poll_ns(vcpu);
+    }
+
+    trace_kvm_vcpu_wakeup(block_ns, waited);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_block);
There is a downside to always-poll: polling still happens for idle vCPUs, which wastes CPU time. This patchset adds the ability to adjust halt_poll_ns dynamically, growing halt_poll_ns when a short halt is detected and shrinking it when a long halt is detected. Two new kernel parameters control the adjustment: halt_poll_ns_grow and halt_poll_ns_shrink.

                        no-poll      always-poll    dynamic-poll
-----------------------------------------------------------------------
Idle (nohz) vCPU %c0    0.15%        0.3%           0.2%
Idle (250HZ) vCPU %c0   1.1%         4.6%~14%       1.2%
TCP_RR latency          34us         27us           26.7us

"Idle (X) vCPU %c0" is the percentage of time the physical CPU spent in c0 over 60 seconds (each vCPU is pinned to a pCPU). (nohz) means the guest was tickless; (250HZ) means the guest was ticking at 250HZ.

The big win is with ticking operating systems. Running the Linux guest with nohz=off (and HZ=250), dynamic polling saves 3.4%~12.8% of a CPU (the gap between always-poll's 4.6%~14% and dynamic-poll's 1.2% in the 250HZ row) and gets close to the overhead level of no polling at all. The savings should be even higher for higher-frequency ticks.

Suggested-by: David Matlack <dmatlack@google.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 virt/kvm/kvm_main.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 59 insertions(+), 4 deletions(-)
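For a rough numeric picture of what the two knobs introduced above mean, the following is a userspace sketch (not kernel code) that reuses the arithmetic from grow_halt_poll_ns()/shrink_halt_poll_ns() in the patch: growth starts from a 10us base and multiplies by halt_poll_ns_grow, while a halt_poll_ns_shrink of 0 resets the window and a non-zero value divides it. In the patch itself the grow step only fires while the per-vCPU window is still below the halt_poll_ns module parameter (500us by default), so the ramp-up stops after the step that crosses that value.

```c
#include <stdio.h>

/* One grow step, per the patch: start from a 10us base, then multiply by "grow". */
static unsigned int grow_step(unsigned int cur, unsigned int grow)
{
	if (cur == 0 && grow)
		return 10000;
	return cur * grow;
}

/* One shrink step, per the patch: a "shrink" of 0 resets the window, otherwise divide. */
static unsigned int shrink_step(unsigned int cur, unsigned int shrink)
{
	return shrink ? cur / shrink : 0;
}

int main(void)
{
	unsigned int val = 0;

	/* Ramp-up with the default halt_poll_ns_grow = 2 (values in ns). */
	printf("grow=2:");
	for (int i = 0; i < 7; i++) {
		val = grow_step(val, 2);
		printf(" %u", val);
	}
	printf("\n");

	/* Back-off: the default halt_poll_ns_shrink = 0 resets, shrink = 2 halves. */
	printf("shrink=0: %u -> %u\n", val, shrink_step(val, 0));
	printf("shrink=2: %u -> %u\n", val, shrink_step(val, 2));
	return 0;
}
```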