Message ID: 20210817230508.142907-1-jingzhangos@google.com (mailing list archive)
State: New, archived
Series: KVM: stats: add stats to detect if vcpu is currently halted
On Tue, Aug 17, 2021, Jing Zhang wrote:
> Current guest/host/halt stats don't show when we are currently halting

s/we are/KVM is

And I would probably reword it to "when KVM is blocking a vCPU in response to
the vCPU activity state, e.g. halt". More on that below.

> well. If a guest halts for a long period of time they could appear
> pathologically blocked but really it's the opposite there's nothing to
> do.
> Simply count the number of times we enter and leave the kvm_vcpu_block

s/we/KVM

In general, it's good practice to avoid pronouns in comments and changelogs as
doing so all but forces using precise, unambiguous language. Things like 'it'
and 'they' are ok when it's abundantly clear what they refer to, but 'we' and
'us' are best avoided entirely.

> function per vcpu, if they are unequal, then a VCPU is currently
> halting.
> The existing stats like halt_exits and halt_wakeups don't quite capture
> this. The time spend halted and halt polling is reported eventually, but
> not until we wakeup and resume. If a guest were to indefinitely halt one
> of it's CPUs we would never know, it may simply appear blocked.
     ^^^^      ^^
     its       userspace?

The "blocked" terminology is a bit confusing since KVM is explicitly blocking
the vCPU, it just happens to mostly do so in response to a guest HLT. I think
"block" is intended to mean "vCPU task not run", but it would be helpful to
make that clear.

> Original-by: Cannon Matthews <cannonmatthews@google.com>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  include/linux/kvm_host.h  | 4 +++-
>  include/linux/kvm_types.h | 2 ++
>  virt/kvm/kvm_main.c       | 2 ++
>  3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index d447b21cdd73..23d2e19af3ce 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> 			HALT_POLL_HIST_COUNT),					\
> 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> -			HALT_POLL_HIST_COUNT)
> +			HALT_POLL_HIST_COUNT),					\
> +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)

Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
get that all the other stats use "halt", but that's technically wrong as KVM
will block vCPUs that are not runnable for other reasons, e.g. because they're
in WFS on x86.
Hi Sean,

On Tue, Aug 17, 2021 at 4:46 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Aug 17, 2021, Jing Zhang wrote:
> > Current guest/host/halt stats don't show when we are currently halting
>
> s/we are/KVM is
>
> And I would probably reword it to "when KVM is blocking a vCPU in response to
> the vCPU activity state, e.g. halt". More on that below.
>
> > well. If a guest halts for a long period of time they could appear
> > pathologically blocked but really it's the opposite there's nothing to
> > do.
> > Simply count the number of times we enter and leave the kvm_vcpu_block
>
> s/we/KVM
>
> In general, it's good practice to avoid pronouns in comments and changelogs as
> doing so all but forces using precise, unambiguous language. Things like 'it'
> and 'they' are ok when it's abundantly clear what they refer to, but 'we' and
> 'us' are best avoided entirely.
>
> > function per vcpu, if they are unequal, then a VCPU is currently
> > halting.
> > The existing stats like halt_exits and halt_wakeups don't quite capture
> > this. The time spend halted and halt polling is reported eventually, but
> > not until we wakeup and resume. If a guest were to indefinitely halt one
> > of it's CPUs we would never know, it may simply appear blocked.
>      ^^^^      ^^
>      its       userspace?
>
> The "blocked" terminology is a bit confusing since KVM is explicitly blocking
> the vCPU, it just happens to mostly do so in response to a guest HLT. I think
> "block" is intended to mean "vCPU task not run", but it would be helpful to
> make that clear.
>
That's a good point. Will reword the comments as you suggested.

> > Original-by: Cannon Matthews <cannonmatthews@google.com>
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  include/linux/kvm_host.h  | 4 +++-
> >  include/linux/kvm_types.h | 2 ++
> >  virt/kvm/kvm_main.c       | 2 ++
> >  3 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index d447b21cdd73..23d2e19af3ce 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> > 			HALT_POLL_HIST_COUNT),					\
> > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> > -			HALT_POLL_HIST_COUNT)
> > +			HALT_POLL_HIST_COUNT),					\
> > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)
>
> Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
> get that all the other stats use "halt", but that's technically wrong as KVM
> will block vCPUs that are not runnable for other reasons, e.g. because they're
> in WFS on x86.

The two counters are used to determine the reason why vCPU is not
running. If the halt_block_ends is one less than halt_block_starts,
then we know the vCPU is explicitly blocked by KVM. Otherwise, we know
there might be something wrong with the vCPU. Does this make sense?
Will rename from "halt_block_*" to "vcpu_block_*".

Thanks,
Jing
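As a concrete reading of the scheme Jing describes, here is a hypothetical
monitoring-side sketch in C. read_vcpu_stat() is an assumed helper (e.g.
built on top of KVM's binary stats interface), not an existing API, and the
stat names simply mirror the patch:

#include <stdint.h>
#include <stdio.h>

/* Assumed helper: fetch the current value of a named per-vCPU stat. */
extern uint64_t read_vcpu_stat(int vcpu_fd, const char *name);

/*
 * Read ends before starts so that a concurrent block entry can only make
 * starts run ahead.  The two reads are still not atomic, so any other
 * delta should be treated as a transient snapshot, not as a KVM bug.
 */
static void report_block_state(int vcpu_fd)
{
	uint64_t ends = read_vcpu_stat(vcpu_fd, "halt_block_ends");
	uint64_t starts = read_vcpu_stat(vcpu_fd, "halt_block_starts");

	if (starts == ends + 1)
		printf("vCPU is currently inside kvm_vcpu_block()\n");
	else if (starts == ends)
		printf("vCPU is not blocked in kvm_vcpu_block()\n");
	else
		printf("transient snapshot: starts=%llu ends=%llu\n",
		       (unsigned long long)starts, (unsigned long long)ends);
}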
On Wed, Aug 18, 2021, Jing Zhang wrote:
> > > Original-by: Cannon Matthews <cannonmatthews@google.com>
> > > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > > ---
> > >  include/linux/kvm_host.h  | 4 +++-
> > >  include/linux/kvm_types.h | 2 ++
> > >  virt/kvm/kvm_main.c       | 2 ++
> > >  3 files changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index d447b21cdd73..23d2e19af3ce 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> > > 			HALT_POLL_HIST_COUNT),					\
> > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> > > -			HALT_POLL_HIST_COUNT)
> > > +			HALT_POLL_HIST_COUNT),					\
> > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)
> >
> > Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
> > get that all the other stats use "halt", but that's technically wrong as KVM
> > will block vCPUs that are not runnable for other reasons, e.g. because they're
> > in WFS on x86.
> The two counters are used to determine the reason why vCPU is not
> running. If the halt_block_ends is one less than halt_block_starts,
> then we know the vCPU is explicitly blocked by KVM. Otherwise, we know
> there might be something wrong with the vCPU. Does this make sense?
> Will rename from "halt_block_*" to "vcpu_block_*".

Yeah, but why not do the below?  "halt_block_starts - halt_block_ends" can only
ever be '0' or '1'; if it's anything else, KVM done messed up, e.g. failed to
bump halt_block_ends when unblocking.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3e67c93ca403..aa98dec727d0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3208,6 +3208,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)

 	kvm_arch_vcpu_blocking(vcpu);

+	vcpu->stat.generic.blocked = 1;
+
 	start = cur = poll_end = ktime_get();
 	if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) {
 		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
@@ -3258,6 +3260,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 				ktime_to_ns(cur) - ktime_to_ns(poll_end));
 	}
 out:
+	vcpu->stat.generic.blocked = 0;
 	kvm_arch_vcpu_unblocking(vcpu);
 	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
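Sean's "can only ever be '0' or '1'" observation translates directly into a
sanity check. A hypothetical kernel-side sketch against the two-counter
variant of the patch (check_block_counters() is an invented name, not
existing KVM code):

/*
 * The counters may only ever differ by 0 (not blocking) or 1 (currently
 * inside kvm_vcpu_block()); any other delta means an enter or exit
 * update was missed.
 */
static void check_block_counters(struct kvm_vcpu *vcpu)
{
	u64 delta = vcpu->stat.generic.halt_block_starts -
		    vcpu->stat.generic.halt_block_ends;

	WARN_ON_ONCE(delta > 1);
}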
+1 to the rephrasing of the commit message.

On Wed, Aug 18, 2021 at 11:09 AM Jing Zhang <jingzhangos@google.com> wrote:
>
> Hi Sean,
>
> On Tue, Aug 17, 2021 at 4:46 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Aug 17, 2021, Jing Zhang wrote:
> > > Current guest/host/halt stats don't show when we are currently halting
> >
> > s/we are/KVM is
> >
> > And I would probably reword it to "when KVM is blocking a vCPU in response to
> > the vCPU activity state, e.g. halt". More on that below.
> >
> > > well. If a guest halts for a long period of time they could appear
> > > pathologically blocked but really it's the opposite there's nothing to
> > > do.
> > > Simply count the number of times we enter and leave the kvm_vcpu_block
> >
> > s/we/KVM
> >
> > In general, it's good practice to avoid pronouns in comments and changelogs as
> > doing so all but forces using precise, unambiguous language. Things like 'it'
> > and 'they' are ok when it's abundantly clear what they refer to, but 'we' and
> > 'us' are best avoided entirely.
> >
> > > function per vcpu, if they are unequal, then a VCPU is currently
> > > halting.
> > > The existing stats like halt_exits and halt_wakeups don't quite capture
> > > this. The time spend halted and halt polling is reported eventually, but
> > > not until we wakeup and resume. If a guest were to indefinitely halt one
> > > of it's CPUs we would never know, it may simply appear blocked.
> >      ^^^^      ^^
> >      its       userspace?
> >
> > The "blocked" terminology is a bit confusing since KVM is explicitly blocking
> > the vCPU, it just happens to mostly do so in response to a guest HLT. I think
> > "block" is intended to mean "vCPU task not run", but it would be helpful to
> > make that clear.
> >
> That's a good point. Will reword the comments as you suggested.
> > > Original-by: Cannon Matthews <cannonmatthews@google.com>
> > > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > > ---
> > >  include/linux/kvm_host.h  | 4 +++-
> > >  include/linux/kvm_types.h | 2 ++
> > >  virt/kvm/kvm_main.c       | 2 ++
> > >  3 files changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index d447b21cdd73..23d2e19af3ce 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> > > 			HALT_POLL_HIST_COUNT),					\
> > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> > > -			HALT_POLL_HIST_COUNT)
> > > +			HALT_POLL_HIST_COUNT),					\
> > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)
> >
> > Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
> > get that all the other stats use "halt", but that's technically wrong as KVM
> > will block vCPUs that are not runnable for other reasons, e.g. because they're
> > in WFS on x86.

The point is to separate "blocked because not runnable" and "guest
explicitly asked to do nothing".

IIRC I originally wrote this patch to help discriminate how we spent
VCPU thread time, in particular into two categories of essentially
"doing useful work on behalf of the guest" and "blocked from doing
useful work."

Since a guest has explicitly asked for a vcpu to HLT, this is "useful
work on behalf of the guest" even though the thread is "blocked" from
running.

This allows answering questions like, are we spending too much time
waiting on mutexes, or long running kernel routines rather than running
the vcpu in guest mode, or did the guest explicitly tell us to not do
anything.

So I would suggest keeping the "halt" part of the counters' name, and
removing the "blocked" part rather than the other way around. We
explicitly do not want to include non-halt blockages in this.

That being said I suppose a boolean could work as well. I think we did
this because it worked with and mirrored existing stats rather than
anything particularly nuanced. Though there might be some eventual
consistency sort of concerns with how these stats are updated and
exported that could make monotonic increasing counters more useful.

> The two counters are used to determine the reason why vCPU is not
> running. If the halt_block_ends is one less than halt_block_starts,
> then we know the vCPU is explicitly blocked by KVM. Otherwise, we know
> there might be something wrong with the vCPU. Does this make sense?
> Will rename from "halt_block_*" to "vcpu_block_*".
>
> Thanks,
> Jing
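Cannon's closing point about monotonic counters can be made concrete: a
sampler that only reads a boolean "blocked" flag misses any block/unblock
cycles that complete between samples, whereas counter deltas preserve them.
A hypothetical sketch (the struct and function names are invented for
illustration, not part of the patch):

#include <stdbool.h>
#include <stdint.h>

/* One periodic snapshot of the two per-vCPU counters. */
struct halt_sample {
	uint64_t starts;
	uint64_t ends;
};

/* A boolean flag sampled at the same instant could only answer this. */
static bool blocked_now(const struct halt_sample *s)
{
	return s->starts == s->ends + 1;
}

/*
 * Monotonic counters additionally reveal how many times the vCPU entered
 * kvm_vcpu_block() between two samples, even if every one of those blocks
 * began and ended while the sampler was asleep.
 */
static uint64_t blocks_between(const struct halt_sample *prev,
			       const struct halt_sample *cur)
{
	return cur->starts - prev->starts;
}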
Hi Sean,

On Wed, Aug 18, 2021 at 10:01 AM Cannon Matthews <cannonmatthews@google.com> wrote:
>
> +1 to the rephrasing of the commit message.
>
> On Wed, Aug 18, 2021 at 11:09 AM Jing Zhang <jingzhangos@google.com> wrote:
> >
> > Hi Sean,
> >
> > On Tue, Aug 17, 2021 at 4:46 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Tue, Aug 17, 2021, Jing Zhang wrote:
> > > > Current guest/host/halt stats don't show when we are currently halting
> > >
> > > s/we are/KVM is
> > >
> > > And I would probably reword it to "when KVM is blocking a vCPU in response to
> > > the vCPU activity state, e.g. halt". More on that below.
> > >
> > > > well. If a guest halts for a long period of time they could appear
> > > > pathologically blocked but really it's the opposite there's nothing to
> > > > do.
> > > > Simply count the number of times we enter and leave the kvm_vcpu_block
> > >
> > > s/we/KVM
> > >
> > > In general, it's good practice to avoid pronouns in comments and changelogs as
> > > doing so all but forces using precise, unambiguous language. Things like 'it'
> > > and 'they' are ok when it's abundantly clear what they refer to, but 'we' and
> > > 'us' are best avoided entirely.
> > >
> > > > function per vcpu, if they are unequal, then a VCPU is currently
> > > > halting.
> > > > The existing stats like halt_exits and halt_wakeups don't quite capture
> > > > this. The time spend halted and halt polling is reported eventually, but
> > > > not until we wakeup and resume. If a guest were to indefinitely halt one
> > > > of it's CPUs we would never know, it may simply appear blocked.
> > >      ^^^^      ^^
> > >      its       userspace?
> > >
> > > The "blocked" terminology is a bit confusing since KVM is explicitly blocking
> > > the vCPU, it just happens to mostly do so in response to a guest HLT. I think
> > > "block" is intended to mean "vCPU task not run", but it would be helpful to
> > > make that clear.
> > >
> > That's a good point. Will reword the comments as you suggested.
> > > > Original-by: Cannon Matthews <cannonmatthews@google.com>
> > > > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > > > ---
> > > >  include/linux/kvm_host.h  | 4 +++-
> > > >  include/linux/kvm_types.h | 2 ++
> > > >  virt/kvm/kvm_main.c       | 2 ++
> > > >  3 files changed, 7 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > index d447b21cdd73..23d2e19af3ce 100644
> > > > --- a/include/linux/kvm_host.h
> > > > +++ b/include/linux/kvm_host.h
> > > > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> > > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> > > > 			HALT_POLL_HIST_COUNT),					\
> > > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> > > > -			HALT_POLL_HIST_COUNT)
> > > > +			HALT_POLL_HIST_COUNT),					\
> > > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> > > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)
> > >
> > > Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
> > > get that all the other stats use "halt", but that's technically wrong as KVM
> > > will block vCPUs that are not runnable for other reasons, e.g. because they're
> > > in WFS on x86.
>
> The point is to separate "blocked because not runnable" and "guest
> explicitly asked to do nothing".
>
> IIRC I originally wrote this patch to help discriminate how we spent
> VCPU thread time, in particular into two categories of essentially
> "doing useful work on behalf of the guest" and "blocked from doing
> useful work."
>
> Since a guest has explicitly asked for a vcpu to HLT, this is "useful
> work on behalf of the guest" even though the thread is "blocked" from
> running.
>
> This allows answering questions like, are we spending too much time
> waiting on mutexes, or long running kernel routines rather than running
> the vcpu in guest mode, or did the guest explicitly tell us to not do
> anything.
>
> So I would suggest keeping the "halt" part of the counters' name, and
> removing the "blocked" part rather than the other way around. We
> explicitly do not want to include non-halt blockages in this.
>
> That being said I suppose a boolean could work as well. I think we did
> this because it worked with and mirrored existing stats rather than
> anything particularly nuanced. Though there might be some eventual
> consistency sort of concerns with how these stats are updated and
> exported that could make monotonic increasing counters more useful.

Any more comments on the naming and the increasing counters?

> > The two counters are used to determine the reason why vCPU is not
> > running. If the halt_block_ends is one less than halt_block_starts,
> > then we know the vCPU is explicitly blocked by KVM. Otherwise, we know
> > there might be something wrong with the vCPU. Does this make sense?
> > Will rename from "halt_block_*" to "vcpu_block_*".
> >
> > Thanks,
> > Jing
On Wed, Aug 18, 2021, Cannon Matthews wrote:
> Since a guest has explicitly asked for a vcpu to HLT, this is "useful work on
> behalf of the guest" even though the thread is "blocked" from running.
>
> This allows answering questions like, are we spending too much time waiting
> on mutexes, or long running kernel routines rather than running the vcpu in
> guest mode, or did the guest explicitly tell us to not do anything.
>
> So I would suggest keeping the "halt" part of the counters' name, and
> removing the "blocked" part rather than the other way around. We explicitly
> do not want to include non-halt blockages in this.

But this patch does include non-halt blockages, which is why I brought up the
technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE
vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call
this for halt-like behavior, but PPC looks like it has at least one path that's
not halt-like.

I doubt anyone actually cares if the stat is a misnomer in some cases, but at
the same time I think there's opportunity for clean up here. E.g. halt polling
if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls
to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is
successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete
waste of cycles if the vCPU doesn't actually block. And
kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into
s390, which can add its own wrapper if necessary.

So with a bit of massaging and a slight change in tracing behavior, I believe
we can isolate the actual wait/halt and avoid "halted" being technically-wrong,
and fix some inefficiencies at the same time.

Jing, can you do a v2 of this patch and send it to me off-list? With luck, my
idea will work and I can fold your patch in, and if not we can always post v2
standalone in a few weeks.

E.g. I'm thinking something like...

void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
{
	vcpu->stat.generic.halted = 1;

	if (<halt polling failed>)
		kvm_vcpu_block(vcpu);

	vcpu->stat.generic.halted = 0;

	<update halt polling stuff>
}

void kvm_vcpu_block(struct kvm_vcpu *vcpu)
{
	bool waited = false;
	ktime_t start, cur;
	u64 block_ns;

	start = ktime_get();

	prepare_to_rcuwait(&vcpu->wait);
	for (;;) {
		set_current_state(TASK_INTERRUPTIBLE);

		if (kvm_vcpu_check_block(vcpu) < 0)
			break;

		waited = true;
		schedule();
	}
	finish_rcuwait(&vcpu->wait);

	cur = ktime_get();

	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
}
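The split Sean sketches implies that callers pick an entry point based on why
the vCPU is not runnable. A hypothetical x86-flavored illustration of that
intent (vcpu_wait_for_runnable() is an invented name; the real dispatch would
live in arch code):

/*
 * Assuming the split above: halt-like states go through kvm_vcpu_halt()
 * and get halt polling plus the "halted" stat, while other non-runnable
 * states (e.g. WFS or UNINITIALIZED on x86) block directly and skip both.
 */
static void vcpu_wait_for_runnable(struct kvm_vcpu *vcpu)
{
	if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
		kvm_vcpu_halt(vcpu);
	else
		kvm_vcpu_block(vcpu);
}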
Hi Sean,

On Thu, Aug 19, 2021 at 3:37 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Aug 18, 2021, Cannon Matthews wrote:
> > Since a guest has explicitly asked for a vcpu to HLT, this is "useful work on
> > behalf of the guest" even though the thread is "blocked" from running.
> >
> > This allows answering questions like, are we spending too much time waiting
> > on mutexes, or long running kernel routines rather than running the vcpu in
> > guest mode, or did the guest explicitly tell us to not do anything.
> >
> > So I would suggest keeping the "halt" part of the counters' name, and
> > removing the "blocked" part rather than the other way around. We explicitly
> > do not want to include non-halt blockages in this.
>
> But this patch does include non-halt blockages, which is why I brought up the
> technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE
> vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call
> this for halt-like behavior, but PPC looks like it has at least one path that's
> not halt-like.
>
> I doubt anyone actually cares if the stat is a misnomer in some cases, but at
> the same time I think there's opportunity for clean up here. E.g. halt polling
> if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls
> to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is
> successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete
> waste of cycles if the vCPU doesn't actually block. And
> kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into
> s390, which can add its own wrapper if necessary.
>
> So with a bit of massaging and a slight change in tracing behavior, I believe
> we can isolate the actual wait/halt and avoid "halted" being technically-wrong,
> and fix some inefficiencies at the same time.
>
> Jing, can you do a v2 of this patch and send it to me off-list? With luck, my
> idea will work and I can fold your patch in, and if not we can always post v2
> standalone in a few weeks.

Of course, will do.

Thanks,
Jing

>
> E.g. I'm thinking something like...
>
> void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> {
> 	vcpu->stat.generic.halted = 1;
>
> 	if (<halt polling failed>)
> 		kvm_vcpu_block(vcpu);
>
> 	vcpu->stat.generic.halted = 0;
>
> 	<update halt polling stuff>
> }
>
> void kvm_vcpu_block(struct kvm_vcpu *vcpu)
> {
> 	bool waited = false;
> 	ktime_t start, cur;
> 	u64 block_ns;
>
> 	start = ktime_get();
>
> 	prepare_to_rcuwait(&vcpu->wait);
> 	for (;;) {
> 		set_current_state(TASK_INTERRUPTIBLE);
>
> 		if (kvm_vcpu_check_block(vcpu) < 0)
> 			break;
>
> 		waited = true;
> 		schedule();
> 	}
> 	finish_rcuwait(&vcpu->wait);
>
> 	cur = ktime_get();
>
> 	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
> 	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
> }
On Fri, Aug 20, 2021, Jing Zhang wrote:
> Hi Sean,
>
> On Thu, Aug 19, 2021 at 3:37 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Aug 18, 2021, Cannon Matthews wrote:
> > > Since a guest has explicitly asked for a vcpu to HLT, this is "useful work on
> > > behalf of the guest" even though the thread is "blocked" from running.
> > >
> > > This allows answering questions like, are we spending too much time waiting
> > > on mutexes, or long running kernel routines rather than running the vcpu in
> > > guest mode, or did the guest explicitly tell us to not do anything.
> > >
> > > So I would suggest keeping the "halt" part of the counters' name, and
> > > removing the "blocked" part rather than the other way around. We explicitly
> > > do not want to include non-halt blockages in this.
> >
> > But this patch does include non-halt blockages, which is why I brought up the
> > technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE
> > vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call
> > this for halt-like behavior, but PPC looks like it has at least one path that's
> > not halt-like.
> >
> > I doubt anyone actually cares if the stat is a misnomer in some cases, but at
> > the same time I think there's opportunity for clean up here. E.g. halt polling
> > if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls
> > to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is
> > successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete
> > waste of cycles if the vCPU doesn't actually block. And
> > kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into
> > s390, which can add its own wrapper if necessary.
> >
> > So with a bit of massaging and a slight change in tracing behavior, I believe
> > we can isolate the actual wait/halt and avoid "halted" being technically-wrong,
> > and fix some inefficiencies at the same time.
> >
> > Jing, can you do a v2 of this patch and send it to me off-list? With luck, my
> > idea will work and I can fold your patch in, and if not we can always post v2
> > standalone in a few weeks.

Circling back to this with fresh eyes, limiting the state to "halted" would be
wrong. I still stand by my assertion that non-halt states such as WFS should
not go through halt polling, but the intent of the proposed stat is to
differentiate between not running a vCPU because of a guest action and not
running a vCPU because the host is not scheduling its task.

E.g. on x86, if a vCPU is put into WFS for an extended time, the vCPU will not
be run because of a guest action, not because of any host activity. But again,
WFS has a very different meaning than "halt", which was the basis for my
original objection to the "halt_block" terminology.

One option would be to invert the stat, e.g. vcpu->stat.runnable, but that has
the downside of needing to be set somewhere outside of kvm_vcpu_block/halt(),
and it would likely be difficult to come up with a name that isn't confusing,
e.g. the vCPU would show up as "runnable" when mp_state!=RUNNABLE and it's not
actively blocking.

Back to bikeshedding the "halted" name, what about "blocking" or "waiting"?
I.e. switch from past tense to present tense to convey that the _vCPU_ is
"actively" blocking/waiting, as opposed to the vCPU being blocked by some host
condition.
On Wed, Sep 22, 2021, Sean Christopherson wrote:
> On Fri, Aug 20, 2021, Jing Zhang wrote:
> > Hi Sean,
> >
> > On Thu, Aug 19, 2021 at 3:37 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Wed, Aug 18, 2021, Cannon Matthews wrote:
> > > > Since a guest has explicitly asked for a vcpu to HLT, this is "useful work on
> > > > behalf of the guest" even though the thread is "blocked" from running.
> > > >
> > > > This allows answering questions like, are we spending too much time waiting
> > > > on mutexes, or long running kernel routines rather than running the vcpu in
> > > > guest mode, or did the guest explicitly tell us to not do anything.
> > > >
> > > > So I would suggest keeping the "halt" part of the counters' name, and
> > > > removing the "blocked" part rather than the other way around. We explicitly
> > > > do not want to include non-halt blockages in this.
> > >
> > > But this patch does include non-halt blockages, which is why I brought up the
> > > technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE
> > > vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call
> > > this for halt-like behavior, but PPC looks like it has at least one path that's
> > > not halt-like.
> > >
> > > I doubt anyone actually cares if the stat is a misnomer in some cases, but at
> > > the same time I think there's opportunity for clean up here. E.g. halt polling
> > > if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls
> > > to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is
> > > successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete
> > > waste of cycles if the vCPU doesn't actually block. And
> > > kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into
> > > s390, which can add its own wrapper if necessary.
> > >
> > > So with a bit of massaging and a slight change in tracing behavior, I believe
> > > we can isolate the actual wait/halt and avoid "halted" being technically-wrong,
> > > and fix some inefficiencies at the same time.
> > >
> > > Jing, can you do a v2 of this patch and send it to me off-list? With luck, my
> > > idea will work and I can fold your patch in, and if not we can always post v2
> > > standalone in a few weeks.
>
> Circling back to this with fresh eyes, limiting the state to "halted" would be
> wrong. I still stand by my assertion that non-halt states such as WFS should
> not go through halt polling, but the intent of the proposed stat is to
> differentiate between not running a vCPU because of a guest action and not
> running a vCPU because the host is not scheduling its task.
>
> E.g. on x86, if a vCPU is put into WFS for an extended time, the vCPU will not
> be run because of a guest action, not because of any host activity. But again,
> WFS has a very different meaning than "halt", which was the basis for my
> original objection to the "halt_block" terminology.
>
> One option would be to invert the stat, e.g. vcpu->stat.runnable, but that has
> the downside of needing to be set somewhere outside of kvm_vcpu_block/halt(),
> and it would likely be difficult to come up with a name that isn't confusing,
> e.g. the vCPU would show up as "runnable" when mp_state!=RUNNABLE and it's not
> actively blocking.
>
> Back to bikeshedding the "halted" name, what about "blocking" or "waiting"?
> I.e. switch from past tense to present tense to convey that the _vCPU_ is
> "actively" blocking/waiting, as opposed to the vCPU being blocked by some host
> condition.

Doh, forgot to say that "blocking" would be my first choice as it would match
KVM's internal "block" terminology.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d447b21cdd73..23d2e19af3ce 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
 			HALT_POLL_HIST_COUNT),					\
 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
-			HALT_POLL_HIST_COUNT)
+			HALT_POLL_HIST_COUNT),					\
+	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
+	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)

 extern struct dentry *kvm_debugfs_dir;

diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index de7fb5f364d8..ea7d26017fa6 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -93,6 +93,8 @@ struct kvm_vcpu_stat_generic {
 	u64 halt_poll_success_hist[HALT_POLL_HIST_COUNT];
 	u64 halt_poll_fail_hist[HALT_POLL_HIST_COUNT];
 	u64 halt_wait_hist[HALT_POLL_HIST_COUNT];
+	u64 halt_block_starts;
+	u64 halt_block_ends;
 };

 #define KVM_STATS_NAME_SIZE	48

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3e67c93ca403..5c3a21d2fbea 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3206,6 +3206,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	bool waited = false;
 	u64 block_ns;

+	++vcpu->stat.generic.halt_block_starts;
 	kvm_arch_vcpu_blocking(vcpu);

 	start = cur = poll_end = ktime_get();
@@ -3285,6 +3286,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
 	kvm_arch_vcpu_block_finish(vcpu);
+	++vcpu->stat.generic.halt_block_ends;
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_block);
Current guest/host/halt stats don't show when we are currently halting
well. If a guest halts for a long period of time they could appear
pathologically blocked but really it's the opposite there's nothing to
do.

Simply count the number of times we enter and leave the kvm_vcpu_block
function per vcpu, if they are unequal, then a VCPU is currently
halting.

The existing stats like halt_exits and halt_wakeups don't quite capture
this. The time spend halted and halt polling is reported eventually, but
not until we wakeup and resume. If a guest were to indefinitely halt one
of it's CPUs we would never know, it may simply appear blocked.

Original-by: Cannon Matthews <cannonmatthews@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 include/linux/kvm_host.h  | 4 +++-
 include/linux/kvm_types.h | 2 ++
 virt/kvm/kvm_main.c       | 2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)

base-commit: a3e0b8bd99ab098514bde2434301fa6fde040da2