Message ID: 20210817230508.142907-1-jingzhangos@google.com (mailing list archive)
State: New, archived
Series: KVM: stats: add stats to detect if vcpu is currently halted
On Tue, Aug 17, 2021, Jing Zhang wrote:
> Current guest/host/halt stats don't show when we are currently halting

s/we are/KVM is

And I would probably reword it to "when KVM is blocking a vCPU in response to
the vCPU activity state, e.g. halt". More on that below.

> well. If a guest halts for a long period of time they could appear
> pathologically blocked but really it's the opposite there's nothing to
> do.
> Simply count the number of times we enter and leave the kvm_vcpu_block

s/we/KVM

In general, it's good practice to avoid pronouns in comments and changelogs as
doing so all but forces using precise, unambiguous language. Things like 'it'
and 'they' are ok when it's abundantly clear what they refer to, but 'we' and
'us' are best avoided entirely.

> function per vcpu, if they are unequal, then a VCPU is currently
> halting.
> The existing stats like halt_exits and halt_wakeups don't quite capture
> this. The time spend halted and halt polling is reported eventually, but
> not until we wakeup and resume. If a guest were to indefinitely halt one
> of it's CPUs we would never know, it may simply appear blocked.
     ^^^^      ^^
     its       userspace?

The "blocked" terminology is a bit confusing since KVM is explicitly blocking
the vCPU, it just happens to mostly do so in response to a guest HLT. I think
"block" is intended to mean "vCPU task not run", but it would be helpful to
make that clear.

> Original-by: Cannon Matthews <cannonmatthews@google.com>
> Signed-off-by: Jing Zhang <jingzhangos@google.com>
> ---
>  include/linux/kvm_host.h  | 4 +++-
>  include/linux/kvm_types.h | 2 ++
>  virt/kvm/kvm_main.c       | 2 ++
>  3 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index d447b21cdd73..23d2e19af3ce 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> 			HALT_POLL_HIST_COUNT),					\
> 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> -			HALT_POLL_HIST_COUNT)
> +			HALT_POLL_HIST_COUNT),					\
> +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)

Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
get that all the other stats use "halt", but that's technically wrong as KVM
will block vCPUs that are not runnable for other reasons, e.g. because they're
in WFS on x86.
Hi Sean,

On Tue, Aug 17, 2021 at 4:46 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Tue, Aug 17, 2021, Jing Zhang wrote:
> > Current guest/host/halt stats don't show when we are currently halting
>
> s/we are/KVM is
>
> And I would probably reword it to "when KVM is blocking a vCPU in response to
> the vCPU activity state, e.g. halt". More on that below.
>
> > well. If a guest halts for a long period of time they could appear
> > pathologically blocked but really it's the opposite there's nothing to
> > do.
> > Simply count the number of times we enter and leave the kvm_vcpu_block
>
> s/we/KVM
>
> In general, it's good practice to avoid pronouns in comments and changelogs as
> doing so all but forces using precise, unambiguous language. Things like 'it'
> and 'they' are ok when it's abundantly clear what they refer to, but 'we' and
> 'us' are best avoided entirely.
>
> > function per vcpu, if they are unequal, then a VCPU is currently
> > halting.
> > The existing stats like halt_exits and halt_wakeups don't quite capture
> > this. The time spend halted and halt polling is reported eventually, but
> > not until we wakeup and resume. If a guest were to indefinitely halt one
> > of it's CPUs we would never know, it may simply appear blocked.
>      ^^^^      ^^
>      its       userspace?
>
> The "blocked" terminology is a bit confusing since KVM is explicitly blocking
> the vCPU, it just happens to mostly do so in response to a guest HLT. I think
> "block" is intended to mean "vCPU task not run", but it would be helpful to
> make that clear.
>
That's a good point. Will reword the comments as you suggested.

> > Original-by: Cannon Matthews <cannonmatthews@google.com>
> > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > ---
> >  include/linux/kvm_host.h  | 4 +++-
> >  include/linux/kvm_types.h | 2 ++
> >  virt/kvm/kvm_main.c       | 2 ++
> >  3 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index d447b21cdd73..23d2e19af3ce 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> > 			HALT_POLL_HIST_COUNT),					\
> > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> > -			HALT_POLL_HIST_COUNT)
> > +			HALT_POLL_HIST_COUNT),					\
> > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)
>
> Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
> get that all the other stats use "halt", but that's technically wrong as KVM
> will block vCPUs that are not runnable for other reasons, e.g. because they're
> in WFS on x86.

The two counters are used to determine the reason why vCPU is not
running. If the halt_block_ends is one less than halt_block_starts,
then we know the vCPU is explicitly blocked by KVM. Otherwise, we know
there might be something wrong with the vCPU. Does this make sense?
Will rename from "halt_block_*" to "vcpu_block_*".

Thanks,
Jing
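As a concrete reading of the scheme Jing describes, here is a hypothetical
monitoring-side sketch in C. read_vcpu_stat() is an assumed helper (e.g.
built on top of KVM's binary stats interface), not an existing API, and the
stat names simply mirror the patch:

#include <stdint.h>
#include <stdio.h>

/* Assumed helper: fetch the current value of a named per-vCPU stat. */
extern uint64_t read_vcpu_stat(int vcpu_fd, const char *name);

/*
 * Read ends before starts so that a concurrent block entry can only make
 * starts run ahead.  The two reads are still not atomic, so any other
 * delta should be treated as a transient snapshot, not as a KVM bug.
 */
static void report_block_state(int vcpu_fd)
{
	uint64_t ends = read_vcpu_stat(vcpu_fd, "halt_block_ends");
	uint64_t starts = read_vcpu_stat(vcpu_fd, "halt_block_starts");

	if (starts == ends + 1)
		printf("vCPU is currently inside kvm_vcpu_block()\n");
	else if (starts == ends)
		printf("vCPU is not blocked in kvm_vcpu_block()\n");
	else
		printf("transient snapshot: starts=%llu ends=%llu\n",
		       (unsigned long long)starts, (unsigned long long)ends);
}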
On Wed, Aug 18, 2021, Jing Zhang wrote:
> > > Original-by: Cannon Matthews <cannonmatthews@google.com>
> > > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > > ---
> > >  include/linux/kvm_host.h  | 4 +++-
> > >  include/linux/kvm_types.h | 2 ++
> > >  virt/kvm/kvm_main.c       | 2 ++
> > >  3 files changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index d447b21cdd73..23d2e19af3ce 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> > > 			HALT_POLL_HIST_COUNT),					\
> > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> > > -			HALT_POLL_HIST_COUNT)
> > > +			HALT_POLL_HIST_COUNT),					\
> > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)
> >
> > Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
> > get that all the other stats use "halt", but that's technically wrong as KVM
> > will block vCPUs that are not runnable for other reasons, e.g. because they're
> > in WFS on x86.
> The two counters are used to determine the reason why vCPU is not
> running. If the halt_block_ends is one less than halt_block_starts,
> then we know the vCPU is explicitly blocked by KVM. Otherwise, we know
> there might be something wrong with the vCPU. Does this make sense?
> Will rename from "halt_block_*" to "vcpu_block_*".

Yeah, but why not do the below?  "halt_block_starts - halt_block_ends" can only
ever be '0' or '1'; if it's anything else, KVM done messed up, e.g. failed to
bump halt_block_ends when unblocking.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3e67c93ca403..aa98dec727d0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3208,6 +3208,8 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)

 	kvm_arch_vcpu_blocking(vcpu);

+	vcpu->stat.generic.blocked = 1;
+
 	start = cur = poll_end = ktime_get();
 	if (vcpu->halt_poll_ns && !kvm_arch_no_poll(vcpu)) {
 		ktime_t stop = ktime_add_ns(ktime_get(), vcpu->halt_poll_ns);
@@ -3258,6 +3260,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 				ktime_to_ns(cur) - ktime_to_ns(poll_end));
 	}
 out:
+	vcpu->stat.generic.blocked = 0;
 	kvm_arch_vcpu_unblocking(vcpu);
 	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
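Sean's "can only ever be '0' or '1'" observation translates directly into a
sanity check. A hypothetical kernel-side sketch against the two-counter
variant of the patch (check_block_counters() is an invented name, not
existing KVM code):

/*
 * The counters may only ever differ by 0 (not blocking) or 1 (currently
 * inside kvm_vcpu_block()); any other delta means an enter or exit
 * update was missed.
 */
static void check_block_counters(struct kvm_vcpu *vcpu)
{
	u64 delta = vcpu->stat.generic.halt_block_starts -
		    vcpu->stat.generic.halt_block_ends;

	WARN_ON_ONCE(delta > 1);
}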
+1 to the rephrasing of the commit message.

On Wed, Aug 18, 2021 at 11:09 AM Jing Zhang <jingzhangos@google.com> wrote:
>
> Hi Sean,
>
> On Tue, Aug 17, 2021 at 4:46 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Tue, Aug 17, 2021, Jing Zhang wrote:
> > > Current guest/host/halt stats don't show when we are currently halting
> >
> > s/we are/KVM is
> >
> > And I would probably reword it to "when KVM is blocking a vCPU in response to
> > the vCPU activity state, e.g. halt". More on that below.
> >
> > > well. If a guest halts for a long period of time they could appear
> > > pathologically blocked but really it's the opposite there's nothing to
> > > do.
> > > Simply count the number of times we enter and leave the kvm_vcpu_block
> >
> > s/we/KVM
> >
> > In general, it's good practice to avoid pronouns in comments and changelogs as
> > doing so all but forces using precise, unambiguous language. Things like 'it'
> > and 'they' are ok when it's abundantly clear what they refer to, but 'we' and
> > 'us' are best avoided entirely.
> >
> > > function per vcpu, if they are unequal, then a VCPU is currently
> > > halting.
> > > The existing stats like halt_exits and halt_wakeups don't quite capture
> > > this. The time spend halted and halt polling is reported eventually, but
> > > not until we wakeup and resume. If a guest were to indefinitely halt one
> > > of it's CPUs we would never know, it may simply appear blocked.
> >      ^^^^      ^^
> >      its       userspace?
> >
> > The "blocked" terminology is a bit confusing since KVM is explicitly blocking
> > the vCPU, it just happens to mostly do so in response to a guest HLT. I think
> > "block" is intended to mean "vCPU task not run", but it would be helpful to
> > make that clear.
> >
> That's a good point. Will reword the comments as you suggested.
> > > Original-by: Cannon Matthews <cannonmatthews@google.com>
> > > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > > ---
> > >  include/linux/kvm_host.h  | 4 +++-
> > >  include/linux/kvm_types.h | 2 ++
> > >  virt/kvm/kvm_main.c       | 2 ++
> > >  3 files changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > index d447b21cdd73..23d2e19af3ce 100644
> > > --- a/include/linux/kvm_host.h
> > > +++ b/include/linux/kvm_host.h
> > > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> > > 			HALT_POLL_HIST_COUNT),					\
> > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> > > -			HALT_POLL_HIST_COUNT)
> > > +			HALT_POLL_HIST_COUNT),					\
> > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)
> >
> > Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
> > get that all the other stats use "halt", but that's technically wrong as KVM
> > will block vCPUs that are not runnable for other reasons, e.g. because they're
> > in WFS on x86.

The point is to separate "blocked because not runnable" and "guest
explicitly asked to do nothing".

IIRC I originally wrote this patch to help discriminate how we spent
VCPU thread time, in particular into two categories of essentially
"doing useful work on behalf of the guest" and "blocked from doing
useful work."

Since a guest has explicitly asked for a vcpu to HLT, this is "useful
work on behalf of the guest" even though the thread is "blocked" from
running.

This allows answering questions like, are we spending too much time
waiting on mutexes, or long running kernel routines rather than running
the vcpu in guest mode, or did the guest explicitly tell us to not do
anything.

So I would suggest keeping the "halt" part of the counters' name, and
removing the "blocked" part rather than the other way around. We
explicitly do not want to include non-halt blockages in this.

That being said I suppose a boolean could work as well. I think we did
this because it worked with and mirrored existing stats rather than
anything particularly nuanced. Though there might be some eventual
consistency sort of concerns with how these stats are updated and
exported that could make monotonic increasing counters more useful.

> The two counters are used to determine the reason why vCPU is not
> running. If the halt_block_ends is one less than halt_block_starts,
> then we know the vCPU is explicitly blocked by KVM. Otherwise, we know
> there might be something wrong with the vCPU. Does this make sense?
> Will rename from "halt_block_*" to "vcpu_block_*".
>
> Thanks,
> Jing
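Cannon's closing point about monotonic counters can be made concrete: a
sampler that only reads a boolean "blocked" flag misses any block/unblock
cycles that complete between samples, whereas counter deltas preserve them.
A hypothetical sketch (the struct and function names are invented for
illustration, not part of the patch):

#include <stdbool.h>
#include <stdint.h>

/* One periodic snapshot of the two per-vCPU counters. */
struct halt_sample {
	uint64_t starts;
	uint64_t ends;
};

/* A boolean flag sampled at the same instant could only answer this. */
static bool blocked_now(const struct halt_sample *s)
{
	return s->starts == s->ends + 1;
}

/*
 * Monotonic counters additionally reveal how many times the vCPU entered
 * kvm_vcpu_block() between two samples, even if every one of those blocks
 * began and ended while the sampler was asleep.
 */
static uint64_t blocks_between(const struct halt_sample *prev,
			       const struct halt_sample *cur)
{
	return cur->starts - prev->starts;
}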
Hi Sean,

On Wed, Aug 18, 2021 at 10:01 AM Cannon Matthews <cannonmatthews@google.com> wrote:
>
> +1 to the rephrasing of the commit message.
>
> On Wed, Aug 18, 2021 at 11:09 AM Jing Zhang <jingzhangos@google.com> wrote:
> >
> > Hi Sean,
> >
> > On Tue, Aug 17, 2021 at 4:46 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Tue, Aug 17, 2021, Jing Zhang wrote:
> > > > Current guest/host/halt stats don't show when we are currently halting
> > >
> > > s/we are/KVM is
> > >
> > > And I would probably reword it to "when KVM is blocking a vCPU in response to
> > > the vCPU activity state, e.g. halt". More on that below.
> > >
> > > > well. If a guest halts for a long period of time they could appear
> > > > pathologically blocked but really it's the opposite there's nothing to
> > > > do.
> > > > Simply count the number of times we enter and leave the kvm_vcpu_block
> > >
> > > s/we/KVM
> > >
> > > In general, it's good practice to avoid pronouns in comments and changelogs as
> > > doing so all but forces using precise, unambiguous language. Things like 'it'
> > > and 'they' are ok when it's abundantly clear what they refer to, but 'we' and
> > > 'us' are best avoided entirely.
> > >
> > > > function per vcpu, if they are unequal, then a VCPU is currently
> > > > halting.
> > > > The existing stats like halt_exits and halt_wakeups don't quite capture
> > > > this. The time spend halted and halt polling is reported eventually, but
> > > > not until we wakeup and resume. If a guest were to indefinitely halt one
> > > > of it's CPUs we would never know, it may simply appear blocked.
> > >      ^^^^      ^^
> > >      its       userspace?
> > >
> > > The "blocked" terminology is a bit confusing since KVM is explicitly blocking
> > > the vCPU, it just happens to mostly do so in response to a guest HLT. I think
> > > "block" is intended to mean "vCPU task not run", but it would be helpful to
> > > make that clear.
> > >
> > That's a good point. Will reword the comments as you suggested.
> > > > Original-by: Cannon Matthews <cannonmatthews@google.com>
> > > > Signed-off-by: Jing Zhang <jingzhangos@google.com>
> > > > ---
> > > >  include/linux/kvm_host.h  | 4 +++-
> > > >  include/linux/kvm_types.h | 2 ++
> > > >  virt/kvm/kvm_main.c       | 2 ++
> > > >  3 files changed, 7 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > > > index d447b21cdd73..23d2e19af3ce 100644
> > > > --- a/include/linux/kvm_host.h
> > > > +++ b/include/linux/kvm_host.h
> > > > @@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
> > > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
> > > > 			HALT_POLL_HIST_COUNT),					\
> > > > 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
> > > > -			HALT_POLL_HIST_COUNT)
> > > > +			HALT_POLL_HIST_COUNT),					\
> > > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
> > > > +	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)
> > >
> > > Why two counters?  It's per-vCPU, can't this just be a "blocked" flag or so?  I
> > > get that all the other stats use "halt", but that's technically wrong as KVM
> > > will block vCPUs that are not runnable for other reasons, e.g. because they're
> > > in WFS on x86.
>
> The point is to separate "blocked because not runnable" and "guest
> explicitly asked to do nothing".
>
> IIRC I originally wrote this patch to help discriminate how we spent
> VCPU thread time, in particular into two categories of essentially
> "doing useful work on behalf of the guest" and "blocked from doing
> useful work."
>
> Since a guest has explicitly asked for a vcpu to HLT, this is "useful
> work on behalf of the guest" even though the thread is "blocked" from
> running.
>
> This allows answering questions like, are we spending too much time
> waiting on mutexes, or long running kernel routines rather than running
> the vcpu in guest mode, or did the guest explicitly tell us to not do
> anything.
>
> So I would suggest keeping the "halt" part of the counters' name, and
> removing the "blocked" part rather than the other way around. We
> explicitly do not want to include non-halt blockages in this.
>
> That being said I suppose a boolean could work as well. I think we did
> this because it worked with and mirrored existing stats rather than
> anything particularly nuanced. Though there might be some eventual
> consistency sort of concerns with how these stats are updated and
> exported that could make monotonic increasing counters more useful.

Any more comments on the naming and the increasing counters?

> > The two counters are used to determine the reason why vCPU is not
> > running. If the halt_block_ends is one less than halt_block_starts,
> > then we know the vCPU is explicitly blocked by KVM. Otherwise, we know
> > there might be something wrong with the vCPU. Does this make sense?
> > Will rename from "halt_block_*" to "vcpu_block_*".
> >
> > Thanks,
> > Jing
On Wed, Aug 18, 2021, Cannon Matthews wrote:
> Since a guest has explicitly asked for a vcpu to HLT, this is "useful work on
> behalf of the guest" even though the thread is "blocked" from running.
>
> This allows answering questions like, are we spending too much time waiting
> on mutexes, or long running kernel routines rather than running the vcpu in
> guest mode, or did the guest explicitly tell us to not do anything.
>
> So I would suggest keeping the "halt" part of the counters' name, and
> removing the "blocked" part rather than the other way around. We explicitly
> do not want to include non-halt blockages in this.

But this patch does include non-halt blockages, which is why I brought up the
technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE
vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call
this for halt-like behavior, but PPC looks like it has at least one path that's
not halt-like.

I doubt anyone actually cares if the stat is a misnomer in some cases, but at
the same time I think there's opportunity for clean up here. E.g. halt polling
if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls
to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is
successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete
waste of cycles if the vCPU doesn't actually block. And
kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into
s390, which can add its own wrapper if necessary.

So with a bit of massaging and a slight change in tracing behavior, I believe
we can isolate the actual wait/halt and avoid "halted" being technically-wrong,
and fix some inefficiencies at the same time.

Jing, can you do a v2 of this patch and send it to me off-list? With luck, my
idea will work and I can fold your patch in, and if not we can always post v2
standalone in a few weeks.

E.g. I'm thinking something like...

void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
{
	vcpu->stat.generic.halted = 1;

	if (<halt polling failed>)
		kvm_vcpu_block(vcpu);

	vcpu->stat.generic.halted = 0;

	<update halt polling stuff>
}

void kvm_vcpu_block(struct kvm_vcpu *vcpu)
{
	bool waited = false;
	ktime_t start, cur;
	u64 block_ns;

	start = ktime_get();

	prepare_to_rcuwait(&vcpu->wait);
	for (;;) {
		set_current_state(TASK_INTERRUPTIBLE);

		if (kvm_vcpu_check_block(vcpu) < 0)
			break;

		waited = true;
		schedule();
	}
	finish_rcuwait(&vcpu->wait);

	cur = ktime_get();

	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
}
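The split Sean sketches implies that callers pick an entry point based on why
the vCPU is not runnable. A hypothetical x86-flavored illustration of that
intent (vcpu_wait_for_runnable() is an invented name; the real dispatch would
live in arch code):

/*
 * Assuming the split above: halt-like states go through kvm_vcpu_halt()
 * and get halt polling plus the "halted" stat, while other non-runnable
 * states (e.g. WFS or UNINITIALIZED on x86) block directly and skip both.
 */
static void vcpu_wait_for_runnable(struct kvm_vcpu *vcpu)
{
	if (vcpu->arch.mp_state == KVM_MP_STATE_HALTED)
		kvm_vcpu_halt(vcpu);
	else
		kvm_vcpu_block(vcpu);
}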
Hi Sean,

On Thu, Aug 19, 2021 at 3:37 PM Sean Christopherson <seanjc@google.com> wrote:
>
> On Wed, Aug 18, 2021, Cannon Matthews wrote:
> > Since a guest has explicitly asked for a vcpu to HLT, this is "useful work on
> > behalf of the guest" even though the thread is "blocked" from running.
> >
> > This allows answering questions like, are we spending too much time waiting
> > on mutexes, or long running kernel routines rather than running the vcpu in
> > guest mode, or did the guest explicitly tell us to not do anything.
> >
> > So I would suggest keeping the "halt" part of the counters' name, and
> > removing the "blocked" part rather than the other way around. We explicitly
> > do not want to include non-halt blockages in this.
>
> But this patch does include non-halt blockages, which is why I brought up the
> technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE
> vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call
> this for halt-like behavior, but PPC looks like it has at least one path that's
> not halt-like.
>
> I doubt anyone actually cares if the stat is a misnomer in some cases, but at
> the same time I think there's opportunity for clean up here. E.g. halt polling
> if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls
> to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is
> successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete
> waste of cycles if the vCPU doesn't actually block. And
> kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into
> s390, which can add its own wrapper if necessary.
>
> So with a bit of massaging and a slight change in tracing behavior, I believe
> we can isolate the actual wait/halt and avoid "halted" being technically-wrong,
> and fix some inefficiencies at the same time.
>
> Jing, can you do a v2 of this patch and send it to me off-list? With luck, my
> idea will work and I can fold your patch in, and if not we can always post v2
> standalone in a few weeks.

Of course, will do.

Thanks,
Jing

>
> E.g. I'm thinking something like...
>
> void kvm_vcpu_halt(struct kvm_vcpu *vcpu)
> {
> 	vcpu->stat.generic.halted = 1;
>
> 	if (<halt polling failed>)
> 		kvm_vcpu_block(vcpu);
>
> 	vcpu->stat.generic.halted = 0;
>
> 	<update halt polling stuff>
> }
>
> void kvm_vcpu_block(struct kvm_vcpu *vcpu)
> {
> 	bool waited = false;
> 	ktime_t start, cur;
> 	u64 block_ns;
>
> 	start = ktime_get();
>
> 	prepare_to_rcuwait(&vcpu->wait);
> 	for (;;) {
> 		set_current_state(TASK_INTERRUPTIBLE);
>
> 		if (kvm_vcpu_check_block(vcpu) < 0)
> 			break;
>
> 		waited = true;
> 		schedule();
> 	}
> 	finish_rcuwait(&vcpu->wait);
>
> 	cur = ktime_get();
>
> 	block_ns = ktime_to_ns(cur) - ktime_to_ns(start);
> 	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
> }
On Fri, Aug 20, 2021, Jing Zhang wrote:
> Hi Sean,
>
> On Thu, Aug 19, 2021 at 3:37 PM Sean Christopherson <seanjc@google.com> wrote:
> >
> > On Wed, Aug 18, 2021, Cannon Matthews wrote:
> > > Since a guest has explicitly asked for a vcpu to HLT, this is "useful work on
> > > behalf of the guest" even though the thread is "blocked" from running.
> > >
> > > This allows answering questions like, are we spending too much time waiting
> > > on mutexes, or long running kernel routines rather than running the vcpu in
> > > guest mode, or did the guest explicitly tell us to not do anything.
> > >
> > > So I would suggest keeping the "halt" part of the counters' name, and
> > > removing the "blocked" part rather than the other way around. We explicitly
> > > do not want to include non-halt blockages in this.
> >
> > But this patch does include non-halt blockages, which is why I brought up the
> > technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE
> > vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call
> > this for halt-like behavior, but PPC looks like it has at least one path that's
> > not halt-like.
> >
> > I doubt anyone actually cares if the stat is a misnomer in some cases, but at
> > the same time I think there's opportunity for clean up here. E.g. halt polling
> > if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls
> > to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is
> > successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete
> > waste of cycles if the vCPU doesn't actually block. And
> > kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into
> > s390, which can add its own wrapper if necessary.
> >
> > So with a bit of massaging and a slight change in tracing behavior, I believe
> > we can isolate the actual wait/halt and avoid "halted" being technically-wrong,
> > and fix some inefficiencies at the same time.
> >
> > Jing, can you do a v2 of this patch and send it to me off-list? With luck, my
> > idea will work and I can fold your patch in, and if not we can always post v2
> > standalone in a few weeks.

Circling back to this with fresh eyes, limiting the state to "halted" would be
wrong. I still stand by my assertion that non-halt states such as WFS should
not go through halt polling, but the intent of the proposed stat is to
differentiate between not running a vCPU because of a guest action and not
running a vCPU because the host is not scheduling its task.

E.g. on x86, if a vCPU is put into WFS for an extended time, the vCPU will not
be run because of a guest action, not because of any host activity. But again,
WFS has a very different meaning than "halt", which was the basis for my
original objection to the "halt_block" terminology.

One option would be to invert the stat, e.g. vcpu->stat.runnable, but that has
the downside of needing to be set somewhere outside of kvm_vcpu_block/halt(),
and it would likely be difficult to come up with a name that isn't confusing,
e.g. the vCPU would show up as "runnable" when mp_state!=RUNNABLE and it's not
actively blocking.

Back to bikeshedding the "halted" name, what about "blocking" or "waiting"?
I.e. switch from past tense to present tense to convey that the _vCPU_ is
"actively" blocking/waiting, as opposed to the vCPU being blocked by some host
condition.
On Wed, Sep 22, 2021, Sean Christopherson wrote:
> On Fri, Aug 20, 2021, Jing Zhang wrote:
> > Hi Sean,
> >
> > On Thu, Aug 19, 2021 at 3:37 PM Sean Christopherson <seanjc@google.com> wrote:
> > >
> > > On Wed, Aug 18, 2021, Cannon Matthews wrote:
> > > > Since a guest has explicitly asked for a vcpu to HLT, this is "useful work on
> > > > behalf of the guest" even though the thread is "blocked" from running.
> > > >
> > > > This allows answering questions like, are we spending too much time waiting
> > > > on mutexes, or long running kernel routines rather than running the vcpu in
> > > > guest mode, or did the guest explicitly tell us to not do anything.
> > > >
> > > > So I would suggest keeping the "halt" part of the counters' name, and
> > > > removing the "blocked" part rather than the other way around. We explicitly
> > > > do not want to include non-halt blockages in this.
> > >
> > > But this patch does include non-halt blockages, which is why I brought up the
> > > technically-wrong naming. Specifically, x86 reaches this path for any !RUNNABLE
> > > vCPU state, e.g. if the vCPU is in WFS. Non-x86 usage appears to mostly call
> > > this for halt-like behavior, but PPC looks like it has at least one path that's
> > > not halt-like.
> > >
> > > I doubt anyone actually cares if the stat is a misnomer in some cases, but at
> > > the same time I think there's opportunity for clean up here. E.g. halt polling
> > > if a vCPU is in WFS or UNINITIALIZED is a waste of cycles. Ditto for the calls
> > > to kvm_arch_vcpu_blocking() and kvm_arch_vcpu_unblocking() when halt polling is
> > > successful, e.g. arm64 puts and reloads the vgic, which I assume is a complete
> > > waste of cycles if the vCPU doesn't actually block. And
> > > kvm_arch_vcpu_block_finish() can be dropped by moving the one line of code into
> > > s390, which can add its own wrapper if necessary.
> > >
> > > So with a bit of massaging and a slight change in tracing behavior, I believe
> > > we can isolate the actual wait/halt and avoid "halted" being technically-wrong,
> > > and fix some inefficiencies at the same time.
> > >
> > > Jing, can you do a v2 of this patch and send it to me off-list? With luck, my
> > > idea will work and I can fold your patch in, and if not we can always post v2
> > > standalone in a few weeks.
>
> Circling back to this with fresh eyes, limiting the state to "halted" would be
> wrong. I still stand by my assertion that non-halt states such as WFS should
> not go through halt polling, but the intent of the proposed stat is to
> differentiate between not running a vCPU because of a guest action and not
> running a vCPU because the host is not scheduling its task.
>
> E.g. on x86, if a vCPU is put into WFS for an extended time, the vCPU will not
> be run because of a guest action, not because of any host activity. But again,
> WFS has a very different meaning than "halt", which was the basis for my
> original objection to the "halt_block" terminology.
>
> One option would be to invert the stat, e.g. vcpu->stat.runnable, but that has
> the downside of needing to be set somewhere outside of kvm_vcpu_block/halt(),
> and it would likely be difficult to come up with a name that isn't confusing,
> e.g. the vCPU would show up as "runnable" when mp_state!=RUNNABLE and it's not
> actively blocking.
>
> Back to bikeshedding the "halted" name, what about "blocking" or "waiting"?
> I.e. switch from past tense to present tense to convey that the _vCPU_ is
> "actively" blocking/waiting, as opposed to the vCPU being blocked by some host
> condition.

Doh, forgot to say that "blocking" would be my first choice as it would match
KVM's internal "block" terminology.
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d447b21cdd73..23d2e19af3ce 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1459,7 +1459,9 @@ struct _kvm_stats_desc {
 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_poll_fail_hist,	\
 			HALT_POLL_HIST_COUNT),					\
 	STATS_DESC_LOGHIST_TIME_NSEC(VCPU_GENERIC, halt_wait_hist,		\
-			HALT_POLL_HIST_COUNT)
+			HALT_POLL_HIST_COUNT),					\
+	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_starts),			\
+	STATS_DESC_COUNTER(VCPU_GENERIC, halt_block_ends)

 extern struct dentry *kvm_debugfs_dir;

diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h
index de7fb5f364d8..ea7d26017fa6 100644
--- a/include/linux/kvm_types.h
+++ b/include/linux/kvm_types.h
@@ -93,6 +93,8 @@ struct kvm_vcpu_stat_generic {
 	u64 halt_poll_success_hist[HALT_POLL_HIST_COUNT];
 	u64 halt_poll_fail_hist[HALT_POLL_HIST_COUNT];
 	u64 halt_wait_hist[HALT_POLL_HIST_COUNT];
+	u64 halt_block_starts;
+	u64 halt_block_ends;
 };

 #define KVM_STATS_NAME_SIZE	48

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3e67c93ca403..5c3a21d2fbea 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3206,6 +3206,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	bool waited = false;
 	u64 block_ns;

+	++vcpu->stat.generic.halt_block_starts;
 	kvm_arch_vcpu_blocking(vcpu);

 	start = cur = poll_end = ktime_get();
@@ -3285,6 +3286,7 @@ void kvm_vcpu_block(struct kvm_vcpu *vcpu)
 	trace_kvm_vcpu_wakeup(block_ns, waited, vcpu_valid_wakeup(vcpu));
 	kvm_arch_vcpu_block_finish(vcpu);
+	++vcpu->stat.generic.halt_block_ends;
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_block);
Current guest/host/halt stats don't show when we are currently halting
well. If a guest halts for a long period of time they could appear
pathologically blocked but really it's the opposite there's nothing to
do.

Simply count the number of times we enter and leave the kvm_vcpu_block
function per vcpu, if they are unequal, then a VCPU is currently
halting.

The existing stats like halt_exits and halt_wakeups don't quite capture
this. The time spend halted and halt polling is reported eventually, but
not until we wakeup and resume. If a guest were to indefinitely halt one
of it's CPUs we would never know, it may simply appear blocked.

Original-by: Cannon Matthews <cannonmatthews@google.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
---
 include/linux/kvm_host.h  | 4 +++-
 include/linux/kvm_types.h | 2 ++
 virt/kvm/kvm_main.c       | 2 ++
 3 files changed, 7 insertions(+), 1 deletion(-)

base-commit: a3e0b8bd99ab098514bde2434301fa6fde040da2