[v2,3/3] KVM: vCPU kick tax cut for running vCPU

Message ID 1633770532-23664-3-git-send-email-wanpengli@tencent.com (mailing list archive)
State New, archived
Series [v2,1/3] KVM: emulate: Don't inject #GP when emulating RDMPC if CR0.PE=0

Commit Message

Wanpeng Li Oct. 9, 2021, 9:08 a.m. UTC
From: Wanpeng Li <wanpengli@tencent.com>

Sometimes a vCPU kick follows a pending request even when @vcpu is the
running vCPU. The kick then pays for both rcuwait_wake_up(), which has
rcu/memory barrier operations, and cmpxchg(). Let's check vcpu->wait
before rcuwait_wake_up(), and check whether @vcpu is the running vCPU
before cmpxchg(), to cut this overhead.

We evaluated kvm-unit-test/vmexit.flat on an Intel ICX box; most of the
scores improve by ~600 CPU cycles, especially when APICv is disabled.

tscdeadline_immed
tscdeadline
self_ipi_sti_nop
..............
x2apic_self_ipi_tpr_sti_hlt

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
v1 -> v2:
 * move checking running vCPU logic to kvm_vcpu_kick
 * check rcuwait_active(&vcpu->wait) etc

 virt/kvm/kvm_main.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Sean Christopherson Oct. 15, 2021, 11:26 p.m. UTC | #1
On Sat, Oct 09, 2021, Wanpeng Li wrote:
> From: Wanpeng Li <wanpengli@tencent.com>
> 
> Sometimes a vCPU kick follows a pending request even when @vcpu is the
> running vCPU. The kick then pays for both rcuwait_wake_up(), which has
> rcu/memory barrier operations, and cmpxchg(). Let's check vcpu->wait
> before rcuwait_wake_up(), and check whether @vcpu is the running vCPU
> before cmpxchg(), to cut this overhead.
> 
> We evaluated kvm-unit-test/vmexit.flat on an Intel ICX box; most of the
> scores improve by ~600 CPU cycles, especially when APICv is disabled.
> 
> tscdeadline_immed
> tscdeadline
> self_ipi_sti_nop
> ..............
> x2apic_self_ipi_tpr_sti_hlt
> 
> Suggested-by: Sean Christopherson <seanjc@google.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
> v1 -> v2:
>  * move checking running vCPU logic to kvm_vcpu_kick
>  * check rcuwait_active(&vcpu->wait) etc
> 
>  virt/kvm/kvm_main.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 7851f3a1b5f7..18209d7b3711 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3314,8 +3314,15 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
>  {
>  	int me, cpu;
>  
> -	if (kvm_vcpu_wake_up(vcpu))
> -		return;
> +	me = get_cpu();
> +
> +	if (rcuwait_active(&vcpu->wait) && kvm_vcpu_wake_up(vcpu))

This needs to use kvm_arch_vcpu_get_wait(), not vcpu->wait, because PPC has some
funky wait stuff.

One potential issue I didn't think of before.  rcuwait_active() comes with the
below warning, which means we might be at risk of a false negative that could
result in a missed wakeup.  I'm not positive on that though.

/*
 * Note: this provides no serialization and, just as with waitqueues,
 * requires care to estimate as to whether or not the wait is active.
 */

> +		goto out;
> +
> +	if (vcpu == __this_cpu_read(kvm_running_vcpu)) {
> +		WARN_ON_ONCE(vcpu->mode == IN_GUEST_MODE);
> +		goto out;
> +	}
>  
>  	/*
>  	 * Note, the vCPU could get migrated to a different pCPU at any point
> @@ -3324,12 +3331,12 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
>  	 * IPI is to force the vCPU to leave IN_GUEST_MODE, and migrating the
>  	 * vCPU also requires it to leave IN_GUEST_MODE.
>  	 */
> -	me = get_cpu();
>  	if (kvm_arch_vcpu_should_kick(vcpu)) {
>  		cpu = READ_ONCE(vcpu->cpu);
>  		if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
>  			smp_send_reschedule(cpu);
>  	}
> +out:
>  	put_cpu();
>  }
>  EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
> -- 
> 2.25.1
>
Wanpeng Li Oct. 16, 2021, 2:48 a.m. UTC | #2
On Sat, 16 Oct 2021 at 07:26, Sean Christopherson <seanjc@google.com> wrote:
>
> On Sat, Oct 09, 2021, Wanpeng Li wrote:
> > From: Wanpeng Li <wanpengli@tencent.com>
> >
> > Sometimes a vCPU kick follows a pending request even when @vcpu is the
> > running vCPU. The kick then pays for both rcuwait_wake_up(), which has
> > rcu/memory barrier operations, and cmpxchg(). Let's check vcpu->wait
> > before rcuwait_wake_up(), and check whether @vcpu is the running vCPU
> > before cmpxchg(), to cut this overhead.
> >
> > We evaluated kvm-unit-test/vmexit.flat on an Intel ICX box; most of the
> > scores improve by ~600 CPU cycles, especially when APICv is disabled.
> >
> > tscdeadline_immed
> > tscdeadline
> > self_ipi_sti_nop
> > ..............
> > x2apic_self_ipi_tpr_sti_hlt
> >
> > Suggested-by: Sean Christopherson <seanjc@google.com>
> > Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> > ---
> > v1 -> v2:
> >  * move checking running vCPU logic to kvm_vcpu_kick
> >  * check rcuwait_active(&vcpu->wait) etc
> >
> >  virt/kvm/kvm_main.c | 13 ++++++++++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> >
> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > index 7851f3a1b5f7..18209d7b3711 100644
> > --- a/virt/kvm/kvm_main.c
> > +++ b/virt/kvm/kvm_main.c
> > @@ -3314,8 +3314,15 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
> >  {
> >       int me, cpu;
> >
> > -     if (kvm_vcpu_wake_up(vcpu))
> > -             return;
> > +     me = get_cpu();
> > +
> > +     if (rcuwait_active(&vcpu->wait) && kvm_vcpu_wake_up(vcpu))
>
> This needs to use kvm_arch_vcpu_get_wait(), not vcpu->wait, because PPC has some
> funky wait stuff.
>
> One potential issue I didn't think of before.  rcuwait_active() comes with the
> below warning, which means we might be at risk of a false negative that could
> > result in a missed wakeup.  I'm not positive on that though.

There is only ever a single waiting vCPU. An event is requested before
kicking the sleeping vCPU, and the vCPU checks for it after setting
vcpu->wait to its task. I can't find a scenario that could result in a
missed wakeup.

    Wanpeng

>
> /*
>  * Note: this provides no serialization and, just as with waitqueues,
>  * requires care to estimate as to whether or not the wait is active.
>  */
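
The ordering argument in the exchange above (the request is made before the
kick, and the sleeping side checks for it only after publishing itself in
vcpu->wait) can be illustrated with a small user-space model. This is only a
sketch under stated assumptions: every name in it is hypothetical, it is not
kernel code, and it assumes each step takes effect in program order
(sequential consistency). It therefore says nothing about whether the
barriers around rcuwait_active() are sufficient on real hardware, which is
the question raised in the review.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these are not kernel data structures. */
struct wake_model {
	bool waiting;            /* stands in for vcpu->wait pointing at a task */
	bool request;            /* stands in for a pending vCPU request */
	bool waiter_saw_request; /* result of the waiter's check */
	bool kicker_saw_waiter;  /* result of the rcuwait_active()-style check */
};

/* Waiter: two steps, either publish-then-check or check-then-publish. */
static void waiter_step(struct wake_model *m, int step, bool publish_first)
{
	assert(step == 0 || step == 1);
	bool publish = publish_first ? (step == 0) : (step == 1);

	if (publish)
		m->waiting = true;
	else
		m->waiter_saw_request = m->request;
}

/* Kicker: make the request first, then look for a waiter. */
static void kicker_step(struct wake_model *m, int step)
{
	assert(step == 0 || step == 1);
	if (step == 0)
		m->request = true;
	else
		m->kicker_saw_waiter = m->waiting;
}

static int bits_set(unsigned int mask)
{
	int n = 0;

	for (; mask; mask >>= 1)
		n += mask & 1u;
	return n;
}

/*
 * Enumerate all six interleavings of the two 2-step sequences and count
 * those where the wakeup is missed: the waiter saw no request (so it
 * would sleep) and the kicker saw no waiter (so it would skip the wake).
 */
static int count_missed_wakeups(bool publish_first)
{
	int missed = 0;

	for (unsigned int mask = 0; mask < 16; mask++) {
		if (bits_set(mask) != 2)
			continue; /* the waiter must own exactly two slots */

		struct wake_model m = { 0 };
		int w = 0, k = 0;

		for (int slot = 0; slot < 4; slot++) {
			if (mask & (1u << slot))
				waiter_step(&m, w++, publish_first);
			else
				kicker_step(&m, k++);
		}
		if (!m.waiter_saw_request && !m.kicker_saw_waiter)
			missed++;
	}
	return missed;
}
```

In this model count_missed_wakeups(true) returns 0, while
count_missed_wakeups(false) returns 1: if the waiter checks for the request
before publishing itself, the interleaving "waiter checks, kicker requests,
kicker looks, waiter publishes" loses the wakeup. Any reordering that makes
the real code behave like the second case is what would produce the false
negative.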
Patch

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7851f3a1b5f7..18209d7b3711 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3314,8 +3314,15 @@  void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 {
 	int me, cpu;
 
-	if (kvm_vcpu_wake_up(vcpu))
-		return;
+	me = get_cpu();
+
+	if (rcuwait_active(&vcpu->wait) && kvm_vcpu_wake_up(vcpu))
+		goto out;
+
+	if (vcpu == __this_cpu_read(kvm_running_vcpu)) {
+		WARN_ON_ONCE(vcpu->mode == IN_GUEST_MODE);
+		goto out;
+	}
 
 	/*
 	 * Note, the vCPU could get migrated to a different pCPU at any point
@@ -3324,12 +3331,12 @@  void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 	 * IPI is to force the vCPU to leave IN_GUEST_MODE, and migrating the
 	 * vCPU also requires it to leave IN_GUEST_MODE.
 	 */
-	me = get_cpu();
 	if (kvm_arch_vcpu_should_kick(vcpu)) {
 		cpu = READ_ONCE(vcpu->cpu);
 		if (cpu != me && (unsigned)cpu < nr_cpu_ids && cpu_online(cpu))
 			smp_send_reschedule(cpu);
 	}
+out:
 	put_cpu();
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
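
As a rough illustration of the control flow the patch gives kvm_vcpu_kick(),
here is a user-space sketch of the decision order only. The struct, enum and
function names are hypothetical, the boolean fields merely stand in for the
results of the kernel helpers named in the comments, and all CPUs are assumed
to be online (the cpu_online() check is left out). It is not the kernel
implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of one kick, for illustration only. */
enum kick_action {
	KICK_WOKE_SLEEPER,  /* a blocked vCPU was woken, nothing else to do */
	KICK_SKIPPED_SELF,  /* @vcpu is the running vCPU on this pCPU */
	KICK_SENT_IPI,      /* a reschedule IPI would be sent */
	KICK_NOTHING,       /* no wakeup and no IPI needed */
};

/* Hypothetical snapshot of the inputs the patched function looks at. */
struct vcpu_model {
	bool wait_active;     /* models rcuwait_active(&vcpu->wait) */
	bool wake_up_ok;      /* models kvm_vcpu_wake_up() returning true */
	bool is_running_here; /* models vcpu == this pCPU's kvm_running_vcpu */
	bool should_kick;     /* models kvm_arch_vcpu_should_kick() */
	int cpu;              /* models READ_ONCE(vcpu->cpu) */
};

static enum kick_action model_kick(const struct vcpu_model *v, int me,
				   int nr_cpu_ids)
{
	/* Cheap check first: only attempt the wakeup if someone is waiting. */
	if (v->wait_active && v->wake_up_ok)
		return KICK_WOKE_SLEEPER;

	/* The running vCPU on this pCPU needs no kick; skip the later check. */
	if (v->is_running_here)
		return KICK_SKIPPED_SELF;

	/* Otherwise fall through to the original IPI logic. */
	if (v->should_kick && v->cpu != me &&
	    (unsigned int)v->cpu < (unsigned int)nr_cpu_ids)
		return KICK_SENT_IPI;

	return KICK_NOTHING;
}
```

For example, a model vCPU with is_running_here set yields KICK_SKIPPED_SELF
regardless of should_kick, which corresponds to the early "goto out" in the
patch that avoids the cmpxchg() mentioned in the commit message.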