Message ID | 1620871189-4763-1-git-send-email-wanpengli@tencent.com (mailing list archive)
---|---
State | New, archived
Series | [v2,1/4] KVM: PPC: Book3S HV: exit halt polling on need_resched() as well
On Wed, May 12, 2021 at 7:01 PM Wanpeng Li <kernellwp@gmail.com> wrote:
>
> From: Wanpeng Li <wanpengli@tencent.com>
>
> In under-committed scenarios, vCPUs can be scheduled easily;
> kvm_vcpu_yield_to adds extra overhead, and we observe many races where
> vcpu->ready is true but the yield fails because p->state is
> TASK_RUNNING. Bail out in such scenarios by checking the length of the
> current CPU's runqueue; this is a hint of under-commitment rather than
> a guarantee of accuracy.
>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
> v1 -> v2:
>  * move the check after attempted counting
>  * update patch description
>
>  arch/x86/kvm/x86.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 9b6bca6..dfb7c32 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8360,6 +8360,9 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
>
>  	vcpu->stat.directed_yield_attempted++;
>
> +	if (single_task_running())
> +		goto no_yield;

Since this is a heuristic, do you have any experimental or real world
results that show the benefit?

> +
>  	rcu_read_lock();
>  	map = rcu_dereference(vcpu->kvm->arch.apic_map);
>
> --
> 2.7.4
On Sat, 15 May 2021 at 05:33, David Matlack <dmatlack@google.com> wrote:
> On Wed, May 12, 2021 at 7:01 PM Wanpeng Li <kernellwp@gmail.com> wrote:
> > [...]
> >  	vcpu->stat.directed_yield_attempted++;
> >
> > +	if (single_task_running())
> > +		goto no_yield;
>
> Since this is a heuristic, do you have any experimental or real world
> results that show the benefit?

I looked at the directed_yield_successful/directed_yield_attempted
ratio; it improves from 50%+ to 80%+ in the under-committed scenario.

    Wanpeng
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9b6bca6..dfb7c32 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8360,6 +8360,9 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id)
 
 	vcpu->stat.directed_yield_attempted++;
 
+	if (single_task_running())
+		goto no_yield;
+
 	rcu_read_lock();
 	map = rcu_dereference(vcpu->kvm->arch.apic_map);
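
For readers outside the kernel tree, here is a minimal, self-contained
userspace sketch of the heuristic. It is not the kernel implementation:
single_task_running() and yield_to_target() are stubs invented for
illustration, and the stat fields mirror directed_yield_attempted /
directed_yield_successful in name only. What it demonstrates is the
control flow the patch introduces: count the attempt first, then skip
the comparatively expensive directed yield when nothing else is running
on the local CPU.

/*
 * Sketch only -- stubs stand in for scheduler and KVM internals.
 */
#include <stdbool.h>
#include <stdio.h>

struct vcpu_stats {
	unsigned long directed_yield_attempted;
	unsigned long directed_yield_successful;
};

/* Stub: pretend the local runqueue holds a single task (under-committed). */
static bool single_task_running(void)
{
	return true;
}

/* Stub: pretend yielding to the target task would succeed. */
static bool yield_to_target(unsigned long dest_id)
{
	(void)dest_id;
	return true;
}

static void sched_yield_sketch(struct vcpu_stats *stat, unsigned long dest_id)
{
	stat->directed_yield_attempted++;

	/*
	 * Heuristic from the patch: if nothing else runs on this CPU,
	 * the target is very likely not preempted, so bail out before
	 * paying for the directed yield.
	 */
	if (single_task_running())
		return;

	if (yield_to_target(dest_id))
		stat->directed_yield_successful++;
}

int main(void)
{
	struct vcpu_stats stat = { 0 };

	sched_yield_sketch(&stat, 1);
	printf("attempted=%lu successful=%lu\n",
	       stat.directed_yield_attempted,
	       stat.directed_yield_successful);
	return 0;
}

Compiled with a plain C compiler this prints "attempted=1 successful=0",
i.e. the under-committed case is counted as an attempt but short-circuits
before the yield, which is the behaviour the v2 change aims for.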