Message ID | 1621339235-11131-2-git-send-email-wanpengli@tencent.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v4,1/5] KVM: exit halt polling on need_resched() for both book3s and generic halt-polling | expand |
On Tue, May 18, 2021, Wanpeng Li wrote: > From: Wanpeng Li <wanpengli@tencent.com> > > In case of under-committed scenarios, vCPU can get scheduling easily, > kvm_vcpu_yield_to add extra overhead, we can observe a lot of race > between vcpu->ready is true and yield fails due to p->state is > TASK_RUNNING. Let's bail out in such scenarios by checking the length > of current cpu runqueue, it can be treated as a hint of under-committed > instead of guarantee of accuracy. The directed_yield_successful/attempted > ratio can be improved from 50+% to 80+% in the under-committed scenario. The "50+% to 80+%" comment will be a bit confusing for future readers now that the single_task_running() case counts as an attempt. I think the new comment would be something like "30%+ of directed-yield attempts can avoid the expensive lookups in kvm_sched_yield() in an under-committed scenario." That would also provide the real justification, as bumping the success ratio isn't the true goal of this path. > Signed-off-by: Wanpeng Li <wanpengli@tencent.com> > --- > v2 -> v3: > * update patch description > v1 -> v2: > * move the check after attempted counting > * update patch description > > arch/x86/kvm/x86.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 9b6bca616929..dfb7c320581f 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -8360,6 +8360,9 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id) > > vcpu->stat.directed_yield_attempted++; > > + if (single_task_running()) > + goto no_yield; > + > rcu_read_lock(); > map = rcu_dereference(vcpu->kvm->arch.apic_map); > > -- > 2.25.1 >
On Wed, 19 May 2021 at 03:21, Sean Christopherson <seanjc@google.com> wrote: > > On Tue, May 18, 2021, Wanpeng Li wrote: > > From: Wanpeng Li <wanpengli@tencent.com> > > > > In case of under-committed scenarios, vCPU can get scheduling easily, > > kvm_vcpu_yield_to add extra overhead, we can observe a lot of race > > between vcpu->ready is true and yield fails due to p->state is > > TASK_RUNNING. Let's bail out in such scenarios by checking the length > > of current cpu runqueue, it can be treated as a hint of under-committed > > instead of guarantee of accuracy. The directed_yield_successful/attempted > > ratio can be improved from 50+% to 80+% in the under-committed scenario. > > The "50+% to 80+%" comment will be a bit confusing for future readers now that > the single_task_running() case counts as an attempt. I think the new comment > would be something like "30%+ of directed-yield attempts can avoid the expensive > lookups in kvm_sched_yield() in an under-committed scenario." That would also > provide the real justification, as bumping the success ratio isn't the true goal > of this path. Looks good. Hope Paolo can update the patch description when applying. :) "In case of under-committed scenarios, vCPU can get scheduling easily, kvm_vcpu_yield_to add extra overhead, we can observe a lot of races between vcpu->ready is true and yield fails due to p->state is TASK_RUNNING. Let's bail out in such scenarios by checking the length of current cpu runqueue, it can be treated as a hint of under-committed instead of guaranteeing accuracy. 30%+ of directed-yield attempts can avoid the expensive lookups in kvm_sched_yield() in an under-committed scenario. "
On 19/05/21 04:57, Wanpeng Li wrote: > Looks good. Hope Paolo can update the patch description when applying.:) > > "In case of under-committed scenarios, vCPU can get scheduling easily, > kvm_vcpu_yield_to add extra overhead, we can observe a lot of races > between vcpu->ready is true and yield fails due to p->state is > TASK_RUNNING. Let's bail out in such scenarios by checking the length > of current cpu runqueue, it can be treated as a hint of under-committed > instead of guaranteeing accuracy. 30%+ of directed-yield attempts can > avoid the expensive lookups in kvm_sched_yield() in an under-committed > scenario. " > Here is what I used: In case of under-committed scenarios, vCPUs can be scheduled easily; kvm_vcpu_yield_to adds extra overhead, and it is also common to see when vcpu->ready is true but yield later failing due to p->state is TASK_RUNNING. Let's bail out in such scenarios by checking the length of current cpu runqueue, which can be treated as a hint of under-committed instead of guarantee of accuracy. 30%+ of directed-yield attempts can now avoid the expensive lookups in kvm_sched_yield() in an under-committed scenario. Paolo
On Mon, 24 May 2021 at 21:42, Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 19/05/21 04:57, Wanpeng Li wrote: > > Looks good. Hope Paolo can update the patch description when applying.:) > > > > "In case of under-committed scenarios, vCPU can get scheduling easily, > > kvm_vcpu_yield_to add extra overhead, we can observe a lot of races > > between vcpu->ready is true and yield fails due to p->state is > > TASK_RUNNING. Let's bail out in such scenarios by checking the length > > of current cpu runqueue, it can be treated as a hint of under-committed > > instead of guaranteeing accuracy. 30%+ of directed-yield attempts can > > avoid the expensive lookups in kvm_sched_yield() in an under-committed > > scenario. " > > > > Here is what I used: > > In case of under-committed scenarios, vCPUs can be scheduled easily; > kvm_vcpu_yield_to adds extra overhead, and it is also common to see > when vcpu->ready is true but yield later failing due to p->state is > TASK_RUNNING. > > Let's bail out in such scenarios by checking the length of current cpu > runqueue, which can be treated as a hint of under-committed instead of > guarantee of accuracy. 30%+ of directed-yield attempts can now avoid > the expensive lookups in kvm_sched_yield() in an under-committed scenario. Thanks Paolo! :) Wanpeng
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 9b6bca616929..dfb7c320581f 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8360,6 +8360,9 @@ static void kvm_sched_yield(struct kvm_vcpu *vcpu, unsigned long dest_id) vcpu->stat.directed_yield_attempted++; + if (single_task_running()) + goto no_yield; + rcu_read_lock(); map = rcu_dereference(vcpu->kvm->arch.apic_map);