From patchwork Sun Oct 16 03:21:41 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Michael S. Tsirkin"
X-Patchwork-Id: 9377989
Date: Sun, 16 Oct 2016 06:21:41 +0300
From: "Michael S. Tsirkin"
To: Paolo Bonzini
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org, rkrcmar@redhat.com,
 yang.zhang.wz@gmail.com, feng.wu@intel.com
Subject: Re: [PATCH 1/5] KVM: x86: avoid atomic operations on APICv vmentry
Message-ID: <20161016060320-mutt-send-email-mst@kernel.org>
References: <1476469291-5039-1-git-send-email-pbonzini@redhat.com>
 <1476469291-5039-2-git-send-email-pbonzini@redhat.com>
Content-Disposition: inline
In-Reply-To: <1476469291-5039-2-git-send-email-pbonzini@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

On Fri, Oct 14, 2016 at 08:21:27PM +0200, Paolo Bonzini wrote:
> On some benchmarks (e.g. netperf with ioeventfd disabled), APICv
> posted interrupts turn out to be slower than interrupt injection via
> KVM_REQ_EVENT.
>
> This patch optimizes a bit the IRR update, avoiding expensive atomic
> operations in the common case where PI.ON=0 at vmentry or the PIR vector
> is mostly zero.  This saves at least 20 cycles (1%) per vmexit, as
> measured by kvm-unit-tests' inl_from_qemu test (20 runs):
>
>           | enable_apicv=1  |  enable_apicv=0
>           | mean     stdev  |  mean     stdev
> ----------|-----------------|------------------
> before    | 5826     32.65  |  5765     47.09
> after     | 5809     43.42  |  5777     77.02
>
> Of course, any change in the right column is just placebo effect. :)
> The savings are bigger if interrupts are frequent.
>
> Signed-off-by: Paolo Bonzini
> ---
>  arch/x86/kvm/lapic.c | 6 ++++--
>  arch/x86/kvm/vmx.c   | 9 ++++++++-
>  2 files changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 23b99f305382..63a442aefc12 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -342,9 +342,11 @@ void __kvm_apic_update_irr(u32 *pir, void *regs)
>  	u32 i, pir_val;
>
>  	for (i = 0; i <= 7; i++) {
> -		pir_val = xchg(&pir[i], 0);
> -		if (pir_val)
> +		pir_val = READ_ONCE(pir[i]);
> +		if (pir_val) {
> +			pir_val = xchg(&pir[i], 0);
>  			*((u32 *)(regs + APIC_IRR + i * 0x10)) |= pir_val;
> +		}
>  	}
>  }
>  EXPORT_SYMBOL_GPL(__kvm_apic_update_irr);

gcc doesn't seem to unroll this loop, and it's probably worth unrolling.

The following seems to do the trick for me on upstream. I didn't
benchmark it, though.

Is there a kvm unit test for interrupts?

--->

kvm: unroll the loop in __kvm_apic_update_irr.

This is a hot data path in interrupt-rich workloads, worth unrolling.

Signed-off-by: Michael S. Tsirkin

---

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b62c852..0c3462c 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -337,7 +337,8 @@ static u8 count_vectors(void *bitmap)
 	return count;
 }
 
-void __kvm_apic_update_irr(u32 *pir, void *regs)
+void __attribute__((optimize("unroll-loops")))
+__kvm_apic_update_irr(u32 *pir, void *regs)
 {
 	u32 i, pir_val;
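
For anyone who wants to poke at the pattern outside the kernel, here is a
minimal user-space sketch of the two ideas in this thread: read each PIR word
with a plain load and only do the atomic exchange when it is non-zero, and ask
gcc to unroll the eight-iteration loop. It is not the kernel code. The names
pir_to_irr, PIR_WORDS and UNROLL_HINT are made up for illustration, C11 atomics
stand in for READ_ONCE() and xchg(), and the IRR is modelled as a flat array
rather than registers at regs + APIC_IRR + i * 0x10. The returned exchange
count exists only so the fast path can be observed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define PIR_WORDS 8 /* 256 vectors / 32 bits per word, as in the quoted loop */

/* gcc understands the optimize attribute; other compilers get a no-op. */
#if defined(__GNUC__) && !defined(__clang__)
#define UNROLL_HINT __attribute__((optimize("unroll-loops")))
#else
#define UNROLL_HINT
#endif

/*
 * Hypothetical user-space analogue of __kvm_apic_update_irr().
 * Moves pending bits from pir[] into irr[] and returns how many
 * atomic exchanges were actually performed.
 */
UNROLL_HINT
unsigned int pir_to_irr(_Atomic uint32_t *pir, uint32_t *irr)
{
	unsigned int i, xchg_count = 0;

	assert(pir && irr);

	for (i = 0; i < PIR_WORDS; i++) {
		/* Cheap relaxed load first; stands in for READ_ONCE(). */
		uint32_t val = atomic_load_explicit(&pir[i],
						    memory_order_relaxed);

		if (val) {
			/* Pay for the locked exchange only when bits are set. */
			val = atomic_exchange(&pir[i], 0);
			irr[i] |= val;
			xchg_count++;
		}
	}
	return xchg_count;
}
```

With an all-zero pir[] the function performs no exchanges at all, which is the
common case the quoted patch targets. Whether the unroll hint pays off would
still need measuring, since none of this has been benchmarked.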