From patchwork Fri Feb 10 15:43:09 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Waiman Long X-Patchwork-Id: 9567013 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 260C4601EA for ; Fri, 10 Feb 2017 15:46:14 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 1339F2858A for ; Fri, 10 Feb 2017 15:46:14 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 077EF285A2; Fri, 10 Feb 2017 15:46:14 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.2 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 6F8E22858A for ; Fri, 10 Feb 2017 15:46:12 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ccDMD-0000WR-Aq; Fri, 10 Feb 2017 15:43:33 +0000 Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xenproject.org with esmtp (Exim 4.84_2) (envelope-from ) id 1ccDMC-0000WL-4O for xen-devel@lists.xenproject.org; Fri, 10 Feb 2017 15:43:32 +0000 Received: from [85.158.137.68] by server-8.bemta-3.messagelabs.com id 67/1D-31649-3AFDD985; Fri, 10 Feb 2017 15:43:31 +0000 X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFjrNLMWRWlGSWpSXmKPExsVysWW7jO6i+3M jDHrOalt83zKZyYHR4/CHKywBjFGsmXlJ+RUJrBm7tyYUPNev2PvtJUsDY7dGFyMXh5DAXCaJ 6y8/MHUxcgI5Kxgldn1yBbHZBNQk/tzqZAUpEhFYxyQx4cIfRhCHWWAns8SpC5OYuxg5OIQFw iUeNGSBNLAIqEq82n2ZFcTmFXCUuPuolQXElhDQljh2/ikbSK+EQB+jRNOfP+wTGLkWMDKsYt QoTi0qSy3SNTLVSyrKTM8oyU3MzNE1NDDWy00tLk5MT81JTCrWS87P3cQI9GQ9AwPjDsbWE36 HGCU5mJREeQVOz40Q4kvKT6nMSCzOiC8qzUktPsQow8GhJME7/R5QTrAoNT21Ii0zBxhSMGkJ Dh4lEd7DIGne4oLE3OLMdIjUKUZFKXHeKSAJAZBERmkeXBssjC8xykoJ8zIyMDAI8RSkFuVml qDKv2IU52BUEuZ9BjKFJzOvBG76K6DFTECLr5+eBbK4JBEhJdXAOJGZ77jIHR9Oed7e7kO8r3 NPbrhy/gCHdNv/kqy/uhEcV1d3sWWnmLUs/zq7KEZPqUvqetis4y7/YytDAw8rV/TnJbjNnc+ xaU2/7hJvCYU9/x9/WXGt+8Mpv5A/L//3nJr7jFlRfOaGx+Hzbm3KVW7c8ubg2pn9z49wPX3M azLt6MvHW/Rv3FFiKc5INNRiLipOBAAkStqRXgIAAA== X-Env-Sender: longman@redhat.com X-Msg-Ref: server-14.tower-31.messagelabs.com!1486741409!84988917!1 X-Originating-IP: [209.132.183.28] X-SpamReason: No, hits=0.0 required=7.0 tests=sa_preprocessor: VHJ1c3RlZCBJUDogMjA5LjEzMi4xODMuMjggPT4gNTQwNjQ=\n X-StarScan-Received: X-StarScan-Version: 9.1.1; banners=-,-,- X-VirusChecked: Checked Received: (qmail 3477 invoked from network); 10 Feb 2017 15:43:30 -0000 Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by server-14.tower-31.messagelabs.com with DHE-RSA-AES256-GCM-SHA384 encrypted SMTP; 10 Feb 2017 15:43:30 -0000 Received: from smtp.corp.redhat.com (int-mx16.intmail.prod.int.phx2.redhat.com [10.5.11.28]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 9F45AC04B927; Fri, 10 Feb 2017 15:43:28 +0000 (UTC) Received: from llong.com (dhcp-17-128.bos.redhat.com [10.18.17.128]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8A798B95A2; Fri, 10 Feb 2017 15:43:24 +0000 (UTC) From: Waiman Long To: Jeremy Fitzhardinge , Chris Wright , Alok Kataria , Rusty Russell , Peter Zijlstra , Ingo Molnar , Thomas Gleixner , "H. Peter Anvin" Date: Fri, 10 Feb 2017 10:43:09 -0500 Message-Id: <1486741389-8513-1-git-send-email-longman@redhat.com> X-Scanned-By: MIMEDefang 2.74 on 10.5.11.28 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Fri, 10 Feb 2017 15:43:29 +0000 (UTC) Cc: linux-arch@vger.kernel.org, Juergen Gross , kvm@vger.kernel.org, =?UTF-8?q?Radim=20Kr=C4=8Dm=C3=A1=C5=99?= , Pan Xinhui , x86@kernel.org, linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, Waiman Long , Paolo Bonzini , xen-devel@lists.xenproject.org, Boris Ostrovsky Subject: [Xen-devel] [PATCH v2] x86/paravirt: Don't make vcpu_is_preempted() a callee-save function X-BeenThere: xen-devel@lists.xen.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: xen-devel-bounces@lists.xen.org Sender: "Xen-devel" X-Virus-Scanned: ClamAV using ClamSMTP It was found when running fio sequential write test with a XFS ramdisk on a VM running on a 2-socket x86-64 system, the %CPU times as reported by perf were as follows: 69.75% 0.59% fio [k] down_write 69.15% 0.01% fio [k] call_rwsem_down_write_failed 67.12% 1.12% fio [k] rwsem_down_write_failed 63.48% 52.77% fio [k] osq_lock 9.46% 7.88% fio [k] __raw_callee_save___kvm_vcpu_is_preempt 3.93% 3.93% fio [k] __kvm_vcpu_is_preempted Making vcpu_is_preempted() a callee-save function has a relatively high cost on x86-64 primarily due to at least one more cacheline of data access from the saving and restoring of registers (8 of them) to and from stack as well as one more level of function call. As vcpu_is_preempted() is called within the spinlock, mutex and rwsem slowpaths, there isn't much to gain by making it callee-save. So it is now changed to a normal function call instead. With this patch applied on both bare-metal & KVM guest on a 2-socekt 16-core 32-thread system with 16 parallel jobs (8 on each socket), the aggregrate bandwidth of the fio test on an XFS ramdisk were as follows: Bare Metal KVM Guest I/O Type w/o patch with patch w/o patch with patch -------- --------- ---------- --------- ---------- random read 8650.5 MB/s 8560.9 MB/s 7602.9 MB/s 8196.1 MB/s seq read 9104.8 MB/s 9397.2 MB/s 8293.7 MB/s 8566.9 MB/s random write 1623.8 MB/s 1626.7 MB/s 1590.6 MB/s 1700.7 MB/s seq write 1626.4 MB/s 1624.9 MB/s 1604.8 MB/s 1726.3 MB/s The perf data (on KVM guest) now became: 70.78% 0.58% fio [k] down_write 70.20% 0.01% fio [k] call_rwsem_down_write_failed 69.70% 1.17% fio [k] rwsem_down_write_failed 59.91% 55.42% fio [k] osq_lock 10.14% 10.14% fio [k] __kvm_vcpu_is_preempted On bare metal, the patch doesn't introduce any performance regression. On KVM guest, it produces noticeable performance improvement (up to 7%). Signed-off-by: Waiman Long Acked-by: Paolo Bonzini --- v1->v2: - Rerun the fio test on a different system on both bare-metal and a KVM guest. Both sockets were utilized in this test. - The commit log was updated with new performance numbers, but the patch wasn't changed. - Drop patch 2. arch/x86/include/asm/paravirt.h | 2 +- arch/x86/include/asm/paravirt_types.h | 2 +- arch/x86/kernel/kvm.c | 7 ++----- arch/x86/kernel/paravirt-spinlocks.c | 6 ++---- arch/x86/xen/spinlock.c | 4 +--- 5 files changed, 7 insertions(+), 14 deletions(-) diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h index 864f57b..2515885 100644 --- a/arch/x86/include/asm/paravirt.h +++ b/arch/x86/include/asm/paravirt.h @@ -676,7 +676,7 @@ static __always_inline void pv_kick(int cpu) static __always_inline bool pv_vcpu_is_preempted(int cpu) { - return PVOP_CALLEE1(bool, pv_lock_ops.vcpu_is_preempted, cpu); + return PVOP_CALL1(bool, pv_lock_ops.vcpu_is_preempted, cpu); } #endif /* SMP && PARAVIRT_SPINLOCKS */ diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h index bb2de45..88dc852 100644 --- a/arch/x86/include/asm/paravirt_types.h +++ b/arch/x86/include/asm/paravirt_types.h @@ -309,7 +309,7 @@ struct pv_lock_ops { void (*wait)(u8 *ptr, u8 val); void (*kick)(int cpu); - struct paravirt_callee_save vcpu_is_preempted; + bool (*vcpu_is_preempted)(int cpu); }; /* This contains all the paravirt structures: we get a convenient diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 099fcba..eb3753d 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -595,7 +595,6 @@ __visible bool __kvm_vcpu_is_preempted(int cpu) return !!src->preempted; } -PV_CALLEE_SAVE_REGS_THUNK(__kvm_vcpu_is_preempted); /* * Setup pv_lock_ops to exploit KVM_FEATURE_PV_UNHALT if present. @@ -614,10 +613,8 @@ void __init kvm_spinlock_init(void) pv_lock_ops.wait = kvm_wait; pv_lock_ops.kick = kvm_kick_cpu; - if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) { - pv_lock_ops.vcpu_is_preempted = - PV_CALLEE_SAVE(__kvm_vcpu_is_preempted); - } + if (kvm_para_has_feature(KVM_FEATURE_STEAL_TIME)) + pv_lock_ops.vcpu_is_preempted = __kvm_vcpu_is_preempted; } #endif /* CONFIG_PARAVIRT_SPINLOCKS */ diff --git a/arch/x86/kernel/paravirt-spinlocks.c b/arch/x86/kernel/paravirt-spinlocks.c index 6259327..da050bc 100644 --- a/arch/x86/kernel/paravirt-spinlocks.c +++ b/arch/x86/kernel/paravirt-spinlocks.c @@ -24,12 +24,10 @@ __visible bool __native_vcpu_is_preempted(int cpu) { return false; } -PV_CALLEE_SAVE_REGS_THUNK(__native_vcpu_is_preempted); bool pv_is_native_vcpu_is_preempted(void) { - return pv_lock_ops.vcpu_is_preempted.func == - __raw_callee_save___native_vcpu_is_preempted; + return pv_lock_ops.vcpu_is_preempted == __native_vcpu_is_preempted; } struct pv_lock_ops pv_lock_ops = { @@ -38,7 +36,7 @@ struct pv_lock_ops pv_lock_ops = { .queued_spin_unlock = PV_CALLEE_SAVE(__native_queued_spin_unlock), .wait = paravirt_nop, .kick = paravirt_nop, - .vcpu_is_preempted = PV_CALLEE_SAVE(__native_vcpu_is_preempted), + .vcpu_is_preempted = __native_vcpu_is_preempted, #endif /* SMP */ }; EXPORT_SYMBOL(pv_lock_ops); diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c index 25a7c43..c85bb8f 100644 --- a/arch/x86/xen/spinlock.c +++ b/arch/x86/xen/spinlock.c @@ -114,8 +114,6 @@ void xen_uninit_lock_cpu(int cpu) per_cpu(irq_name, cpu) = NULL; } -PV_CALLEE_SAVE_REGS_THUNK(xen_vcpu_stolen); - /* * Our init of PV spinlocks is split in two init functions due to us * using paravirt patching and jump labels patching and having to do @@ -138,7 +136,7 @@ void __init xen_init_spinlocks(void) pv_lock_ops.queued_spin_unlock = PV_CALLEE_SAVE(__pv_queued_spin_unlock); pv_lock_ops.wait = xen_qlock_wait; pv_lock_ops.kick = xen_qlock_kick; - pv_lock_ops.vcpu_is_preempted = PV_CALLEE_SAVE(xen_vcpu_stolen); + pv_lock_ops.vcpu_is_preempted = xen_vcpu_stolen; } static __init int xen_parse_nopvspin(char *arg)