From patchwork Tue Aug 29 11:46:35 2017
X-Patchwork-Submitter: Yang Zhang
X-Patchwork-Id: 9927213
From: Yang Zhang
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, wanpeng.li@hotmail.com, mst@redhat.com,
 pbonzini@redhat.com, tglx@linutronix.de, rkrcmar@redhat.com,
 dmatlack@google.com, agraf@suse.de, peterz@infradead.org,
 linux-doc@vger.kernel.org, Yang Zhang, Quan Xu, Jeremy Fitzhardinge,
 Chris Wright, Alok Kataria, Rusty Russell, Ingo Molnar, "H. Peter Anvin",
 x86@kernel.org, Andy Lutomirski, "Kirill A. Shutemov", Pan Xinhui,
 Kees Cook, virtualization@lists.linux-foundation.org
Subject: [RFC PATCH v2 1/7] x86/paravirt: Add pv_idle_ops to paravirt ops
Date: Tue, 29 Aug 2017 11:46:35 +0000
Message-Id: <1504007201-12904-2-git-send-email-yang.zhang.wz@gmail.com>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1504007201-12904-1-git-send-email-yang.zhang.wz@gmail.com>
References: <1504007201-12904-1-git-send-email-yang.zhang.wz@gmail.com>
X-Mailing-List: kvm@vger.kernel.org

So far, pv_idle_ops.poll is the only op in pv_idle_ops. .poll is called
in the idle path and polls for a while before we enter the real idle
state.

In a virtualized guest, the idle path includes several heavy operations,
including timer access (LAPIC timer or TSC deadline timer), which hurt
performance, especially for latency-sensitive workloads such as
message-passing tasks. The cost mainly comes from the vmexit, which is a
hardware context switch between the VM and the hypervisor. Our solution
is to poll for a while and skip the real idle path if a schedule event
arrives during polling.

Polling may waste CPU, so we adopt a smart polling mechanism to reduce
useless polling.

Signed-off-by: Yang Zhang
Signed-off-by: Quan Xu
Cc: Jeremy Fitzhardinge
Cc: Chris Wright
Cc: Alok Kataria
Cc: Rusty Russell
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: x86@kernel.org
Cc: Peter Zijlstra
Cc: Andy Lutomirski
Cc: "Kirill A. Shutemov"
Cc: Pan Xinhui
Cc: Kees Cook
Cc: virtualization@lists.linux-foundation.org
Cc: linux-kernel@vger.kernel.org
---
 arch/x86/include/asm/paravirt.h       | 5 +++++
 arch/x86/include/asm/paravirt_types.h | 6 ++++++
 arch/x86/kernel/paravirt.c            | 6 ++++++
 3 files changed, 17 insertions(+)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
index 9ccac19..6d46760 100644
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -202,6 +202,11 @@ static inline unsigned long long paravirt_read_pmc(int counter)
 
 #define rdpmcl(counter, val) ((val) = paravirt_read_pmc(counter))
 
+static inline void paravirt_idle_poll(void)
+{
+	PVOP_VCALL0(pv_idle_ops.poll);
+}
+
 static inline void paravirt_alloc_ldt(struct desc_struct *ldt, unsigned entries)
 {
 	PVOP_VCALL2(pv_cpu_ops.alloc_ldt, ldt, entries);
diff --git a/arch/x86/include/asm/paravirt_types.h b/arch/x86/include/asm/paravirt_types.h
index 9ffc36b..cf45726 100644
--- a/arch/x86/include/asm/paravirt_types.h
+++ b/arch/x86/include/asm/paravirt_types.h
@@ -324,6 +324,10 @@ struct pv_lock_ops {
 	struct paravirt_callee_save vcpu_is_preempted;
 } __no_randomize_layout;
 
+struct pv_idle_ops {
+	void (*poll)(void);
+} __no_randomize_layout;
+
 /* This contains all the paravirt structures: we get a convenient
  * number for each function using the offset which we use to indicate
  * what to patch. */
@@ -334,6 +338,7 @@ struct paravirt_patch_template {
 	struct pv_irq_ops pv_irq_ops;
 	struct pv_mmu_ops pv_mmu_ops;
 	struct pv_lock_ops pv_lock_ops;
+	struct pv_idle_ops pv_idle_ops;
 } __no_randomize_layout;
 
 extern struct pv_info pv_info;
@@ -343,6 +348,7 @@ struct paravirt_patch_template {
 extern struct pv_irq_ops pv_irq_ops;
 extern struct pv_mmu_ops pv_mmu_ops;
 extern struct pv_lock_ops pv_lock_ops;
+extern struct pv_idle_ops pv_idle_ops;
 
 #define PARAVIRT_PATCH(x)					\
 	(offsetof(struct paravirt_patch_template, x) / sizeof(void *))
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index bc0a849..1b5b247 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -128,6 +128,7 @@ static void *get_call_destination(u8 type)
 #ifdef CONFIG_PARAVIRT_SPINLOCKS
 		.pv_lock_ops = pv_lock_ops,
 #endif
+		.pv_idle_ops = pv_idle_ops,
 	};
 	return *((void **)&tmpl + type);
 }
@@ -312,6 +313,10 @@ struct pv_time_ops pv_time_ops = {
 	.steal_clock = native_steal_clock,
 };
 
+struct pv_idle_ops pv_idle_ops = {
+	.poll = paravirt_nop,
+};
+
 __visible struct pv_irq_ops pv_irq_ops = {
 	.save_fl = __PV_IS_CALLEE_SAVE(native_save_fl),
 	.restore_fl = __PV_IS_CALLEE_SAVE(native_restore_fl),
@@ -471,3 +476,4 @@ struct pv_mmu_ops pv_mmu_ops __ro_after_init = {
 EXPORT_SYMBOL    (pv_mmu_ops);
 EXPORT_SYMBOL_GPL(pv_info);
 EXPORT_SYMBOL    (pv_irq_ops);
+EXPORT_SYMBOL    (pv_idle_ops);
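
For readers who want to see the intended control flow outside the kernel, the following is a minimal user-space sketch of the hook described in the changelog. It is not code from this series. Only the shape of struct pv_idle_ops, the nop default, and the paravirt_idle_poll() wrapper name come from the patch above. Everything prefixed demo_ (the poll budget, the simulated schedule event, the idle entry function) is a hypothetical stand-in, and the wrapper is a plain indirect call rather than a patched PVOP_VCALL0 site.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same shape as the struct added by the patch: a single poll hook. */
struct pv_idle_ops {
	void (*poll)(void);
};

/* Stand-in for the kernel's paravirt_nop: by default polling does nothing. */
static void demo_nop(void)
{
}

/* Hypothetical state, for illustration only. */
static bool demo_need_resched;            /* set when a schedule event is seen */
static unsigned int demo_budget = 1000;   /* upper bound on poll iterations */
static unsigned int demo_event_at;        /* iteration of the event, 0 = never */
static unsigned int demo_spins;           /* iterations used by the last poll */

/* Hypothetical guest hook: spin up to demo_budget times, stop early on work. */
static void demo_guest_poll(void)
{
	unsigned int i;

	demo_spins = 0;
	for (i = 1; i <= demo_budget; i++) {
		demo_spins = i;
		if (i == demo_event_at) {
			demo_need_resched = true;
			return;
		}
	}
}

static struct pv_idle_ops pv_idle_ops = {
	.poll = demo_nop,
};

/* Plain indirect call; the kernel wrapper uses PVOP_VCALL0 instead. */
static void paravirt_idle_poll(void)
{
	assert(pv_idle_ops.poll != NULL);
	pv_idle_ops.poll();
}

/* Returns true if the expensive idle path would have been entered. */
static bool demo_idle_enter(void)
{
	demo_need_resched = false;
	paravirt_idle_poll();
	if (demo_need_resched)
		return false;	/* work arrived while polling: skip real idle */
	return true;		/* no work: fall through to the real idle state */
}
```

With the default nop hook, demo_idle_enter() always falls through to the real idle state, which matches the behaviour this patch leaves in place. Assigning demo_guest_poll to pv_idle_ops.poll and setting demo_event_at to a value within demo_budget makes it return false, modelling the case where the schedule event arrives during polling and the real idle path is skipped.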