From patchwork Tue Feb 16 10:14:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 8323741
Return-Path:
X-Original-To: patchwork-xen-devel@patchwork.kernel.org
Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 7ED4C9F372 for ; Tue, 16 Feb 2016 10:17:34 +0000 (UTC)
Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 646C220295 for ; Tue, 16 Feb 2016 10:17:33 +0000 (UTC)
Received: from lists.xen.org (lists.xenproject.org [50.57.142.19]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3A87A20270 for ; Tue, 16 Feb 2016 10:17:32 +0000 (UTC)
Received: from localhost ([127.0.0.1] helo=lists.xen.org) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aVcei-0005A0-DO; Tue, 16 Feb 2016 10:14:52 +0000
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1aVceg-00057d-E1 for xen-devel@lists.xenproject.org; Tue, 16 Feb 2016 10:14:50 +0000
Received: from [85.158.137.68] by server-6.bemta-3.messagelabs.com id 3A/E3-08479-996F2C65; Tue, 16 Feb 2016 10:14:49 +0000
X-Env-Sender: JBeulich@suse.com
X-Msg-Ref: server-12.tower-31.messagelabs.com!1455617686!22648998!1
X-Originating-IP: [137.65.248.74]
X-SpamReason: No, hits=0.0 required=7.0 tests=
X-StarScan-Received:
X-StarScan-Version: 7.35.1; banners=-,-,-
X-VirusChecked: Checked
Received: (qmail 42553 invoked from network); 16 Feb 2016 10:14:48 -0000
Received: from prv-mh.provo.novell.com (HELO prv-mh.provo.novell.com) (137.65.248.74) by server-12.tower-31.messagelabs.com with DHE-RSA-AES256-GCM-SHA384 encrypted SMTP; 16 Feb 2016 10:14:48 -0000
Received: from INET-PRV-MTA by prv-mh.provo.novell.com with
 Novell_GroupWise; Tue, 16 Feb 2016 03:14:46 -0700
Message-Id: <56C304A502000078000D2879@prv-mh.provo.novell.com>
X-Mailer: Novell GroupWise Internet Agent 14.2.0
Date: Tue, 16 Feb 2016 03:14:45 -0700
From: "Jan Beulich"
To: "xen-devel"
Mime-Version: 1.0
Cc: Andrew Cooper, Keir Fraser
Subject: [Xen-devel] [PATCH v2] x86: avoid flush IPI when possible
X-BeenThere: xen-devel@lists.xen.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: Xen developer discussion
List-Unsubscribe: ,
List-Post:
List-Help:
List-Subscribe: ,
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
X-Spam-Status: No, score=-4.2 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_MED, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

Since CLFLUSH, unlike WBINVD, flushes the entire cache coherency domain,
there's no need to IPI other CPUs if a CLFLUSH-based cache flush is the
only flushing being requested.

(As a secondary change, move a local variable into the scope where it's
actually needed.)

Signed-off-by: Jan Beulich
---
v2: Adjust the meaning of flush_area_local()'s return value and add a
    comment above the function describing it.

--- a/xen/arch/x86/flushtlb.c
+++ b/xen/arch/x86/flushtlb.c
@@ -91,9 +91,13 @@ void write_cr3(unsigned long cr3)
     local_irq_restore(flags);
 }
 
-void flush_area_local(const void *va, unsigned int flags)
+/*
+ * The return value of this function is the passed-in "flags" argument with
+ * bits cleared that have been fully (i.e. system-wide) taken care of, i.e.
+ * not requiring any further action on remote CPUs.
+ */
+unsigned int flush_area_local(const void *va, unsigned int flags)
 {
-    const struct cpuinfo_x86 *c = &current_cpu_data;
     unsigned int order = (flags - 1) & FLUSH_ORDER_MASK;
     unsigned long irqfl;
 
@@ -130,6 +134,7 @@ void flush_area_local(const void *va, un
 
     if ( flags & FLUSH_CACHE )
     {
+        const struct cpuinfo_x86 *c = &current_cpu_data;
         unsigned long i, sz = 0;
 
         if ( order < (BITS_PER_LONG - PAGE_SHIFT) )
@@ -146,6 +151,7 @@ void flush_area_local(const void *va, un
                                   "data16 clflush %0",      /* clflushopt */
                                   X86_FEATURE_CLFLUSHOPT,
                                   "m" (((const char *)va)[i]));
+            flags &= ~FLUSH_CACHE;
         }
         else
         {
@@ -154,4 +160,6 @@ void flush_area_local(const void *va, un
     }
 
     local_irq_restore(irqfl);
+
+    return flags;
 }
--- a/xen/arch/x86/smp.c
+++ b/xen/arch/x86/smp.c
@@ -205,26 +205,30 @@ static unsigned int flush_flags;
 
 void invalidate_interrupt(struct cpu_user_regs *regs)
 {
+    unsigned int flags = flush_flags;
     ack_APIC_irq();
     perfc_incr(ipis);
-    if ( !__sync_local_execstate() ||
-         (flush_flags & (FLUSH_TLB_GLOBAL | FLUSH_CACHE)) )
-        flush_area_local(flush_va, flush_flags);
+    if ( __sync_local_execstate() )
+        flags &= ~FLUSH_TLB;
+    flush_area_local(flush_va, flags);
     cpumask_clear_cpu(smp_processor_id(), &flush_cpumask);
 }
 
 void flush_area_mask(const cpumask_t *mask, const void *va, unsigned int flags)
 {
+    unsigned int cpu = smp_processor_id();
+
     ASSERT(local_irq_is_enabled());
 
-    if ( cpumask_test_cpu(smp_processor_id(), mask) )
-        flush_area_local(va, flags);
+    if ( cpumask_test_cpu(cpu, mask) )
+        flags = flush_area_local(va, flags);
 
-    if ( !cpumask_subset(mask, cpumask_of(smp_processor_id())) )
+    if ( (flags & ~FLUSH_ORDER_MASK) &&
+         !cpumask_subset(mask, cpumask_of(cpu)) )
     {
         spin_lock(&flush_lock);
         cpumask_and(&flush_cpumask, mask, &cpu_online_map);
-        cpumask_clear_cpu(smp_processor_id(), &flush_cpumask);
+        cpumask_clear_cpu(cpu, &flush_cpumask);
         flush_va = va;
         flush_flags = flags;
         send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR);
--- a/xen/include/asm-x86/flushtlb.h
+++ b/xen/include/asm-x86/flushtlb.h
@@ -87,7 +87,7 @@ void write_cr3(unsigned long cr3);
 #define FLUSH_CACHE      0x400
 
 /* Flush local TLBs/caches. */
-void flush_area_local(const void *va, unsigned int flags);
+unsigned int flush_area_local(const void *va, unsigned int flags);
 #define flush_local(flags) flush_area_local(NULL, flags)
 
 /* Flush specified CPUs' TLBs/caches */
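
For readers who want to see the new return-value contract in isolation, the following is a minimal standalone sketch, not Xen code. It models how flush_area_mask() can skip the IPI once flush_area_local() has cleared FLUSH_CACHE. All sk_/SK_ names are hypothetical. Only the FLUSH_CACHE value (0x400) appears in the patch context; the other flag values are assumptions made for illustration. The clflush_path_taken parameter is an assumed stand-in for the size and feature checks that select CLFLUSH over WBINVD in the real function.

```c
#include <assert.h>
#include <stdbool.h>

/* Only 0x400 for the cache flag is visible in the patch; the rest are assumed. */
#define SK_FLUSH_ORDER_MASK 0xffu   /* assumed: low bits carry the flush order */
#define SK_FLUSH_TLB        0x100u  /* assumed value */
#define SK_FLUSH_CACHE      0x400u  /* matches FLUSH_CACHE in the patch context */

/*
 * Stand-in for flush_area_local(): returns the flags with every bit cleared
 * that has been handled system-wide. clflush_path_taken is a hypothetical
 * parameter replacing the checks that choose CLFLUSH over WBINVD.
 */
static unsigned int sk_flush_area_local(unsigned int flags,
                                        bool clflush_path_taken)
{
    unsigned int result = flags;

    /* CLFLUSH acts on the whole coherency domain, so the request is done. */
    if ((flags & SK_FLUSH_CACHE) && clflush_path_taken)
        result &= ~SK_FLUSH_CACHE;

    /* The function may only clear bits, never set new ones. */
    assert((result & ~flags) == 0);
    return result;
}

/*
 * Stand-in for the decision in flush_area_mask(): an IPI is needed only if
 * some action bit (anything outside the order field) is still pending and
 * the mask names at least one CPU other than the local one.
 */
static bool sk_needs_ipi(unsigned int flags, bool self_in_mask,
                         bool others_in_mask, bool clflush_path_taken)
{
    if (self_in_mask)
        flags = sk_flush_area_local(flags, clflush_path_taken);

    return (flags & ~SK_FLUSH_ORDER_MASK) && others_in_mask;
}
```

In this sketch, a cache-only request that takes the CLFLUSH path on the local CPU leaves no action bits set, so no IPI is sent. A request that also carries a TLB flush bit, or one that falls back to WBINVD, still needs the remote CPUs to act.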