From: Russell King - ARM Linux
Date: Mon, 12 Jan 2015 16:36:48 +0000
To: linux-arm-kernel@lists.infradead.org
Subject: CFT: move outer_cache_sync() out of line
Message-ID: <20150112163648.GL12302@n2100.arm.linux.org.uk>
X-Patchwork-Id: 5611511

The theory is that moving outer_cache_sync() out of line moves the
conditional test from multiple sites to a single location, which in
turn means that the branch predictor stands a better chance of making
the correct decision.

However, I'm a hardware pauper - I have the choice of testing this on
iMX6 (which seems to have bus limitations, and seems to produce a wide
variation on measured performance which makes it hard to evaluate any
differences) or the Versatile Express, which really is not suitable for
making IO performance measurements.

So, without help from people with "good" hardware, I can't evaluate the
performance impact of this change.

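To make the idea concrete, here is a rough userspace sketch of the two
shapes being compared. It is not kernel code: the struct, the counter
and every name ending in _sketch, _inline or _outofline are made up for
illustration, and __attribute__((noinline)) assumes GCC or Clang. The
inline variant copies the NULL test into each caller, while the
out-of-line variant keeps a single copy of the test at one address.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's outer cache function table. */
struct outer_cache_fns_sketch {
	void (*sync)(void);
};

static struct outer_cache_fns_sketch outer_cache_sketch;
static unsigned int sync_calls;	/* how many times the hook has run */

/* Hypothetical L2 sync hook; a real one would talk to the controller. */
static void fake_l2_sync(void)
{
	sync_calls++;
}

/*
 * "Before" shape: the compiler may copy this body, including the
 * conditional, into every call site.
 */
static inline void outer_sync_inline(void)
{
	if (outer_cache_sketch.sync)
		outer_cache_sketch.sync();
}

/*
 * "After" shape: callers only see the declaration, so there is one
 * copy of the conditional, at one address.
 */
void outer_sync_outofline(void);

__attribute__((noinline)) void outer_sync_outofline(void)
{
	if (outer_cache_sketch.sync)
		outer_cache_sketch.sync();
}
```

Both variants behave identically; the only difference is how many
copies of the test end up in the text section, which is what the size
and branch prediction arguments rest on.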
Anyone willing to do some performance testing on this and feed back any
numbers? Obviously, drivers which make use of a lot of non-relaxed IO
accessors will be most affected by this - so you really have to know
your drivers when deciding how to evaluate this (sorry, I can't say
"measure network bandwidth" or "measure SSD SATA performance" is a good
test.)

Theoretically, this should help overall system performance, since the
branch predictor should be able to predict this better, but it's
entirely possible that trying to benchmark a single workload won't be
measurably different.

In terms of kernel size figures, this change alone saves almost 17K of
10MB of kernel text on my iMX6 kernels - which is bordering on
insignificant since that's not quite a 0.2% saving.

So... right now I can't justify this change, but I'm hoping someone can
come up with some figures which show that it benefits their workload
without causing a performance regression for others.

Acked-by: Arnd Bergmann

diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index 891a56b35bcf..807e4e71c8e7 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -133,11 +133,7 @@ static inline void outer_resume(void) { }
  * Ensure that all outer cache operations are complete and any store
  * buffers are drained.
  */
-static inline void outer_sync(void)
-{
-	if (outer_cache.sync)
-		outer_cache.sync();
-}
+extern void outer_sync(void);
 #else
 static inline void outer_sync(void)
 { }
diff --git a/arch/arm/mm/l2c-common.c b/arch/arm/mm/l2c-common.c
index 10a3cf28c362..b1c24c8c1eb9 100644
--- a/arch/arm/mm/l2c-common.c
+++ b/arch/arm/mm/l2c-common.c
@@ -7,9 +7,17 @@
  * published by the Free Software Foundation.
  */
 #include
+#include
 #include
 #include
 
+void outer_sync(void)
+{
+	if (outer_cache.sync)
+		outer_cache.sync();
+}
+EXPORT_SYMBOL(outer_sync);
+
 void outer_disable(void)
 {
 	WARN_ON(!irqs_disabled());