From patchwork Thu Jan 31 20:59:47 2013
X-Patchwork-Submitter: Kim Phillips
X-Patchwork-Id: 2075381
Date: Thu, 31 Jan 2013 14:59:47 -0600
From: Kim Phillips
To: Russell King - ARM Linux
Subject: Re: [RFC] arm: use built-in byte swap function
Message-ID: <20130131145947.f62474a0600848df86548b96@freescale.com>
In-Reply-To: <20130131092801.GV23505@n2100.arm.linux.org.uk>
References: <20130128193033.8a0b0a871150c99247f05a95@freescale.com>
 <20130129083522.GA14302@pd.tnic>
 <1359478014.3529.157.camel@shinybook.infradead.org>
 <20130129174249.GB25415@pd.tnic>
 <1359482147.3529.161.camel@shinybook.infradead.org>
 <20130129181046.GC25415@pd.tnic>
 <1359541333.3529.186.camel@shinybook.infradead.org>
 <20130130200900.9d7cf7908caeaef4ecee1d61@freescale.com>
 <20130131092801.GV23505@n2100.arm.linux.org.uk>
Organization: Freescale Semiconductor, Inc.
Cc: "Woodhouse, David", Rusty Russell, "linux-kernel@vger.kernel.org",
 Daniel Santos, Borislav Petkov, David Rientjes, Andrew Morton,
 "linux-arm-kernel@lists.infradead.org"

On Thu, 31 Jan 2013 09:28:01 +0000
Russell King - ARM Linux wrote:

> On Wed, Jan 30, 2013 at 08:09:00PM -0600, Kim Phillips wrote:
> > The savings come mostly from device-tree related code, and some
> > from drivers.
>
> You forget that IP networking is all big endian, so these will be using
> the byte swapping too (search it for htons/ntohs/htonl/ntohl).

As David mentioned, there isn't much gain from net/ code.

> > v2:
> > - at91 and lpd270 builds fixed by limiting to ARMv6 and above
> >   (i.e., ARM cores that have support for the 'rev' instruction).
> >   Otherwise, the compiler emits calls to libgcc's __bswapsi2 on
> >   these ARMv4/v5 builds (and arch ARM doesn't link with libgcc).
>
> Which compiler version?
> gcc 4.5.4 doesn't do this, except for the 16-bit
> swap, so I doubt that any later compiler does.

I've tried both gcc 4.6.3 [1] and 4.6.4 [2].  If you can point me to
a 4.5.x, I'll try that, too, but as it stands now, if one moves the
code added to swab.h below outside of its armv6 protection,
gcc adds calls to __bswapsi2.

> > --- a/arch/arm/include/uapi/asm/swab.h
> > +++ b/arch/arm/include/uapi/asm/swab.h
> > @@ -50,4 +50,14 @@ static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
> >
> >  #endif
> >
> > +#if defined(__KERNEL__) && __LINUX_ARM_ARCH__ >= 6
> > +#if GCC_VERSION >= 40600
> > +#define __HAVE_BUILTIN_BSWAP32__
> > +#define __HAVE_BUILTIN_BSWAP64__
> > +#endif
> > +#if GCC_VERSION >= 40800
> > +#define __HAVE_BUILTIN_BSWAP16__
> > +#endif
> > +#endif
>
> If this is __KERNEL__ only, it should not be in a uapi header.  UAPI
> headers get exported to userland, this is not userland interface code.
> IT should be in arch/arm/include/asm/swab.h

right, I've fixed this and Boris' remove the help text comment, and
made a v3:

From 18c86580efba42d2680f2947867722705292f80a Mon Sep 17 00:00:00 2001
From: Kim Phillips
Date: Mon, 28 Jan 2013 19:30:33 -0600
Subject: [PATCH] arm: use built-in byte swap function

Enable the compiler intrinsic for byte swapping on arch ARM.  This
allows the compiler to detect and be able to optimize out byte
swappings, e.g. in big endian to big endian moves.

A ARCH_DEFINES_BUILTIN_BSWAP is added to allow an ARCH to select
it when it wants to control HAVE_BUILTIN_BSWAPxx definitions over
those in the generic compiler headers.  It can be dependent on a
combination of byte swapping instruction availability, the
instruction set version, and the state of support in different
compiler versions.

AFAICT, arm gcc got __builtin_bswap{32,64} support in 4.6,
and for the 16-bit version in 4.8.

This has a tiny benefit on vmlinux text size (gcc 4.6.4):

multi_v7_defconfig:
   text    data     bss     dec     hex filename
3135208  188396  203344 3526948  35d124 vmlinux

multi_v7_defconfig with builtin_bswap:
   text    data     bss     dec     hex filename
3135112  188396  203344 3526852  35d0c4 vmlinux

exynos_defconfig:
   text    data     bss     dec     hex filename
4286605  360564  223172 4870341  4a50c5 vmlinux

exynos_defconfig with builtin_bswap:
   text    data     bss     dec     hex filename
4286405  360564  223172 4870141  4a4ffd vmlinux

The savings come mostly from device-tree related code, and some
from drivers.

Signed-off-by: Kim Phillips
---
akin to: http://comments.gmane.org/gmane.linux.kernel.cross-arch/16016

based on linux-next-20130128.  Depends on commit "compiler-gcc{3,4}.h:
Use GCC_VERSION macro" by Daniel Santos, currently in the akpm branch.

v3:
- moved out of uapi swab.h into arch/arm/include/asm/swab.h
- moved ARCH_DEFINES_BUILTIN_BSWAP help text into commit message
- moved GCC_VERSION >= 40800 ifdef into GCC_VERSION >= 40600 block

v2:
- at91 and lpd270 builds fixed by limiting to ARMv6 and above
  (i.e., ARM cores that have support for the 'rev' instruction).
  Otherwise, the compiler emits calls to libgcc's __bswapsi2 on
  these ARMv4/v5 builds (and arch ARM doesn't link with libgcc).
  All ARM defconfigs now have the same build status as they did
  without this patch (some are broken on linux-next).
- move ARM check from generic compiler.h to arch ARM's swab.h.
  - pretty sure it should be limited to __KERNEL__ builds
- add new ARCH_DEFINES_BUILTIN_BSWAP (see Kconfig help).
  - if set, generic compiler header does not set HAVE_BUILTIN_BSWAPxx
  - not too sure about this having to be a new CONFIG_, but it's hard
    to find a place for it given linux/compiler.h doesn't include any
    arch-specific files.
- move new selects to end of CONFIG_ARM's Kconfig select list,
  as is done in David Woodhouse's original patchseries for ppc/x86.

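For anyone trying the version gating outside the tree, here is a minimal
userspace sketch of the scheme described in the notes above.  It is not
kernel code: every SKETCH_/sketch_ name is hypothetical, SKETCH_GCC_VERSION
only stands in for the kernel's GCC_VERSION macro, and the ARMv6 gate is
left out.  It mirrors the v3 nesting (the 16-bit check inside the 4.6
block) and falls back to open-coded shifts when the builtin is not
advertised.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's GCC_VERSION macro, derived from
 * the compiler's predefined version macros.  Evaluates to 0 when the
 * compiler does not define __GNUC__.
 */
#if defined(__GNUC__)
#define SKETCH_GCC_VERSION \
	(__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__)
#else
#define SKETCH_GCC_VERSION 0
#endif

/* Same nesting as v3: the 4.8 (16-bit) check sits inside the 4.6 block. */
#if SKETCH_GCC_VERSION >= 40600
#define SKETCH_HAVE_BUILTIN_BSWAP32
#define SKETCH_HAVE_BUILTIN_BSWAP64
#if SKETCH_GCC_VERSION >= 40800
#define SKETCH_HAVE_BUILTIN_BSWAP16
#endif
#endif

/* Portable fallbacks, used when the builtin is not advertised above. */
static inline uint16_t swab16_open_coded(uint16_t x)
{
	return (uint16_t)((x << 8) | (x >> 8));
}

static inline uint32_t swab32_open_coded(uint32_t x)
{
	return (x << 24) | ((x & 0x0000ff00u) << 8) |
	       ((x & 0x00ff0000u) >> 8) | (x >> 24);
}

static inline uint16_t sketch_swab16(uint16_t x)
{
#ifdef SKETCH_HAVE_BUILTIN_BSWAP16
	return __builtin_bswap16(x);
#else
	return swab16_open_coded(x);
#endif
}

static inline uint32_t sketch_swab32(uint32_t x)
{
#ifdef SKETCH_HAVE_BUILTIN_BSWAP32
	return __builtin_bswap32(x);
#else
	return swab32_open_coded(x);
#endif
}

/*
 * A "big endian to big endian move": the value is swapped on load and
 * swapped again on store.  Two swaps cancel, so the result is the input.
 */
static inline uint32_t be32_to_be32_move(uint32_t be_val)
{
	return sketch_swab32(sketch_swab32(be_val));
}

/* Small consistency check between the selected path and the fallback. */
static inline void sketch_self_check(void)
{
	assert(sketch_swab32(0x12345678u) == swab32_open_coded(0x12345678u));
	assert(sketch_swab16(0x1234u) == swab16_open_coded(0x1234u));
	assert(be32_to_be32_move(0xdeadbeefu) == 0xdeadbeefu);
}
```

Building this for an ARMv4/v5 target and for ARMv6 or later and comparing
the disassembly should, I'd expect, show the __bswapsi2 call versus 'rev'
difference discussed above, though I have only described that behaviour for
the kernel build itself.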
 arch/Kconfig                  | 4 ++++
 arch/arm/Kconfig              | 2 ++
 arch/arm/include/asm/swab.h   | 8 ++++++++
 include/linux/compiler-gcc4.h | 3 ++-
 4 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index 40e2b12..bc5ed77 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -141,6 +141,10 @@ config ARCH_USE_BUILTIN_BSWAP
 	 instructions should set this. And it shouldn't hurt to set it
 	 on architectures that don't have such instructions.
 
+config ARCH_DEFINES_BUILTIN_BSWAP
+	depends on ARCH_USE_BUILTIN_BSWAP
+	bool
+
 config HAVE_SYSCALL_WRAPPERS
 	bool
 
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 73027aa..b5868c2 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -57,6 +57,8 @@ config ARM
 	select CLONE_BACKWARDS
 	select OLD_SIGSUSPEND3
 	select OLD_SIGACTION
+	select ARCH_USE_BUILTIN_BSWAP
+	select ARCH_DEFINES_BUILTIN_BSWAP
 	help
 	  The ARM series is a line of low-power-consumption RISC chip designs
 	  licensed by ARM Ltd and targeted at embedded applications and
diff --git a/arch/arm/include/asm/swab.h b/arch/arm/include/asm/swab.h
index 537fc9b..e56acff 100644
--- a/arch/arm/include/asm/swab.h
+++ b/arch/arm/include/asm/swab.h
@@ -34,5 +34,13 @@ static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
 }
 #define __arch_swab32 __arch_swab32
 
+#if GCC_VERSION >= 40600
+#define __HAVE_BUILTIN_BSWAP32__
+#define __HAVE_BUILTIN_BSWAP64__
+#if GCC_VERSION >= 40800
+#define __HAVE_BUILTIN_BSWAP16__
+#endif
+#endif
+
 #endif
 #endif
diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 68b162d..fce39cb 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -66,7 +66,8 @@
 #endif
 
 
-#ifdef CONFIG_ARCH_USE_BUILTIN_BSWAP
+#if defined(CONFIG_ARCH_USE_BUILTIN_BSWAP) && \
+	!defined(CONFIG_ARCH_DEFINES_BUILTIN_BSWAP)
 #if GCC_VERSION >= 40400
 #define __HAVE_BUILTIN_BSWAP32__
 #define __HAVE_BUILTIN_BSWAP64__