diff mbox

[RFC] arm: use built-in byte swap function

Message ID 20130131145947.f62474a0600848df86548b96@freescale.com (mailing list archive)
State New, archived
Headers show

Commit Message

Kim Phillips Jan. 31, 2013, 8:59 p.m. UTC
On Thu, 31 Jan 2013 09:28:01 +0000
Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:

> On Wed, Jan 30, 2013 at 08:09:00PM -0600, Kim Phillips wrote:
> > The savings come mostly from device-tree related code, and some
> > from drivers.
> 
> You forget that IP networking is all big endian, so these will be using
> the byte swapping too (search it for htons/ntohs/htonl/ntohl).

As David mentioned, there isn't much gain from net/ code.

> > v2:
> > - at91 and lpd270 builds fixed by limiting to ARMv6 and above
> >   (i.e., ARM cores that have support for the 'rev' instruction).
> >   Otherwise, the compiler emits calls to libgcc's __bswapsi2 on
> >   these ARMv4/v5 builds (and arch ARM doesn't link with libgcc).
> 
> Which compiler version?  gcc 4.5.4 doesn't do this, except for the 16-bit
> swap, so I doubt that any later compiler does.

I've tried both gcc 4.6.3 [1] and 4.6.4 [2].  If you can point me to
a 4.5.x, I'll try that, too, but as it stands now, if one moves the
code added to swab.h below outside of its armv6 protection,
gcc adds calls to __bswapsi2.

> > --- a/arch/arm/include/uapi/asm/swab.h
> > +++ b/arch/arm/include/uapi/asm/swab.h
> > @@ -50,4 +50,14 @@ static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
> >  
> >  #endif
> >  
> > +#if defined(__KERNEL__) && __LINUX_ARM_ARCH__ >= 6
> > +#if GCC_VERSION >= 40600
> > +#define __HAVE_BUILTIN_BSWAP32__
> > +#define __HAVE_BUILTIN_BSWAP64__
> > +#endif
> > +#if GCC_VERSION >= 40800
> > +#define __HAVE_BUILTIN_BSWAP16__
> > +#endif
> > +#endif
> 
> If this is __KERNEL__ only, it should not be in a uapi header.  UAPI
> headers get exported to userland, this is not userland interface code.
> IT should be in arch/arm/include/asm/swab.h

right, I've fixed this and Boris' remove the help text comment, and
made a v3:

From 18c86580efba42d2680f2947867722705292f80a Mon Sep 17 00:00:00 2001
From: Kim Phillips <kim.phillips@freescale.com>
Date: Mon, 28 Jan 2013 19:30:33 -0600
Subject: [PATCH] arm: use built-in byte swap function

Enable the compiler intrinsic for byte swapping on arch ARM.  This
allows the compiler to detect and be able to optimize out byte
swappings, e.g. in big endian to big endian moves.

A ARCH_DEFINES_BUILTIN_BSWAP is added to allow an ARCH to select
it when it wants to control HAVE_BUILTIN_BSWAPxx definitions over
those in the generic compiler headers.  It can be dependent on a
combination of byte swapping instruction availability, the
instruction set version, and the state of support in different
compiler versions.

AFAICT, arm gcc got __builtin_bswap{32,64} support in 4.6,
and for the 16-bit version in 4.8.

This has a tiny benefit on vmlinux text size (gcc 4.6.4):

multi_v7_defconfig:
   text    data     bss     dec     hex filename
3135208  188396  203344 3526948  35d124 vmlinux
multi_v7_defconfig with builtin_bswap:
   text    data     bss     dec     hex filename
3135112  188396  203344 3526852  35d0c4 vmlinux

exynos_defconfig:
   text    data     bss     dec     hex filename
4286605  360564  223172 4870341  4a50c5 vmlinux
exynos_defconfig with builtin_bswap:
   text    data     bss     dec     hex filename
4286405  360564  223172 4870141  4a4ffd vmlinux

The savings come mostly from device-tree related code, and some
from drivers.

Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
---
akin to: http://comments.gmane.org/gmane.linux.kernel.cross-arch/16016

based on linux-next-20130128.  Depends on commit "compiler-gcc{3,4}.h:
Use GCC_VERSION macro" by Daniel Santos <daniel.santos@pobox.com>,
currently in the akpm branch.

v3:
- moved out of uapi swab.h into arch/arm/include/asm/swab.h
- moved ARCH_DEFINES_BUILTIN_BSWAP help text into commit message
- moved GCC_VERSION >= 40800 ifdef into GCC_VERSION >= 40600 block

v2:
- at91 and lpd270 builds fixed by limiting to ARMv6 and above
  (i.e., ARM cores that have support for the 'rev' instruction).
  Otherwise, the compiler emits calls to libgcc's __bswapsi2 on
  these ARMv4/v5 builds (and arch ARM doesn't link with libgcc).
  All ARM defconfigs now have the same build status as they did
  without this patch (some are broken on linux-next).

- move ARM check from generic compiler.h to arch ARM's swab.h.
  - pretty sure it should be limited to __KERNEL__ builds

- add new ARCH_DEFINES_BUILTIN_BSWAP (see Kconfig help).
  - if set, generic compiler header does not set HAVE_BUILTIN_BSWAPxx
  - not too sure about this having to be a new CONFIG_, but it's hard
    to find a place for it given linux/compiler.h doesn't include any
    arch-specific files.

- move new selects to end of CONFIG_ARM's Kconfig select list,
  as is done in David Woodhouse's original patchseries for ppc/x86.

 arch/Kconfig                  |    4 ++++
 arch/arm/Kconfig              |    2 ++
 arch/arm/include/asm/swab.h   |    8 ++++++++
 include/linux/compiler-gcc4.h |    3 ++-
 4 files changed, 16 insertions(+), 1 deletion(-)

Comments

Borislav Petkov Jan. 31, 2013, 9:33 p.m. UTC | #1
On Thu, Jan 31, 2013 at 02:59:47PM -0600, Kim Phillips wrote:
> - add new ARCH_DEFINES_BUILTIN_BSWAP (see Kconfig help).
>   - if set, generic compiler header does not set HAVE_BUILTIN_BSWAPxx
>   - not too sure about this having to be a new CONFIG_, but it's hard
>     to find a place for it given linux/compiler.h doesn't include any
>     arch-specific files.

Yeah, me neither. It seems to me the whole deal can be simplified even
further without introducing CONFIG_ARCH_DEFINES_BUILTIN_BSWAP.

And, we don't even want to use CONFIG_ARCH_USE_BUILTIN_BSWAP on arm due
to different compiler versions supporting it (correct me if I'm wrong
here) vs the generic thing in include/linux/compiler-gcc4.h which we
want off.

If so, you'd need to simply put the following from below in
arch/arm/include/asm/swab.h

#if GCC_VERSION >= 40600
#define __HAVE_BUILTIN_BSWAP32__
#define __HAVE_BUILTIN_BSWAP64__
#if GCC_VERSION >= 40800
#define __HAVE_BUILTIN_BSWAP16__
#endif /* GCC_VERSION >= 40800 */
#endif /* GCC_VERSION >= 40600 */

and that's it.

Makes sense or am I over-simplifying this?

Thanks.
David Woodhouse Jan. 31, 2013, 10:11 p.m. UTC | #2
On Thu, 2013-01-31 at 14:59 -0600, Kim Phillips wrote:
> 
> - add new ARCH_DEFINES_BUILTIN_BSWAP (see Kconfig help).

Ick, no.

>   - if set, generic compiler header does not set HAVE_BUILTIN_BSWAPxx

It won't do that anyway if !ARCH_USE_BUILTIN_BSWAP. I don't see the
point in adding a new config option just for this.

If you want to define __HAVE_BUILTIN_BSWAPxx__ for yourself manually,
just go ahead and do so. As I said, if lots of architectures end up
doing it then we'll worry about cleaning things up when we've got a
better picture of who needs what.
Russell King - ARM Linux Feb. 1, 2013, 1:17 a.m. UTC | #3
On Thu, Jan 31, 2013 at 02:59:47PM -0600, Kim Phillips wrote:
> On Thu, 31 Jan 2013 09:28:01 +0000
> Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> 
> > On Wed, Jan 30, 2013 at 08:09:00PM -0600, Kim Phillips wrote:
> > > v2:
> > > - at91 and lpd270 builds fixed by limiting to ARMv6 and above
> > >   (i.e., ARM cores that have support for the 'rev' instruction).
> > >   Otherwise, the compiler emits calls to libgcc's __bswapsi2 on
> > >   these ARMv4/v5 builds (and arch ARM doesn't link with libgcc).
> > 
> > Which compiler version?  gcc 4.5.4 doesn't do this, except for the 16-bit
> > swap, so I doubt that any later compiler does.
> 
> I've tried both gcc 4.6.3 [1] and 4.6.4 [2].  If you can point me to
> a 4.5.x, I'll try that, too, but as it stands now, if one moves the
> code added to swab.h below outside of its armv6 protection,
> gcc adds calls to __bswapsi2.

Take a look at the message I sent on the 29th towards the beginning of
this thread for details of gcc 4.5.4 behaviour.
David Woodhouse Feb. 1, 2013, 7:33 a.m. UTC | #4
On Fri, 2013-02-01 at 01:17 +0000, Russell King - ARM Linux wrote:
> 
> > I've tried both gcc 4.6.3 [1] and 4.6.4 [2].  If you can point me to
> > a 4.5.x, I'll try that, too, but as it stands now, if one moves the
> > code added to swab.h below outside of its armv6 protection,
> > gcc adds calls to __bswapsi2.
> 
> Take a look at the message I sent on the 29th towards the beginning of
> this thread for details of gcc 4.5.4 behaviour.

I'd like to see a comment (with PR# if appropriate) explaining clearly
*why* it isn't enabled for <ARMv6 even with a bleeding-edge compiler.

Russell's test also seemed to indicate that the 32-bit and 64-bit swap
support was present and functional in GCC 4.5.4 (as indeed it should
have been since 4.4), so I'm still not quite sure why you require 4.6
for that.
diff mbox

Patch

diff --git a/arch/Kconfig b/arch/Kconfig
index 40e2b12..bc5ed77 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -141,6 +141,10 @@  config ARCH_USE_BUILTIN_BSWAP
 	 instructions should set this. And it shouldn't hurt to set it
 	 on architectures that don't have such instructions.
 
+config ARCH_DEFINES_BUILTIN_BSWAP
+       depends on ARCH_USE_BUILTIN_BSWAP
+       bool
+
 config HAVE_SYSCALL_WRAPPERS
 	bool
 
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 73027aa..b5868c2 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -57,6 +57,8 @@  config ARM
 	select CLONE_BACKWARDS
 	select OLD_SIGSUSPEND3
 	select OLD_SIGACTION
+	select ARCH_USE_BUILTIN_BSWAP
+	select ARCH_DEFINES_BUILTIN_BSWAP
 	help
 	  The ARM series is a line of low-power-consumption RISC chip designs
 	  licensed by ARM Ltd and targeted at embedded applications and
diff --git a/arch/arm/include/asm/swab.h b/arch/arm/include/asm/swab.h
index 537fc9b..e56acff 100644
--- a/arch/arm/include/asm/swab.h
+++ b/arch/arm/include/asm/swab.h
@@ -34,5 +34,13 @@  static inline __attribute_const__ __u32 __arch_swab32(__u32 x)
 }
 #define __arch_swab32 __arch_swab32
 
+#if GCC_VERSION >= 40600
+#define __HAVE_BUILTIN_BSWAP32__
+#define __HAVE_BUILTIN_BSWAP64__
+#if GCC_VERSION >= 40800
+#define __HAVE_BUILTIN_BSWAP16__
+#endif
+#endif
+
 #endif
 #endif
diff --git a/include/linux/compiler-gcc4.h b/include/linux/compiler-gcc4.h
index 68b162d..fce39cb 100644
--- a/include/linux/compiler-gcc4.h
+++ b/include/linux/compiler-gcc4.h
@@ -66,7 +66,8 @@ 
 #endif
 
 
-#ifdef CONFIG_ARCH_USE_BUILTIN_BSWAP
+#if defined(CONFIG_ARCH_USE_BUILTIN_BSWAP) && \
+    !defined(CONFIG_ARCH_DEFINES_BUILTIN_BSWAP)
 #if GCC_VERSION >= 40400
 #define __HAVE_BUILTIN_BSWAP32__
 #define __HAVE_BUILTIN_BSWAP64__