From patchwork Fri Feb 22 02:33:27 2013
X-Patchwork-Submitter: Kim Phillips
X-Patchwork-Id: 2174401
Date: Thu, 21 Feb 2013 20:33:27 -0600
From: Kim Phillips
To: Nicolas Pitre
Subject: Re: [RFC] arm: use built-in byte swap function
Message-ID: <20130221203327.6558f89277468f7ffffa6506@freescale.com>
Organization: Freescale Semiconductor, Inc.
Cc: Russell King - ARM Linux, "Woodhouse, David", Rusty Russell,
 "linux-kernel@vger.kernel.org", Daniel Santos, Borislav Petkov,
 David Rientjes, Andrew Morton, "linux-arm-kernel@lists.infradead.org"

On Thu, 21 Feb 2013 11:40:54 -0500
Nicolas Pitre wrote:

> On Thu, 21 Feb 2013, Kim Phillips wrote:
>
> > On Wed, 20 Feb 2013 23:29:58 -0500
> > Nicolas Pitre wrote:
> >
> > > On Wed, 20 Feb 2013, Kim Phillips wrote:
> > >
> > > > On Wed, 20 Feb 2013 10:43:18 -0500
> > > > Nicolas Pitre wrote:
> > > >
> > > > > On Wed, 20 Feb 2013, Woodhouse, David wrote:
> > > > > > On Wed, 2013-02-20 at 09:06 -0500, Nicolas Pitre wrote:
> > > > > > > ... in which case there is no harm shipping a .c file and trivially
> > > > > > > enforcing -O2, the rest being equal.
> > > > > >
> > > > > > For today's compilers, unless the wind changes.
> > > > >
> > > > > We'll adapt if necessary. Going with -O2 should remain pretty safe anyway.
> > > >
> > > > Alas, not so for gcc 4.4 - I had forgotten I had tested
> > > > Ubuntu/Linaro 4.4.7-1ubuntu2 here:
> > > >
> > > > https://patchwork.kernel.org/patch/2101491/
> > > >
> > > > add -O2 to that test script and gcc 4.4 *always* emits calls to
> > > > __bswap[sd]i2, even with -march=armv6k+.
> >
> > argh, sorry - that script was testing support for
> > __builtin_bswap{16,32,64} directly, which isn't the same as testing
> > code generation of a byte swap pattern in C.
>
> Still, I'm not as confident as I was about this.

which part exactly? Having -O2 as "protection"? Yes, me neither.

> > I'll still try the assembly approach - gcc 4.4's armv6 output looks
> > worse than both the pre-armv6 and post-armv6 __arch_swab32
> > implementations currently in use:
> >
> > 	mov	ip, sp
> > 	push	{fp, ip, lr, pc}
> > 	sub	fp, ip, #4
>
> You should use -fomit-frame-pointer to compile this. We don't need a
> frame pointer here, especially for a leaf function that the compiler
> decides to call on its own.
>
> > 	and	r2, r0, #65280	; 0xff00
> > 	lsl	ip, r0, #24
> > 	orr	r1, ip, r0, lsr #24
> > 	and	r0, r0, #16711680	; 0xff0000
> > 	orr	r3, r1, r2, lsl #8
> > 	orr	r0, r3, r0, lsr #8
>
> Other than that, it is true that the above is slightly suboptimal.

Here's the asm version I'm working on now, based on compiler output
of the C version. Haven't tested beyond defconfig builds, which
pass ok.

Is there anything I have to do for thumb mode? If so, how to test?

Thanks,

Kim

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index dedf02b..e8a41d0 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -59,6 +59,7 @@ config ARM
 	select CLONE_BACKWARDS
 	select OLD_SIGSUSPEND3
 	select OLD_SIGACTION
+	select ARCH_USE_BUILTIN_BSWAP
 	help
 	  The ARM series is a line of low-power-consumption RISC chip designs
 	  licensed by ARM Ltd and targeted at embedded applications and
diff --git a/arch/arm/boot/compressed/Makefile b/arch/arm/boot/compressed/Makefile
index 5cad8a6..a277e97 100644
--- a/arch/arm/boot/compressed/Makefile
+++ b/arch/arm/boot/compressed/Makefile
@@ -108,12 +108,12 @@ endif
 
 targets := vmlinux vmlinux.lds \
 		 piggy.$(suffix_y) piggy.$(suffix_y).o \
-		 lib1funcs.o lib1funcs.S ashldi3.o ashldi3.S \
+		 lib1funcs.o lib1funcs.S ashldi3.o ashldi3.S bswapsdi2.o \
 		 font.o font.c head.o misc.o $(OBJS)
 
 # Make sure files are removed during clean
 extra-y += piggy.gzip piggy.lzo piggy.lzma piggy.xzkern \
-		 lib1funcs.S ashldi3.S $(libfdt) $(libfdt_hdrs)
+		 lib1funcs.S ashldi3.S bswapsdi2.o $(libfdt) $(libfdt_hdrs)
 
 ifeq ($(CONFIG_FUNCTION_TRACER),y)
 ORIG_CFLAGS := $(KBUILD_CFLAGS)
@@ -155,6 +155,12 @@ ashldi3 = $(obj)/ashldi3.o
 $(obj)/ashldi3.S: $(srctree)/arch/$(SRCARCH)/lib/ashldi3.S
 	$(call cmd,shipped)
 
+# For __bswapsi2, __bswapdi2
+bswapsdi2 = $(obj)/bswapsdi2.o
+
+$(obj)/bswapsdi2.S: $(srctree)/arch/$(SRCARCH)/lib/bswapsdi2.S
+	$(call cmd,shipped)
+
 # We need to prevent any GOTOFF relocs being used with references
 # to symbols in the .bss section since we cannot relocate them
 # independently from the rest at run time. This can be achieved by
@@ -176,7 +182,8 @@ if [ $(words $(ZRELADDR)) -gt 1 -a "$(CONFIG_AUTO_ZRELADDR)" = "" ]; then \
 fi
 
 $(obj)/vmlinux: $(obj)/vmlinux.lds $(obj)/$(HEAD) $(obj)/piggy.$(suffix_y).o \
-		$(addprefix $(obj)/, $(OBJS)) $(lib1funcs) $(ashldi3) FORCE
+		$(addprefix $(obj)/, $(OBJS)) $(lib1funcs) $(ashldi3) \
+		$(bswapsdi2) FORCE
 	@$(check_for_multiple_zreladdr)
 	$(call if_changed,ld)
 	@$(check_for_bad_syms)
diff --git a/arch/arm/kernel/armksyms.c b/arch/arm/kernel/armksyms.c
index 60d3b73..ba578f7 100644
--- a/arch/arm/kernel/armksyms.c
+++ b/arch/arm/kernel/armksyms.c
@@ -35,6 +35,8 @@ extern void __ucmpdi2(void);
 extern void __udivsi3(void);
 extern void __umodsi3(void);
 extern void __do_div64(void);
+extern void __bswapsi2(void);
+extern void __bswapdi2(void);
 
 extern void __aeabi_idiv(void);
 extern void __aeabi_idivmod(void);
@@ -114,6 +116,8 @@ EXPORT_SYMBOL(__ucmpdi2);
 EXPORT_SYMBOL(__udivsi3);
 EXPORT_SYMBOL(__umodsi3);
 EXPORT_SYMBOL(__do_div64);
+EXPORT_SYMBOL(__bswapsi2);
+EXPORT_SYMBOL(__bswapdi2);
 
 #ifdef CONFIG_AEABI
 EXPORT_SYMBOL(__aeabi_idiv);
diff --git a/arch/arm/lib/Makefile b/arch/arm/lib/Makefile
index af72969..5383df7 100644
--- a/arch/arm/lib/Makefile
+++ b/arch/arm/lib/Makefile
@@ -13,7 +13,7 @@ lib-y := backtrace.o changebit.o csumipv6.o csumpartial.o \
 		   ashldi3.o ashrdi3.o lshrdi3.o muldi3.o \
 		   ucmpdi2.o lib1funcs.o div64.o \
 		   io-readsb.o io-writesb.o io-readsl.o io-writesl.o \
-		   call_with_stack.o
+		   call_with_stack.o bswapsdi2.o
 
 mmu-y := clear_user.o copy_page.o getuser.o putuser.o
 
diff --git a/arch/arm/lib/bswapsdi2.S b/arch/arm/lib/bswapsdi2.S
new file mode 100644
index 0000000..e9c8ca7
--- /dev/null
+++ b/arch/arm/lib/bswapsdi2.S
@@ -0,0 +1,36 @@
+#include
+
+#if __LINUX_ARM_ARCH__ >= 6
+ENTRY(__bswapsi2)
+	rev r0, r0
+	bx lr
+ENDPROC(__bswapsi2)
+
+ENTRY(__bswapdi2)
+	rev r3, r0
+	rev r0, r1
+	mov r1, r3
+	bx lr
+ENDPROC(__bswapdi2)
+#else
+ENTRY(__bswapsi2)
+	eor r3, r0, r0, ror #16
+	lsr r3, r3, #8
+	bic r3, r3, #65280	@ 0xff00
+	eor r0, r3, r0, ror #8
+	mov pc, lr
+ENDPROC(__bswapsi2)
+
+ENTRY(__bswapdi2)
+	mov ip, r1
+	eor r3, ip, ip, ror #16
+	eor r1, r0, r0, ror #16
+	lsr r1, r1, #8
+	lsr r3, r3, #8
+	bic r3, r3, #65280	@ 0xff00
+	bic r1, r1, #65280	@ 0xff00
+	eor r1, r1, r0, ror #8
+	eor r0, r3, ip, ror #8
+	mov pc, lr
+ENDPROC(__bswapdi2)
+#endif
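
For reference, the pre-ARMv6 __bswapsi2 sequence in the patch
(eor / lsr / bic / eor) can be modelled in plain C and compared with the
usual shift-and-mask byte swap pattern. This is only an illustrative
sketch: ror32, bswap32_eor, bswap32_ref and bswap64_eor are hypothetical
names, not kernel or libgcc APIs, and the 64-bit helper assumes the
little-endian register layout (low word in r0, high word in r1).

```c
#include <stdint.h>

/* Rotate right by n bits; callers here only pass n = 8 or 16. */
static uint32_t ror32(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/* Mirrors the four data-processing instructions of the pre-ARMv6 path. */
static uint32_t bswap32_eor(uint32_t x)
{
	uint32_t t = x ^ ror32(x, 16);	/* eor r3, r0, r0, ror #16 */

	t >>= 8;			/* lsr r3, r3, #8          */
	t &= ~UINT32_C(0xff00);		/* bic r3, r3, #0xff00     */
	return t ^ ror32(x, 8);		/* eor r0, r3, r0, ror #8  */
}

/* Plain shift-and-mask byte swap, used only as a cross-check. */
static uint32_t bswap32_ref(uint32_t x)
{
	return (x << 24) | ((x & 0xff00) << 8) |
	       ((x >> 8) & 0xff00) | (x >> 24);
}

/* 64-bit swap: swap each 32-bit half, then exchange the halves. */
static uint64_t bswap64_eor(uint64_t x)
{
	uint32_t lo = (uint32_t)x;
	uint32_t hi = (uint32_t)(x >> 32);

	return ((uint64_t)bswap32_eor(lo) << 32) | bswap32_eor(hi);
}
```

Comparing bswap32_eor() against bswap32_ref() on a handful of values on
the host is a cheap sanity check of the instruction sequence, but it says
nothing about Thumb encodings or the ENTRY/ENDPROC wrappers.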