From: Måns Rullgård
To: Nicolas Pitre
Subject: Re: [PATCH v2] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
Date: Tue, 12 Nov 2013 18:03:56 +0000
X-Patchwork-Id: 3174351
References: <1383951632-6090-1-git-send-email-sboyd@codeaurora.org>
 <528193A0.7050505@codeaurora.org>
 <20131112140436.GK16735@n2100.arm.linux.org.uk>
 <52823877.60806@codethink.co.uk>
In-Reply-To: (Nicolas Pitre's message of "Tue, 12 Nov 2013 09:55:09 -0500 (EST)")
Cc: Måns Rullgård, Russell King - ARM Linux, Stephen Boyd,
 linux-kernel@vger.kernel.org, Ben Dooks, Christopher Covington,
 Jean-Christophe PLAGNIOL-VILLARD, linux-arm-kernel@lists.infradead.org

Nicolas Pitre writes:

> On Tue, 12 Nov 2013, Måns Rullgård wrote:
>
>> Nicolas Pitre writes:
>>
>> > On Tue, 12 Nov 2013, Ben Dooks wrote:
>> >
>> >> Given these are single instructions for ARM, is it possible we could
>> >> make a table of all the callers and fix them up when we initialise
>> >> as we do for the SMP/UP case and for page-offset?
>> >
>> > Not really.  Calls to those functions are generated by the compiler
>> > implicitly when a divisor operand is used and therefore we cannot
>> > annotate those calls.  We'd have to use special accessors everywhere to
>> > replace the standard division operand (like we do for 64 by 32 bit
>> > divisions) but I doubt that people would accept that.
>>
>> It might be possible to extract this information from relocation tables.
>
> True, but only for individual .o files.
> Once the linker puts them together the information is lost, and
> trying to infer what the linker has done is insane.
>
> Filtering the compiler output to annotate idiv calls before it is
> assembled would probably be a better solution.

OK, here's an extremely ugly hootenanny of a patch.  It seems to work
on an A7 Cubieboard2.  I would never suggest actually doing this, but
maybe it can be useful for comparing performance against the more
palatable solutions.

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 7397db6..cf1cd30 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -113,7 +113,7 @@ endif
 endif
 
 # Need -Uarm for gcc < 3.x
-KBUILD_CFLAGS +=$(CFLAGS_ABI) $(CFLAGS_THUMB2) $(arch-y) $(tune-y) $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,)) -msoft-float -Uarm
+KBUILD_CFLAGS +=$(CFLAGS_ABI) $(CFLAGS_THUMB2) $(arch-y) $(tune-y) $(call cc-option,-mshort-load-bytes,$(call cc-option,-malignment-traps,)) -msoft-float -Uarm -include asm/divhack.h
 KBUILD_AFLAGS +=$(CFLAGS_ABI) $(AFLAGS_THUMB2) $(arch-y) $(tune-y) -include asm/unified.h -msoft-float
 
 CHECKFLAGS += -D__arm__
diff --git a/arch/arm/include/asm/divhack.h b/arch/arm/include/asm/divhack.h
new file mode 100644
index 0000000..c750b78
--- /dev/null
+++ b/arch/arm/include/asm/divhack.h
@@ -0,0 +1,23 @@
+__asm__ (".macro dobl tgt \n"
+	 " .ifc \\tgt, __aeabi_idiv \n"
+	 " .L.sdiv.\\@: \n"
+	 " .pushsection .sdiv_tab.init, \"a\", %progbits \n"
+	 " .word .L.sdiv.\\@ \n"
+	 " .popsection \n"
+	 " .endif \n"
+	 " .ifc \\tgt, __aeabi_uidiv \n"
+	 " .L.udiv.\\@: \n"
+	 " .pushsection .udiv_tab.init, \"a\", %progbits \n"
+	 " .word .L.udiv.\\@ \n"
+	 " .popsection \n"
+	 " .endif \n"
+	 " bl \\tgt \n"
+	 ".endm \n"
+	 ".macro defbl \n"
+	 " .macro bl tgt \n"
+	 " .purgem bl \n"
+	 " dobl \\tgt \n"
+	 " defbl \n"
+	 " .endm \n"
+	 ".endm \n"
+	 "defbl \n");
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 067815c1..b3a3fe1 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -375,6 +375,18 @@ static void __init cpuid_init_hwcaps(void)
 	case 1:
 		elf_hwcap |= HWCAP_IDIVT;
 	}
+
+	if (!IS_ENABLED(CONFIG_THUMB2_KERNEL) && (elf_hwcap & HWCAP_IDIVA)) {
+		extern u32 __sdiv_tab_start, __sdiv_tab_end;
+		extern u32 __udiv_tab_start, __udiv_tab_end;
+		u32 *div;
+
+		for (div = &__sdiv_tab_start; div < &__sdiv_tab_end; div++)
+			*(u32 *)*div = 0xe710f110;
+
+		for (div = &__udiv_tab_start; div < &__udiv_tab_end; div++)
+			*(u32 *)*div = 0xe730f110;
+	}
 }
 
 static void __init feat_v6_fixup(void)
diff --git a/arch/arm/kernel/vmlinux.lds.S b/arch/arm/kernel/vmlinux.lds.S
index 43a31fb..3d5c103 100644
--- a/arch/arm/kernel/vmlinux.lds.S
+++ b/arch/arm/kernel/vmlinux.lds.S
@@ -176,6 +176,8 @@ SECTIONS
 		CON_INITCALL
 		SECURITY_INITCALL
 		INIT_RAM_FS
+		__sdiv_tab_start = .; *(.sdiv_tab.init); __sdiv_tab_end = .;
+		__udiv_tab_start = .; *(.udiv_tab.init); __udiv_tab_end = .;
 	}
 #ifndef CONFIG_XIP_KERNEL
 	.exit.data : {
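
For anyone wanting to check the two magic constants in the setup.c hunk, here is a minimal host-side sketch. It assumes the ARM (A32) register form of SDIV/UDIV, with the operands taken from the __aeabi_idiv/__aeabi_uidiv calling convention (numerator in r0, denominator in r1, quotient in r0). The names encode_div and patch_sites are hypothetical and not part of the patch; patch_sites takes an array of pointers rather than the table of 32-bit addresses the kernel loop walks, so that it also builds on a 64-bit host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: build the A32 encoding of SDIV/UDIV Rd, Rn, Rm.
 * Bit layout assumed here:
 *   cond | 0111 0001 (SDIV) or 0111 0011 (UDIV) | Rd | 1111 | Rm | 0001 | Rn
 */
static uint32_t encode_div(unsigned cond, int is_unsigned,
			   unsigned rd, unsigned rn, unsigned rm)
{
	/* r15 (PC) is not a valid operand for these instructions. */
	assert(cond <= 0xf && rd < 15 && rn < 15 && rm < 15);

	return ((uint32_t)cond << 28) |
	       (is_unsigned ? 0x07300000u : 0x07100000u) |
	       ((uint32_t)rd << 16) | (0xfu << 12) |
	       ((uint32_t)rm << 8) | (1u << 4) | (uint32_t)rn;
}

/*
 * Hypothetical stand-in for the boot-time loop: overwrite every recorded
 * call site with the given instruction word and return the number patched.
 */
static size_t patch_sites(uint32_t **tab, size_t n, uint32_t insn)
{
	size_t i;

	for (i = 0; i < n; i++)
		*tab[i] = insn;
	return n;
}
```

With cond = 0xe (always), Rd = Rn = 0 and Rm = 1, the encoder should reproduce the words the patch stores over each recorded "bl" site.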