From patchwork Tue Nov 12 14:01:16 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
X-Patchwork-Submitter: Nicolas Pitre
X-Patchwork-Id: 3172711
Date: Tue, 12 Nov 2013 09:01:16 -0500 (EST)
From: Nicolas Pitre
To: Stephen Boyd
Cc: Måns Rullgård, Russell King - ARM Linux, linux-kernel@vger.kernel.org,
 Christopher Covington, Jean-Christophe PLAGNIOL-VILLARD,
 linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v2] ARM: Use udiv/sdiv for __aeabi_{u}idiv library functions
In-Reply-To: <528193A0.7050505@codeaurora.org>
References: <1383951632-6090-1-git-send-email-sboyd@codeaurora.org>
 <528193A0.7050505@codeaurora.org>
User-Agent: Alpine 2.10 (LFD 1266 2009-07-14)

On Mon, 11 Nov 2013, Stephen Boyd wrote:

> On 11/09/13 21:03, Nicolas Pitre wrote:
> > Bah..... NAK.  We are doing runtime patching of the kernel for many
> > many things already.  So why not do the same here?
>
> static keys are a form of runtime patching, albeit not as extreme as
> you're suggesting.
>
> >
> > The obvious strategy is to simply overwrite the start of the existing
> > __aeabi_idiv code with the "sdiv r0, r0, r1" and "bx lr" opcodes.
> >
> > Similarly for the unsigned case.
>
> I was thinking the same thing when I wrote this, but I didn't know how
> to tell the compiler to either inline this function or to let me inline
> an assembly stub with some section magic.
>
> >
> > That lets you test the hardware capability only once during boot instead
> > of every time a divide operation is performed.
>
> The test for hardware capability really isn't done more than once during
> boot. The assembly is like so at compile time
>
> 00000000 <__aeabi_idiv>:
>    0:   nop     {0}
>    4:   b       0 <___aeabi_idiv>
>    8:   sdiv    r0, r0, r1
>    c:   bx      lr
>
> and after we test and find support for the instruction it will be
> replaced with
>
> 00000000 <__aeabi_idiv>:
>    0:   b       8
>    4:   b       0 <___aeabi_idiv>
>    8:   sdiv    r0, r0, r1
>    c:   bx      lr
>
> Unfortunately we still have to jump to this function. It would be great
> if we could inline this function at the call site but as I already said
> I don't know how to do that.

What about this patch, which I think is currently your best option?  Note
that it would need to use the facilities from asm/opcodes.h to make it
endian-agnostic.
Nicolas

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 6a1b8a81b1..379cffe4ab 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -383,6 +383,34 @@ static void __init cpuid_init_hwcaps(void)
 		elf_hwcap |= HWCAP_IDIVT;
 	}
 
+	/*
+	 * Patch our division routines with the corresponding opcode
+	 * if the hardware supports it.
+	 */
+	if (IS_ENABLED(CONFIG_THUMB2_KERNEL) && (elf_hwcap & HWCAP_IDIVT)) {
+		extern char __aeabi_uidiv, __aeabi_idiv;
+		u16 *uidiv = (u16 *)&__aeabi_uidiv;
+		u16 *idiv = (u16 *)&__aeabi_idiv;
+
+		uidiv[0] = 0xfbb0;  /* udiv r0, r0, r1 */
+		uidiv[1] = 0xf0f1;
+		uidiv[2] = 0x4770;  /* bx lr */
+
+		idiv[0] = 0xfb90;  /* sdiv r0, r0, r1 */
+		idiv[1] = 0xf0f1;
+		idiv[2] = 0x4770;  /* bx lr */
+	} else if (!IS_ENABLED(CONFIG_THUMB2_KERNEL) && (elf_hwcap & HWCAP_IDIVA)) {
+		extern char __aeabi_uidiv, __aeabi_idiv;
+		u32 *uidiv = (u32 *)&__aeabi_uidiv;
+		u32 *idiv = (u32 *)&__aeabi_idiv;
+
+		uidiv[0] = 0xe730f110;  /* udiv r0, r0, r1 */
+		uidiv[1] = 0xe12fff1e;  /* bx lr */
+
+		idiv[0] = 0xe710f110;  /* sdiv r0, r0, r1 */
+		idiv[1] = 0xe12fff1e;  /* bx lr */
+	}
+
 	/* LPAE implies atomic ldrd/strd instructions */
 	vmsa = (read_cpuid_ext(CPUID_EXT_MMFR0) & 0xf) >> 0;
 	if (vmsa >= 5)